DEPTH FIRST

Function Notation : Evaluation

Adam Henderson

June 1, 2024

“What is the most efficient notation?” is a recurring question when I read math, machine learning, or other technical publications. It is usually followed by a debate: work with the default syntax of the text, or map it to my own preferred syntax?

For example : Do we convert expressions in “conventional matrix notation” \(A^TBv\) into abstract index notation \(A_{ba} B_{bc} v_c\)?
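To make the two notations concrete, here is a minimal sketch in plain Python that evaluates the same expression both ways. The values of `A`, `B`, and `v`, and the helper names `transpose` and `matvec`, are illustrative assumptions and not part of any particular text.

```python
# Hypothetical example values; any conformable A, B, v would do.
A = [[1, 2], [3, 4], [5, 6]]            # 3x2, entries A[b][a]
B = [[1, 0, 2], [0, 1, 1], [2, 1, 0]]   # 3x3, entries B[b][c]
v = [1, 2, 3]                           # length 3, entries v[c]


def transpose(M):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*M)]


def matvec(M, x):
    """Multiply matrix M by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


# Matrix notation: A^T B v, evaluated right to left.
matrix_result = matvec(transpose(A), matvec(B, v))

# Index notation: result_a = sum over b and c of A_{ba} B_{bc} v_c.
# The transpose never appears; it is encoded by which index of A is summed.
index_result = [
    sum(A[b][a] * B[b][c] * v[c] for b in range(3) for c in range(3))
    for a in range(2)
]

print(matrix_result, index_result)
```

In the index form the factors can be written in any order, because the repeated labels `b` and `c` determine what is summed.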

Given how often this question recurs, and the redundant effort it causes, I want to collect observations on syntax to streamline the debate. I’m also inspired by deeper investigations into syntax from the likes of Dijkstra (On Notation and Adopted Notation), Knuth, and Iverson.

I’ll start with my most recent obsession: syntax for functions. There are a few types of function notation that frequently occur :

To keep this short I’ll focus further on function evaluation.

Notations for Function Evaluation :

Function evaluation, or applying a function to an input, can be expressed in various ways, each with specific advantages and context-dependent appropriateness.

Univariate

Why So Many?

Why can’t we just pick one and standardize? The “best” notation depends on context. The key properties to balance are :

Reading Time : driven by ambiguity of parsing, ease of parsing, reliance on context and backtracking, amount of redundancy, consistency with notation in related domains, and character count. These are in order of importance (to me), with unambiguous parsing being table stakes and character count being dangerous to optimize for directly.

Ease of Manipulation : Does the syntax make it easier to perform common transformations / calculations?

Multivariate

Equivalent Spaces of Functions but Different Emphasis

The different notations for multivariate functions emphasize different isomorphic spaces of functions. The set of functions \(f: X \times Y \to Z\) is equivalent to the set of functions \(f: X \to (Y \to Z)\) or \(f: Y \to (X \to Z)\). Each is tied to a family of notations for multivariate functions.
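The equivalence above can be sketched in Python. The helpers `curry`, `uncurry`, and `flip`, and the example function `power`, are hypothetical names chosen for illustration; this is a sketch for two-argument functions only.

```python
def curry(f):
    """Turn f: (X, Y) -> Z into X -> (Y -> Z)."""
    return lambda x: lambda y: f(x, y)


def uncurry(g):
    """Turn g: X -> (Y -> Z) back into (X, Y) -> Z."""
    return lambda x, y: g(x)(y)


def flip(f):
    """Swap argument order, so f: (X, Y) -> Z becomes (Y, X) -> Z."""
    return lambda y, x: f(x, y)


def power(base, exponent):
    """An example two-argument function."""
    return base ** exponent


# X -> (Y -> Z): fix the base first, leaving a function of the exponent.
two_to_the = curry(power)(2)

# Y -> (X -> Z): fix the exponent first, leaving a function of the base.
cube = curry(flip(power))(3)

print(power(2, 5), two_to_the(5), cube(4))
```

All three forms carry the same information, but `two_to_the` and `cube` make different partial evaluations the natural thing to write down.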

Mixed syntax behaves similarly to currying

The variety of syntax is valuable for emphasizing representations that are isomorphic but practically different.

Positional vs Named Inputs

This distinction cuts across all the notation families above — it is a property of how arguments are bound, independent of the evaluation syntax chosen.

Positional: \(f(2, 3, \text{"Fred"})\) — inputs provided in the order the function expects. Concise, but the meaning of each position must be remembered or recovered from context.

Named: \(f(x=2, y=3, \text{cat}=\text{"Fred"})\), with each argument explicitly labeled. Self-documenting, at the cost of verbosity.
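Most programming languages expose the same choice. A minimal Python sketch, where the function `describe` and its parameters are hypothetical stand-ins for \(f\):

```python
def describe(x, y, cat):
    """Hypothetical stand-in for f with inputs x, y, and cat."""
    return f"{cat} sits at ({x}, {y})"


# Positional: meaning is carried by argument order.
positional = describe(2, 3, "Fred")

# Named: meaning is carried by labels, so the order is free.
named = describe(cat="Fred", y=3, x=2)

# Swapping positional arguments silently changes the meaning.
swapped = describe(3, 2, "Fred")

print(positional)
print(named)
print(swapped)
```

The named call is longer, but reordering it cannot change the result, while the positional call fails quietly when two arguments of the same type are exchanged.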

Einstein notation is the clearest example of named binding in mathematical notation — index labels are argument names, and reordering is free because the contraction is determined by label matching, not position. Index height additionally encodes which slot of the dual pairing is being filled, a distinction positional notation drops entirely.

Special Cases

There are functions that occur so commonly in their associated domain that they get special compact syntax. For example, a wide variety of one- and two-argument functions use bracket syntax without a function name :

* Norms : \(\vert x\vert\), \(\vert \vert x \vert \vert\)
* Brackets : Commutator \([x,y]\), Poisson \(\{x,y\}\)
* Inner products : \((x, y)\), \(\langle x \vert y \rangle\)

Infix notation (\(x+y\), \(f \circ g\)) is especially valuable for associative binary operations, where \(+(x, +(y, +(z, w)))\) is awful, but \(x + y + z + w\) is easy to read.
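The role of associativity can be checked with a small Python sketch. The values of `x`, `y`, `z`, and `w` are arbitrary assumptions; `operator.add`, `operator.sub`, and `functools.reduce` are standard library functions.

```python
from functools import reduce
import operator

x, y, z, w = 1, 2, 3, 4

# Prefix form, nested to the right: +(x, +(y, +(z, w))).
prefix_nested = operator.add(x, operator.add(y, operator.add(z, w)))

# Infix form; Python groups this to the left as ((x + y) + z) + w.
infix = x + y + z + w

# A left fold over the same arguments.
folded = reduce(operator.add, [x, y, z, w])

# Subtraction is not associative, so the grouping changes the answer
# and the unparenthesized infix chain depends on a grouping convention.
sub_right_nested = operator.sub(x, operator.sub(y, operator.sub(z, w)))
sub_infix = x - y - z - w

print(prefix_nested, infix, folded)
print(sub_right_nested, sub_infix)
```

For addition every grouping agrees, so dropping the parentheses loses nothing. For subtraction the two groupings differ, which is why unparenthesized infix chains are safest for associative operations.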

There is similarly a large family of unary functions that occur commonly enough to show up as little “decorations” on the arguments (\(\bar{x}\), \(x^*\), \(x^{\dagger}\)). These are most common for “involutions”, which are their own inverse, so that painful expressions like \(x^{\dagger \dagger \dagger}\) don’t occur.
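As a quick check of why stacked decorations collapse, here is a sketch of the conjugate transpose on a small complex matrix. The function name `dagger` and the matrix `x` are assumptions made for illustration.

```python
def dagger(matrix):
    """Conjugate transpose of a list-of-lists complex matrix."""
    return [[entry.conjugate() for entry in column] for column in zip(*matrix)]


# Hypothetical example matrix.
x = [[1 + 2j, 3 - 1j],
     [0 + 1j, 2 + 0j]]

once = dagger(x)
twice = dagger(once)
thrice = dagger(twice)

# Applying the involution twice returns the original,
# so three applications reduce to one.
print(twice == x, thrice == once)
```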

My Current Preference

Adopted: a mixed use of parentheses \(f(x)\) and subscript/superscript.

Avoiding: juxtaposition, and mixing \(f(x)\) with \(g[x]\) in the same context.

The variety of notation is a feature — each choice encodes a different emphasis on the underlying function space, and the right choice depends on what structure you are trying to make visible.