Lambda calculus

Intro

From Chapter 5 of TAPL:

In the mid 1960s, Peter Landin observed that a complex programming language can be understood by formulating it as a tiny core caclulus capturing the language's essential mechanisms, together with a collection of convenient derived forms whose behaviour is understood by translating them into the core language.

Landin chose lambda calculus as core language.
All computations reduced to a few basic operations: function definition and function application

Semantic styles

Semantics specifies how the terms are evaluated. The meaning of the terms in the language.

Operational semantics

Specifies behaviour of a language by specifying a simple abstract machine for it.
This machine uses the terms of the language as instructions.
- Simple language ⇒ state of machine is a term and transition function performs a simplification to proceed to next state.
- Meaning of the term is the final state of the machine (when the term was the input).
Sometimes useful to have multiple operational semantics for the same language. Some closer to the way machines see it and some closer to the way that humans see it. Then we can sort of chain proofs and prove correctness of the language. (I suppose this is how CompCert did it).

Denotational semantics

Meaning of a term is taken as a mathematical object (like a number or function).
Denotational semantics means finding semantic domains and defining an interpreter function mapping the terms of the language to elements in these domains.
Highlights the essential properties of the language.
Domain theory is a field that came out of the search for appropriate semantic domains for modeling various language features.
Thoughtful selection of domains can let us derive powerful properties for reasoning about programs.
Can't handle non-determinism and concurrency??

Axiomatic semantics

Instead of assigning behaviour and then deriving laws and properties them, define the laws themselves first and then derive the behaviour.
The laws themselves are the definition. Behaviour is whatever we can get from those laws.
Meaning of a term is what can be proved about it.
Focuses on the process of reasoning about programs.
Procedures can't be handled properly??

Operational semantics

A tiny language with arithmetic operations along with ℕ and 𝔹 as types.

{- Syntax -}
-- terms
t = O         -- constant zero
  | succ t
  | pred t
  | iszero t

{- values -}
-- Evaluation rules

     t₁ → t₂
===================
 succ t₁ → succ t₂

     t₁ → t₂
===================
 pred t₁ → pred t₂

        t₁ → t₂
=======================
 iszero t₁ → iszero t₂

iszero O → true
iszero (succ nv) → false
pred O → O       -- side-stepping the issue here.
pred (succ nv) → nv

Small step style

Evaluation may be done in multiple steps.
Computation progresses through individual steps guided by the evaluation relation till the term becomes a value.

Big step style

aka Natural semantics
Do everything in one step.
Directly specifies that 'this term evaluates to this value'.
t ⇓ v means that the term t evaluates to the value v.

Stuck term

A term for which there is no further step in the evaluation relation but is still not a value.
ie, the term is in normal form but is not a value.

Notations

Open term: Term with at least one free variable.
Closed term: Term with no free variables. All variables are bound.
t → t' means that t evaluates to t' in a single step from the evaluation relation.
Value: A 'term' that has finished computing and cannot be reduced further.

Curry-style and Church-style

Two styles in which simply typed lambda calculus can be formulated.

Curry-style	Church-style
Define terms, define semantics as to how	Define terms, identify well-typed terms
they behave, give a type system that	with a type system, then give
reject some terms.	semantics just to those terms.
Semantics is prior to typing.	Typing is prior to semantics.
Behaviour of ill-typed terms may be	Behaviour (ie, semantics) of ill-typed
defined to some extent.	terms not even defined.
Historically used for implicitly typed	Commonly used for explicitly typed
lambda calculi.	presentations of lambda calculi.

Type erasure

Removing types. Effectively a conversion to a form of untyped lambda calculus.
Many compilers don't carry along type annotations for run-time.
- Type checking is done at compile time after which the typing info is no longer needed and may be 'erased'.

erasure(x) = x
erasure(λx:T.t) = λx. erasure(t)
erasure(t₁ t₂) = erasure(t₁) erasure(t₂)

t₁ → t₂, then erasure(t₁) → erasure(t₂)
erasure(t) → m', then ∃t'. t → t' and erasure(t') = m

Type reconstruction

Type substitution (σ): a finite mapping from type variables to types.

Multiple argument functions

By definition, abstraction takes only one argument. But currying can be done to get the effect of functions with as many arguments as needed.

de Bruijn indexing

Nameless lambda calculus.

Use numbers instead of names for variables.

If used, no need of α-conversion as there can't be name collisions if there are no names. :)

Terms with free variables

ie, open terms.

We don't know how far is the binder (ie, λ) for a free variable. We need a naming context.

For example, in λx. y x, we know that x is 0 but are not sure about y.

Use a naming context (which is basically an assignation of de-Bruijn indices to the free variables) and use this assignment consistently.

Consider a naming context like

Γ = x ↦ 4
  | y ↦ 3
  | z ↦ 2
  | a ↦ 1
  | b ↦ 0

Then the terms could be 'translated' to:

| Named     | Nameless |
|-----------+----------|
| x (y z)   | 4 (3 2)  |
| λw. y w   | 4 0      |
| λw. λa. x | 6        |

for each binder, we got to increment the de-Bruijn index of the free variables.

Free variables

But can't have free variables in this notation? Though that isn't much of a problem as any term with a free variable can be made into a closed term.

Like: λx. x y being made into λy. (λ.x x y).

Kinds of lambda calculus

Untyped lambda calculus
Typed lambda calculus
Pure typed/untyped lambda calculus
Simply typed lambda calculus

Need and motivation for lambda calculus

Can be viewed simultaneously as a programming language in which computations can be described and as a mathematical object with which we can reason about.
Good tool for analyzing programming languages
Simple syntax to represent computable functions (ie, computations).

Uses of lambda calculus

It's so simple (just 3 rules for untyped lambda calculus!), yet it can express any computable function.

Church-Turing hypothesis

Lambda calculus with β-reduction is as powerful as a Turing machine.

It can express any computable function.

Untyped lambda calculus (utlc)

Syntax

term = x      -- variable
     | λx. t  -- abstraction
     | t₁ t₂  -- application

Substitution operation

We use an auxiliary operation called shifting for help in doing substitution.

The name of the bound variables are irrelevant as long as the naming is consistent.

For example, λx. x and λw. w are both same. The change in name doesn't make them different. Both are simply different forms of the identity function.

Shifting

Remembers the de-Bruijn indices of free variables in a term.

To handle stuff like

[1 ↦ s](λ.2)

because as we are doing the substitution by going inside the body of the lambda (ie, the abstraction), the context becomes longer by one term (the variable over which the abstraction is done is added to the context).

To ensure that the de-Bruijn indices still refer to the proper lambda terms, we would need to increment ('shift up') the all de-Bruijn indices of all free variables inside the body of the abstraction.

Note that we can't just shift all variables just like that. Only free variables (ie, the variables which are bound somewhere outside the abstraction being examined) need to be shifted.

Bound variables should not be shifted.

shift(d, c) k       = k,     if k < c
                    | k + d, if k ≥ c

shift(d, c) (λ.t)   = λ. shift(d, c+1) t
shift(d, c) (t₁ t₂) = (shift(d, c) t₁) (shift(d, c) t₂)

-- if c is not mentioned, it is assumed to be 0

:EX-6.2.2:

-- Exercise: 6.2.2
-- Part 1
shift(2) (λ.λ. 1 (0 2))
= λ. shift(2, 1) (λ. 1 (0 2))
= λ. λ. shift(2, 2) (1 (0 2))
= λ. λ. 1 (0 4)

-- Part 2
shift(2) (λ. 0 1 (λ. 0 1 2))
= λ. shift(2,1) (0 1 (λ. 0 1 2))
= λ. 0 3 (λ. shift(2,2) (0 1 2))
= λ. 0 3 (λ. 0 1 4)

Shifting 'moves'.

Capture avoiding substitution

A substitution operation that makes sure that the bound variable names of terms are kept distinct from the free variable names.

This is the kind of substitution that we usually mean when we just say substitution.

Free variables

FV(x)     = {x}              -- variable
FV(λx.t)  = FV(t) - {x}      -- abstraction
FV(t₁ t₂) = FV(t₁) ∪ FV(t₂)  -- application

Recursion

Fixpoint

Effectively just recursion.

fix =
λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))

Reference

Like ref 3 in sml.

Impure
Useful for implicit communication via shared state.
Allow some side effects.
Significantly improve practical ease of use.
Aliasing, which offers its advantages, also makes reasoning about difficult.

α-conversion

aka α-renaming.
Renaming bound variables.
To avoid name collisions.
Avoid free variable capture.

λx. x h can be α-reduced to λy. y h

Free variables are left untouched. Only the bound variable is renamed.

[x ↦ y z](λy. x y)

is problematic as the condition

y ∉ FV(y z)

is not met.

But if the term can be α-reduced like

[x ↦ y z](λk. x k)

there is variable capture.

β-reduction

Replace the occurrences of bound variable of the abstraction with the argument (on an application) in the body of the abstraction.

(λ.x x) y → [x ↦ y](λ.x x) → y

Strictness

strict
lazy

Redex

A term of the form (λx. t₁) t.

Reducible expression.
Term coined by Alonzo Church.
β-reduction is the process of rewriting a redex as per evaluation rules.

Types of β-reduction

Full β-reduction

Any redex can be reduced at any time.
Each step picks a redex anywhere inside the term being evaluated and reduces it.

Normal order

Left-most, outer-most redex is reduced first.

id(id(λz. id z))
----------------
 ↓

id(λz. id z)
------------
 ↓

λz. id z
    ----
 ↓
λz. z

Call by name

Similar to normal order but no reductions inside abstractions. (names preserved??)
A lazy strategy. Evaluation happens only if the term is actually used.
Haskell uses call by need, an optimized form of call by name form of β-reduction.

id(id(λz. id z))
----------------
 ↓

id(λz. id z)
------------
 ↓

λz. id z

Call by value

Only outermost redexes are reduced (TODO: what does that mean??)
Redex is reduces only when its right side has been reduced to a value.
A strict strategy. Evaluation happens anyway, even if it not needed and the term is not being used.
Used by most languages.

id(id(λz. id z))
  --------------
 ↓

id(λz. id z)
------------
 ↓

λz. id z

β-reduction styles

evaluation strategies.
Choice of this style doesn't really matter as far as evaluation itself is concerned.
Used by TAPL book.

:DBT: Signficance of the term 'call' here. There are no 'function call', isn't it? Just rewrites?

DONE Full β reduction

Any redux can be reduced at any time.

Eg:

id (id (λz. id z))
    --------------

id (λz. id z)
        -----

id (λz. z)
----------

λz. z

Reduction stops.


Another possiblity:


id (id (λz. id z))
            ----  

id (id (λz. z))
---------------

id (λz. z)
----------

λz. z

Reduction stops.

Call by need

Lazy. Arguments evaluated only when needed
Used by Haskell (via rewrites)

DONE Call by name

No reduction done inside an abstraction.
Reduction stops if a lambda is encountered.
Lazy. Arguments evaluated only when needed

Eg:

id (id (λz. id z))
------------------

id (λz. id z)
-------------

(λz. id z)

Reduction stops.
Even though the 'id z' inside the last abstraction could also be reduced.

DONE Call by value (??)

Redex reduced only when the right operand is already in normal form (ie, a value)
Only outermost redexes reduced.
'Functions are called only on values' ²
Strict. Arguments always evaluated. Whether needed or not.
Used by most languages

Eg:

id (id (λz. id z))
   ---------------

id (λz. id z)
-------------

(λz. id z)

Reduction stops.
Even though the 'id z' inside the last abstraction could also be reduced.

DONE Normal order

leftmost, outermost redex is reduced first
redexes inside abstractions also reduced

Eg:

id (id (λz. id z))
------------------

id (λz. id z)
-------------

λz. id z
    ----

(λz. z)

Reduction stops.

η-reduction

Replace λx.f x with f if x doesn't appear as a free variable in f.

References:

Substitution

Shifting

Fixed point combinator

Way for recursion.
Combinators are closed lambda expressions (ie, no free variables).

fix f = f (fix f)

https://en.wikipedia.org/wiki/Fixed-point_combinator

System F

An extension of simply typed lambda calculus
Abstracts out varying types. Universal types.
A form of parametric polymorphism.
Well-typed terms in System F are normalizing.
- ie, evaluation of all terms terminates.
- For every term t, t is either in normalized form (ie, cannot be reduced further) or there is a typing rule t → t' that can reduce it (progress).
Type inference for System-F is undecidable. (Why??)
First discovered by the logician Jean-Yves Girard (1972) in the context of proof theory
Then independently by the computer scientist John Reynolds (1974)
- Reynolds named it Polymorphic lambda calculus.
aka second-order lambda calculus
- Corresponds to second-order intuitionistic logic via Curry-Howard correspondence.
- Allows quantification over types (predicates) in addition to terms.
Has been extensively used as a research vehicle to study polymorphism and as the basis of many programming language designs.

Examples

map function

-- assumptions: types of available functions
nil : ∀X. List X
cons : ∀X. X -> List X
isnil : ∀X. List X -> Bool
head : ∀X. List X -> X
tail : ∀X. List X -> List X  (* everything except first element)

-- map function
λX. λY. λf:(X -> Y).
  (fix (λm: (List X) -> (List Y).  -- recursive function
    λl: List X.                    -- input list
      if isnil [X] l               -- from where did 'if' come from??
        then nil [Y]
      else
        cons [Y] (f (head [X] l))
    (m (tail [X] l))))

reverse function

append: ∀X. (List X) -> (List X) -> (List X)
append =
λX. (fix (λm: (List X) -> (List X) -> (List X).
  λa: (List X). λb: (List X).
  if isnil [X] a
  then b
  else
    cons [X] (head [X] a) (m (tail [X] a) y)))

reverse: ∀X. (List X) -> (List X)
reverse = 
λX. (fix λm: (List X) -> (List X).
  λa: (List  X).
  if isnil [X] a
  then nil [X]
  else
    append [X] (m (tail [X] a)) (head [X] a))

Church encodings

In untyped lambda calculus.
Primitive data values like bool, nat, list, etc can be encoded.
Encoded as functions.

System Fω

Doesn't allow dependent types (which are mappings from values to types).

System Fω<:

System Fω with subtyping.

New terms

Universal lambda functions?? link
Point-free programming
Type reconstruction (tapl C22)
Type erasure (tapl C9?)

Normalization

Evaluation of a well-typed term is guaranteed to halt in a finite number of steps. ie, every well-typed term is normalizable. (src: tapl)

TODO: Strong normalization

Normal form

a term that cannot be further evaluated.
cannot take another step under the evaluation relation.
already a value.

Not every term has a normal form. Terms without a normal form are said to diverge (Divergent term).

Par example, the term

(λx. x x) (λx. x x).

Any attempt to reduce this term will resultant in the original term itself.

Impredicativity

A definition (of a set/type etc) is impredicative if it involves a quantifier whose domain includes the very thing being defined. (src: tapl C23)

Monotypes

Type variables that range only over quantifier-free types. (src: tapl C23)

Polytype or Type scheme

Quantified types.

Subtypes

aka second order lambda calculus (via an analogy with second-order intuitionalistic logic)
- Quantification over types (analogous to predicates) possible.
- In addition to quantification over terms
A form of polymorphism (subtype polymorphism).
Discovered in 1972 by Jean-Yves Girard
A single term is given many types. (using rule of subsumption?)
sml doesn't have subtyping.

Type preservation

Meaning is not lost.
t → t' means t reduces to t'

 Γ ⊢ t : T         t → t'
==========================
    Γ ⊢ t' : T

Progress

Related to normalizability.
For every term in the language:the term is either in normalized form (cannot be reduced further. Is already a value), or there is a typing rule (t → t') that takes it a step closer to normalized form.

External References

Types and Programming Languages by Benjamin Pierce
Advanced topics in Types and Programming Languages by Benjamin Pierce
Link
Lambda-calculus and combinators, an introduction by Hindley and Seldin
An introduction to the lambda calculus for computer scientists by Hankin (very basic)
Lectures on the Curry-Howard isomorphism by Sorensen and Urzyczyn (advanced)
Basic simple type theory by Hindley (advanced)
The lambda calculus: its syntax and semantics by Barendregt (authoritative)
http://www.paultaylor.eu/stable/Proofs+Types.html
https://www.irif.fr/~krivine//articles/Lambda.pdf
https://mathoverflow.net/questions/69337/what-is-some-good-introduction-to-lambda-calculus

Using this definition, we can define identity function (λx:T. T):

idfn = Abs A A (λx:Term A. x)

where A is a type variable.

Logical frameworks

Systems that provide mechanisms for representing the syntax and proof systems which make up a logic.

Eg: HOAS

See: https://ncatlab.org/nlab/show/logical+framework

Barendregt cube

Also known as lambda cube. ¹
See https://en.wikipedia.org/wiki/Lambda_cube

    Fω        CC
    +--------+ 
   /|       /|
F / |      / |
 +--------+  |
 |  |     |  |
 |  +-----|--+
 | /λω    | /
 |/       |/
 +--------+ 
λ→         LF

Top face: systems with polymorphism (ie, have term families indexed by types).
Back face: systems with type operators.
Right face: systems with dependent types

λ→: simply typed lambda calculus
CC: Calculus of constructions
Fω: System-Fω
LF: Logical framework
- aka Lambda-P

System-F: polymorphic lambda calculus

aka λ2

Interesting bits

https://en.wikipedia.org/wiki/Iota_and_Jot