Lambda calculus


Intro

From Chapter 5 of TAPL:

In the mid 1960s, Peter Landin observed that a complex programming language can be understood by formulating it as a tiny core caclulus capturing the language's essential mechanisms, together with a collection of convenient derived forms whose behaviour is understood by translating them into the core language.

Semantic styles

Semantics specifies how the terms are evaluated. The meaning of the terms in the language.

Operational semantics

Denotational semantics

Axiomatic semantics

Operational semantics

A tiny language with arithmetic operations along with ℕ and 𝔹 as types.

{- Syntax -}
-- terms
t = O         -- constant zero
  | succ t
  | pred t
  | iszero t

{- values -}
-- Evaluation rules

     t₁ → t₂
===================
 succ t₁ → succ t₂

     t₁ → t₂
===================
 pred t₁ → pred t₂

        t₁ → t₂
=======================
 iszero t₁ → iszero t₂

iszero O → true
iszero (succ nv) → false
pred O → O       -- side-stepping the issue here.
pred (succ nv) → nv

Small step style

Big step style

Stuck term

Notations

Curry-style and Church-style

Two styles in which simply typed lambda calculus can be formulated.

Curry-style Church-style
Define terms, define semantics as to how Define terms, identify well-typed terms
they behave, give a type system that with a type system, then give
reject some terms. semantics just to those terms.
Semantics is prior to typing. Typing is prior to semantics.
Behaviour of ill-typed terms may be Behaviour (ie, semantics) of ill-typed
defined to some extent. terms not even defined.
Historically used for implicitly typed Commonly used for explicitly typed
lambda calculi. presentations of lambda calculi.

Type erasure

erasure(x) = x
erasure(λx:T.t) = λx. erasure(t)
erasure(t₁ t₂) = erasure(t₁) erasure(t₂)

Type reconstruction

Type substitution (σ)
a finite mapping from type variables to types.

Multiple argument functions

By definition, abstraction takes only one argument. But currying can be done to get the effect of functions with as many arguments as needed.

de Bruijn indexing

Nameless lambda calculus.

Use numbers instead of names for variables.

If used, no need of α-conversion as there can't be name collisions if there are no names. :)

Terms with free variables

ie, open terms.

We don't know how far is the binder (ie, λ) for a free variable. We need a naming context.

For example, in λx. y x, we know that x is 0 but are not sure about y.

Use a naming context (which is basically an assignation of de-Bruijn indices to the free variables) and use this assignment consistently.

Consider a naming context like

Γ = x ↦ 4
  | y ↦ 3
  | z ↦ 2
  | a ↦ 1
  | b ↦ 0

Then the terms could be 'translated' to:

| Named     | Nameless |
|-----------+----------|
| x (y z)   | 4 (3 2)  |
| λw. y w   | 4 0      |
| λw. λa. x | 6        |

for each binder, we got to increment the de-Bruijn index of the free variables.

Free variables

But can't have free variables in this notation? Though that isn't much of a problem as any term with a free variable can be made into a closed term.

Like: λx. x y being made into λy. (λ.x x y).

Kinds of lambda calculus

Need and motivation for lambda calculus

Uses of lambda calculus

It's so simple (just 3 rules for untyped lambda calculus!), yet it can express any computable function.

Church-Turing hypothesis

Lambda calculus with β-reduction is as powerful as a Turing machine.

It can express any computable function.

Untyped lambda calculus (utlc)

Syntax

term = x      -- variable
     | λx. t  -- abstraction
     | t₁ t₂  -- application

Substitution operation

We use an auxiliary operation called shifting for help in doing substitution.

The name of the bound variables are irrelevant as long as the naming is consistent.

For example, λx. x and λw. w are both same. The change in name doesn't make them different. Both are simply different forms of the identity function.

Shifting

To handle stuff like

[1 ↦ s](λ.2)

because as we are doing the substitution by going inside the body of the lambda (ie, the abstraction), the context becomes longer by one term (the variable over which the abstraction is done is added to the context).

To ensure that the de-Bruijn indices still refer to the proper lambda terms, we would need to increment ('shift up') the all de-Bruijn indices of all free variables inside the body of the abstraction.

Note that we can't just shift all variables just like that. Only free variables (ie, the variables which are bound somewhere outside the abstraction being examined) need to be shifted.

Bound variables should not be shifted.

shift(d, c) k       = k,     if k < c
                    | k + d, if k ≥ c

shift(d, c) (λ.t)   = λ. shift(d, c+1) t
shift(d, c) (t₁ t₂) = (shift(d, c) t₁) (shift(d, c) t₂)

-- if c is not mentioned, it is assumed to be 0

:EX-6.2.2:

-- Exercise: 6.2.2
-- Part 1
shift(2) (λ.λ. 1 (0 2))
= λ. shift(2, 1) (λ. 1 (0 2))
= λ. λ. shift(2, 2) (1 (0 2))
= λ. λ. 1 (0 4)

-- Part 2
shift(2) (λ. 0 1 (λ. 0 1 2))
= λ. shift(2,1) (0 1 (λ. 0 1 2))
= λ. 0 3 (λ. shift(2,2) (0 1 2))
= λ. 0 3 (λ. 0 1 4)

Shifting 'moves'.

Capture avoiding substitution

A substitution operation that makes sure that the bound variable names of terms are kept distinct from the free variable names.

This is the kind of substitution that we usually mean when we just say substitution.

Free variables

FV(x)     = {x}              -- variable
FV(λx.t)  = FV(t) - {x}      -- abstraction
FV(t₁ t₂) = FV(t₁) ∪ FV(t₂)  -- application

Recursion

Fixpoint

Effectively just recursion.

fix =
λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))

Reference

Like ref 3 in sml.

α-conversion

λx. x h can be α-reduced to λy. y h

Free variables are left untouched. Only the bound variable is renamed.

[x ↦ y z](λy. x y)

is problematic as the condition

y ∉ FV(y z)

is not met.

But if the term can be α-reduced like

[x ↦ y z](λk. x k)

there is variable capture.

β-reduction

Replace the occurrences of bound variable of the abstraction with the argument (on an application) in the body of the abstraction.

(λ.x x) y → [x ↦ y](λ.x x) → y

Strictness

Redex

A term of the form (λx. t₁) t.

Types of β-reduction

Full β-reduction

Normal order

id(id(λz. id z))
----------------
 ↓

id(λz. id z)
------------
 ↓

λz. id z
    ----
 ↓
λz. z

Call by name

id(id(λz. id z))
----------------
 ↓

id(λz. id z)
------------
 ↓

λz. id z

Call by value

id(id(λz. id z))
  --------------
 ↓

id(λz. id z)
------------
 ↓

λz. id z

β-reduction styles

:DBT: Signficance of the term 'call' here. There are no 'function call', isn't it? Just rewrites?

DONE Full β reduction

Any redux can be reduced at any time.

Eg:

id (id (λz. id z))
    --------------

id (λz. id z)
        -----

id (λz. z)
----------

λz. z

Reduction stops.


Another possiblity:


id (id (λz. id z))
            ----  

id (id (λz. z))
---------------

id (λz. z)
----------

λz. z

Reduction stops.

Call by need

DONE Call by name

Eg:

id (id (λz. id z))
------------------

id (λz. id z)
-------------

(λz. id z)

Reduction stops.
Even though the 'id z' inside the last abstraction could also be reduced.

DONE Call by value (??)

Eg:

id (id (λz. id z))
   ---------------

id (λz. id z)
-------------

(λz. id z)

Reduction stops.
Even though the 'id z' inside the last abstraction could also be reduced.

DONE Normal order

Eg:

id (id (λz. id z))
------------------

id (λz. id z)
-------------

λz. id z
    ----

(λz. z)

Reduction stops.

η-reduction

References:

Substitution

Shifting

Fixed point combinator

fix f = f (fix f)

https://en.wikipedia.org/wiki/Fixed-point_combinator

System F

Examples

map function

-- assumptions: types of available functions
nil : ∀X. List X
cons : ∀X. X -> List X
isnil : ∀X. List X -> Bool
head : ∀X. List X -> X
tail : ∀X. List X -> List X  (* everything except first element)

-- map function
λX. λY. λf:(X -> Y).
  (fix (λm: (List X) -> (List Y).  -- recursive function
    λl: List X.                    -- input list
      if isnil [X] l               -- from where did 'if' come from??
        then nil [Y]
      else
        cons [Y] (f (head [X] l))
    (m (tail [X] l))))

reverse function

append: ∀X. (List X) -> (List X) -> (List X)
append =
λX. (fix (λm: (List X) -> (List X) -> (List X).
  λa: (List X). λb: (List X).
  if isnil [X] a
  then b
  else
    cons [X] (head [X] a) (m (tail [X] a) y)))

reverse: ∀X. (List X) -> (List X)
reverse = 
λX. (fix λm: (List X) -> (List X).
  λa: (List  X).
  if isnil [X] a
  then nil [X]
  else
    append [X] (m (tail [X] a)) (head [X] a))

Church encodings

System Fω

System Fω<:

New terms

Normalization

Evaluation of a well-typed term is guaranteed to halt in a finite number of steps. ie, every well-typed term is normalizable. (src: tapl)

Normal form

Not every term has a normal form. Terms without a normal form are said to diverge (Divergent term).

Par example, the term

(λx. x x) (λx. x x).

Any attempt to reduce this term will resultant in the original term itself.

Impredicativity

A definition (of a set/type etc) is impredicative if it involves a quantifier whose domain includes the very thing being defined. (src: tapl C23)

Monotypes

Type variables that range only over quantifier-free types. (src: tapl C23)

Polytype or Type scheme

Quantified types.

Subtypes

Type preservation

 Γ ⊢ t : T         t → t'
==========================
    Γ ⊢ t' : T

Progress

TAPL exercises

5.2.1

See tag EX-5.2.1

6.1.1

Named term (question) de-Bruijn index term (answer)
λs. λz. z; λ.λ. 0;
λs. λz. s (s z); λ.λ. 1 (1 0);
λm. λn. λs. λz. m s (n z s); λ.λ.λ.λ. 3 1 (2 0 1);
λf. (λx. f (λy. (x x) y)) (λx. f (λy. (x x) y)); λ. (λ. 1 (λ. (1 1) 0)) (λ. 1 (λ. (1 1) 0));
(λx. (λx. x)) (λx. x); (λ. (λ. 0)) (λ. 0);

6.2.2

See tag EX-6.2.2

13.1.1

Question

Draw a similar diagram showing the effects of evaluating the expressions a = {ref 0, ref 0} and b = (λx: Ref Nat. {x, x}) (ref 0).

Answer

a = {ref 0, ref 0}


a = { _, _}
      |  |
      ↓  ↓
    +---+---+
    |   |   |
    +---+---+

where the two elements of the record point to different memory locations.

And

b = (λx: Ref Nat. {x, x}) (ref 0)


b = { _, _}
      |  |
      ↓  ↓
      ↘  ↙
        ↓ 
      +---+
      |   |
      +---+

where both elements of the record point to the same memory location.

15.2.1

Question

Draw a derivation showing that {x:Nat, y:Nat, z:Nat} is a subtype of {y:Nat}.

Answer

{x:Nat, y:Nat, z:Nat} <: {x:Nat, y:Nat, z:Nat} (S-Refl)
{x:Nat, y:Nat, z:Nat} <: {x:Nat, y:Nat} (S-RcdWidth, k=1)
{x:Nat, y:Nat, z:Nat} <: {y:Nat, x:Nat} (S-RcdPerm)
{x:Nat, y:Nat, z:Nat} <: {y:Nat} (S-RcdWidth, k=1)

External References

Using this definition, we can define identity function (λx:T. T):

idfn = Abs A A (λx:Term A. x)

where A is a type variable.

Logical frameworks

Systems that provide mechanisms for representing the syntax and proof systems which make up a logic.

Eg: HOAS

See: https://ncatlab.org/nlab/show/logical+framework

Barendregt cube

    Fω        CC
    +--------+ 
   /|       /|
F / |      / |
 +--------+  |
 |  |     |  |
 |  +-----|--+
 | /λω    | /
 |/       |/
 +--------+ 
λ→         LF

System-F: polymorphic lambda calculus

Interesting bits