Intro
From Chapter 5 of TAPL:
In the mid 1960s, Peter Landin observed that a complex programming language can be understood by formulating it as a tiny core caclulus capturing the language's essential mechanisms, together with a collection of convenient derived forms whose behaviour is understood by translating them into the core language.
Landin chose lambda calculus as core language.
All computations reduced to a few basic operations: function definition and function application
Semantic styles
Semantics specifies how the terms are evaluated. The meaning of the terms in the language.
Operational semantics
- Specifies behaviour of a language by specifying a simple abstract machine for it.
- This machine uses the terms of the language as instructions.
- Simple language ⇒ state of machine is a term and transition function performs a simplification to proceed to next state.
- Meaning of the term is the final state of the machine (when the term was the input).
- Sometimes useful to have multiple operational semantics for the same language. Some closer to the way machines see it and some closer to the way that humans see it. Then we can sort of chain proofs and prove correctness of the language. (I suppose this is how CompCert did it).
Denotational semantics
- Meaning of a term is taken as a mathematical object (like a number or function).
- Denotational semantics means finding semantic domains and defining an interpreter function mapping the terms of the language to elements in these domains.
- Highlights the essential properties of the language.
- Domain theory is a field that came out of the search for appropriate semantic domains for modeling various language features.
- Thoughtful selection of domains can let us derive powerful properties for reasoning about programs.
- Can't handle non-determinism and concurrency??
Axiomatic semantics
- Instead of assigning behaviour and then deriving laws and properties them, define the laws themselves first and then derive the behaviour.
- The laws themselves are the definition. Behaviour is whatever we can get from those laws.
- Meaning of a term is what can be proved about it.
- Focuses on the process of reasoning about programs.
- Procedures can't be handled properly??
Operational semantics
A tiny language with arithmetic operations along with ℕ and 𝔹 as types.
{- Syntax -}
-- terms
t = O -- constant zero
| succ t
| pred t
| iszero t
{- values -}
-- Evaluation rules
t₁ → t₂
===================
succ t₁ → succ t₂
t₁ → t₂
===================
pred t₁ → pred t₂
t₁ → t₂
=======================
iszero t₁ → iszero t₂
iszero O → true
iszero (succ nv) → false
pred O → O -- side-stepping the issue here.
pred (succ nv) → nv
Small step style
- Evaluation may be done in multiple steps.
- Computation progresses through individual steps guided by the evaluation relation till the term becomes a value.
Big step style
- aka Natural semantics
- Do everything in one step.
- Directly specifies that 'this term evaluates to this value'.
t ⇓ v
means that the termt
evaluates to the valuev
.
Stuck term
- A term for which there is no further step in the evaluation relation but is still not a value.
- ie, the term is in normal form but is not a value.
Notations
- Open term: Term with at least one free variable.
- Closed term: Term with no free variables. All variables are bound.
t → t'
means thatt
evaluates tot'
in a single step from the evaluation relation.- Value: A 'term' that has finished computing and cannot be reduced further.
Curry-style and Church-style
Two styles in which simply typed lambda calculus can be formulated.
Curry-style | Church-style |
---|---|
Define terms, define semantics as to how | Define terms, identify well-typed terms |
they behave, give a type system that | with a type system, then give |
reject some terms. | semantics just to those terms. |
Semantics is prior to typing. | Typing is prior to semantics. |
Behaviour of ill-typed terms may be | Behaviour (ie, semantics) of ill-typed |
defined to some extent. | terms not even defined. |
Historically used for implicitly typed | Commonly used for explicitly typed |
lambda calculi. | presentations of lambda calculi. |
Type erasure
- Removing types. Effectively a conversion to a form of untyped lambda calculus.
- Many compilers don't carry along type annotations for run-time.
- Type checking is done at compile time after which the typing info is no longer needed and may be 'erased'.
erasure(x) = x
erasure(λx:T.t) = λx. erasure(t)
erasure(t₁ t₂) = erasure(t₁) erasure(t₂)
t₁ → t₂
, thenerasure(t₁) → erasure(t₂)
erasure(t) → m'
, then∃t'. t → t'
anderasure(t') = m
Type reconstruction
- Type substitution (σ)
- a finite mapping from type variables to types.
Multiple argument functions
By definition, abstraction takes only one argument. But currying can be done to get the effect of functions with as many arguments as needed.
de Bruijn indexing
Nameless lambda calculus.
Use numbers instead of names for variables.
If used, no need of α-conversion as there can't be name collisions if there are no names. :)
Terms with free variables
ie, open terms.
We don't know how far is the binder (ie, λ) for a free variable. We need a naming context.
For example, in λx. y x
, we know that x
is 0
but are not sure about y
.
Use a naming context (which is basically an assignation of de-Bruijn indices to the free variables) and use this assignment consistently.
Consider a naming context like
Γ = x ↦ 4
| y ↦ 3
| z ↦ 2
| a ↦ 1
| b ↦ 0
Then the terms could be 'translated' to:
| Named | Nameless |
|-----------+----------|
| x (y z) | 4 (3 2) |
| λw. y w | 4 0 |
| λw. λa. x | 6 |
for each binder, we got to increment the de-Bruijn index of the free variables.
Free variables
But can't have free variables in this notation? Though that isn't much of a problem as any term with a free variable can be made into a closed term.
Like: λx. x y
being made into λy. (λ.x x y)
.
Kinds of lambda calculus
- Untyped lambda calculus
- Typed lambda calculus
- Pure typed/untyped lambda calculus
- Simply typed lambda calculus
Need and motivation for lambda calculus
- Can be viewed simultaneously as a programming language in which computations can be described and as a mathematical object with which we can reason about.
- Good tool for analyzing programming languages
- Simple syntax to represent computable functions (ie, computations).
Uses of lambda calculus
It's so simple (just 3 rules for untyped lambda calculus!), yet it can express any computable function.
Church-Turing hypothesis
Lambda calculus with β-reduction is as powerful as a Turing machine.
It can express any computable function.
Untyped lambda calculus (utlc)
Syntax
term = x -- variable
| λx. t -- abstraction
| t₁ t₂ -- application
Substitution operation
We use an auxiliary operation called shifting for help in doing substitution.
The name of the bound variables are irrelevant as long as the naming is consistent.
For example, λx. x
and λw. w
are both same. The change in name doesn't make them different. Both are simply different forms of the identity function.
Shifting
- Remembers the de-Bruijn indices of free variables in a term.
To handle stuff like
[1 ↦ s](λ.2)
because as we are doing the substitution by going inside the body of the lambda (ie, the abstraction), the context becomes longer by one term (the variable over which the abstraction is done is added to the context).
To ensure that the de-Bruijn indices still refer to the proper lambda terms, we would need to increment ('shift up') the all de-Bruijn indices of all free variables inside the body of the abstraction.
Note that we can't just shift all variables just like that. Only free variables (ie, the variables which are bound somewhere outside the abstraction being examined) need to be shifted.
Bound variables should not be shifted.
shift(d, c) k = k, if k < c
| k + d, if k ≥ c
shift(d, c) (λ.t) = λ. shift(d, c+1) t
shift(d, c) (t₁ t₂) = (shift(d, c) t₁) (shift(d, c) t₂)
-- if c is not mentioned, it is assumed to be 0
:EX-6.2.2:
-- Exercise: 6.2.2
-- Part 1
shift(2) (λ.λ. 1 (0 2))
= λ. shift(2, 1) (λ. 1 (0 2))
= λ. λ. shift(2, 2) (1 (0 2))
= λ. λ. 1 (0 4)
-- Part 2
shift(2) (λ. 0 1 (λ. 0 1 2))
= λ. shift(2,1) (0 1 (λ. 0 1 2))
= λ. 0 3 (λ. shift(2,2) (0 1 2))
= λ. 0 3 (λ. 0 1 4)
Shifting 'moves'.
Capture avoiding substitution
A substitution operation that makes sure that the bound variable names of terms are kept distinct from the free variable names.
This is the kind of substitution that we usually mean when we just say substitution.
Free variables
FV(x) = {x} -- variable
FV(λx.t) = FV(t) - {x} -- abstraction
FV(t₁ t₂) = FV(t₁) ∪ FV(t₂) -- application
Recursion
Fixpoint
Effectively just recursion.
fix =
λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))
Reference
Like ref 3
in sml.
- Impure
- Useful for implicit communication via shared state.
- Allow some side effects.
- Significantly improve practical ease of use.
- Aliasing, which offers its advantages, also makes reasoning about difficult.
α-conversion
- aka α-renaming.
- Renaming bound variables.
- To avoid name collisions.
- Avoid free variable capture.
λx. x h
can be α-reduced to λy. y h
Free variables are left untouched. Only the bound variable is renamed.
[x ↦ y z](λy. x y)
is problematic as the condition
y ∉ FV(y z)
is not met.
But if the term can be α-reduced like
[x ↦ y z](λk. x k)
there is variable capture.
β-reduction
Replace the occurrences of bound variable of the abstraction with the argument (on an application) in the body of the abstraction.
(λ.x x) y → [x ↦ y](λ.x x) → y
Strictness
- strict
- lazy
Redex
A term of the form (λx. t₁) t
.
- Reducible expression.
- Term coined by Alonzo Church.
- β-reduction is the process of rewriting a redex as per evaluation rules.
Types of β-reduction
Full β-reduction
- Any redex can be reduced at any time.
- Each step picks a redex anywhere inside the term being evaluated and reduces it.
Normal order
- Left-most, outer-most redex is reduced first.
id(id(λz. id z))
----------------
↓
id(λz. id z)
------------
↓
λz. id z
----
↓
λz. z
Call by name
- Similar to normal order but no reductions inside abstractions. (names preserved??)
- A lazy strategy. Evaluation happens only if the term is actually used.
- Haskell uses call by need, an optimized form of call by name form of β-reduction.
id(id(λz. id z))
----------------
↓
id(λz. id z)
------------
↓
λz. id z
Call by value
- Only outermost redexes are reduced (TODO: what does that mean??)
- Redex is reduces only when its right side has been reduced to a value.
- A strict strategy. Evaluation happens anyway, even if it not needed and the term is not being used.
- Used by most languages.
id(id(λz. id z))
--------------
↓
id(λz. id z)
------------
↓
λz. id z
β-reduction styles
- evaluation strategies.
- Choice of this style doesn't really matter as far as evaluation itself is concerned.
- Used by TAPL book.
:DBT: Signficance of the term 'call' here. There are no 'function call', isn't it? Just rewrites?
DONE Full β reduction
Any redux can be reduced at any time.
Eg:
id (id (λz. id z))
--------------
id (λz. id z)
-----
id (λz. z)
----------
λz. z
Reduction stops.
Another possiblity:
id (id (λz. id z))
----
id (id (λz. z))
---------------
id (λz. z)
----------
λz. z
Reduction stops.
Call by need
- Lazy. Arguments evaluated only when needed
- Used by Haskell (via rewrites)
DONE Call by name
- No reduction done inside an abstraction.
- Reduction stops if a lambda is encountered.
- Lazy. Arguments evaluated only when needed
Eg:
id (id (λz. id z))
------------------
id (λz. id z)
-------------
(λz. id z)
Reduction stops.
Even though the 'id z' inside the last abstraction could also be reduced.
DONE Call by value (??)
- Redex reduced only when the right operand is already in normal form (ie, a value)
- Only outermost redexes reduced.
- 'Functions are called only on values' ²
- Strict. Arguments always evaluated. Whether needed or not.
- Used by most languages
Eg:
id (id (λz. id z))
---------------
id (λz. id z)
-------------
(λz. id z)
Reduction stops.
Even though the 'id z' inside the last abstraction could also be reduced.
DONE Normal order
- leftmost, outermost redex is reduced first
- redexes inside abstractions also reduced
Eg:
id (id (λz. id z))
------------------
id (λz. id z)
-------------
λz. id z
----
(λz. z)
Reduction stops.
η-reduction
- Replace
λx.f x
withf
ifx
doesn't appear as a free variable inf
.
References:
- https://wiki.haskell.org/Eta_conversion
- https://en.wikipedia.org/wiki/Lambda_calculus#%CE%B7-reduction
Substitution
Shifting
Fixed point combinator
- Way for recursion.
- Combinators are closed lambda expressions (ie, no free variables).
fix f = f (fix f)
https://en.wikipedia.org/wiki/Fixed-point_combinator
System F
An extension of simply typed lambda calculus
Abstracts out varying types. Universal types.
A form of parametric polymorphism.
Well-typed terms in System F are normalizing.
- ie, evaluation of all terms terminates.
- For every term t, t is either in normalized form (ie, cannot be reduced further) or there is a typing rule
t → t'
that can reduce it (progress).
Type inference for System-F is undecidable. (Why??)
First discovered by the logician Jean-Yves Girard (1972) in the context of proof theory
Then independently by the computer scientist John Reynolds (1974)
- Reynolds named it Polymorphic lambda calculus.
aka second-order lambda calculus
- Corresponds to second-order intuitionistic logic via Curry-Howard correspondence.
- Allows quantification over types (predicates) in addition to terms.
Has been extensively used as a research vehicle to study polymorphism and as the basis of many programming language designs.
Examples
map function
-- assumptions: types of available functions
nil : ∀X. List X
cons : ∀X. X -> List X
isnil : ∀X. List X -> Bool
head : ∀X. List X -> X
tail : ∀X. List X -> List X (* everything except first element)
-- map function
λX. λY. λf:(X -> Y).
(fix (λm: (List X) -> (List Y). -- recursive function
λl: List X. -- input list
if isnil [X] l -- from where did 'if' come from??
then nil [Y]
else
cons [Y] (f (head [X] l))
(m (tail [X] l))))
reverse function
append: ∀X. (List X) -> (List X) -> (List X)
append =
λX. (fix (λm: (List X) -> (List X) -> (List X).
λa: (List X). λb: (List X).
if isnil [X] a
then b
else
cons [X] (head [X] a) (m (tail [X] a) y)))
reverse: ∀X. (List X) -> (List X)
reverse =
λX. (fix λm: (List X) -> (List X).
λa: (List X).
if isnil [X] a
then nil [X]
else
append [X] (m (tail [X] a)) (head [X] a))
Church encodings
- In untyped lambda calculus.
- Primitive data values like bool, nat, list, etc can be encoded.
- Encoded as functions.
System Fω
- Doesn't allow dependent types (which are mappings from values to types).
System Fω<:
- System Fω with subtyping.
New terms
- Universal lambda functions?? link
- Point-free programming
- Type reconstruction (tapl C22)
- Type erasure (tapl C9?)
Normalization
Evaluation of a well-typed term is guaranteed to halt in a finite number of steps. ie, every well-typed term is normalizable. (src: tapl)
- TODO: Strong normalization
Normal form
- a term that cannot be further evaluated.
- cannot take another step under the evaluation relation.
- already a value.
Not every term has a normal form. Terms without a normal form are said to diverge (Divergent term).
Par example, the term
(λx. x x) (λx. x x)
.
Any attempt to reduce this term will resultant in the original term itself.
Impredicativity
A definition (of a set/type etc) is impredicative if it involves a quantifier whose domain includes the very thing being defined. (src: tapl C23)
Monotypes
Type variables that range only over quantifier-free types. (src: tapl C23)
Polytype or Type scheme
Quantified types.
Subtypes
- aka second order lambda calculus (via an analogy with second-order intuitionalistic logic)
- Quantification over types (analogous to predicates) possible.
- In addition to quantification over terms
- A form of polymorphism (subtype polymorphism).
- Discovered in 1972 by Jean-Yves Girard
- A single term is given many types. (using rule of subsumption?)
- sml doesn't have subtyping.
Type preservation
- Meaning is not lost.
t → t'
meanst
reduces tot'
Γ ⊢ t : T t → t'
==========================
Γ ⊢ t' : T
Progress
- Related to normalizability.
- For every term in the language:the term is either in normalized form (cannot be reduced further. Is already a value), or there is a typing rule (
t → t'
) that takes it a step closer to normalized form.
External References
Types and Programming Languages by Benjamin Pierce
Advanced topics in Types and Programming Languages by Benjamin Pierce
Lambda-calculus and combinators, an introduction by Hindley and Seldin
An introduction to the lambda calculus for computer scientists by Hankin (very basic)
Lectures on the Curry-Howard isomorphism by Sorensen and Urzyczyn (advanced)
Basic simple type theory by Hindley (advanced)
The lambda calculus: its syntax and semantics by Barendregt (authoritative)
https://mathoverflow.net/questions/69337/what-is-some-good-introduction-to-lambda-calculus
Using this definition, we can define identity function (λx:T. T):
idfn = Abs A A (λx:Term A. x)
where A
is a type variable.
Logical frameworks
Systems that provide mechanisms for representing the syntax and proof systems which make up a logic.
Eg: HOAS
See: https://ncatlab.org/nlab/show/logical+framework
Barendregt cube
- Also known as lambda cube. ¹
- See https://en.wikipedia.org/wiki/Lambda_cube
Fω CC
+--------+
/| /|
F / | / |
+--------+ |
| | | |
| +-----|--+
| /λω | /
|/ |/
+--------+
λ→ LF
- Top face: systems with polymorphism (ie, have term families indexed by types).
- Back face: systems with type operators.
- Right face: systems with dependent types
- λ→: simply typed lambda calculus
- CC: Calculus of constructions
- Fω: System-Fω
- LF: Logical framework
- aka Lambda-P
System-F: polymorphic lambda calculus
- aka λ2