#+LINK c1 https://www.smlnj.org/doc/CM/index.html
Based on some informal notes by Robert Harper.
https://www.cse.iitd.ac.in/~suban/COL100/smltutorial.pdf
ML
Meta-Language
- a functional programming language
- interactive
- strongly typed
- polymorphic type system
- supports abstract types
- statically scoped: identifiers references are resolved at compile time itself
- Has a type-safe exception mechanism
- Programs can be broken down into modules for incremental construction.
- An ML program = interdependent collection of structures held together by functors.
- Separate compilation possible by export or import of functors.
Chapter 2
Types
Classification of types
Atomic types
Instances are constants of the type.
unit
bool
int
real
string
Compound types
Instances are built from value constructors.
- Tuple: like
(true, ()) : bool * unit
- List:
[1,2]
which is same as1 :: 2 :: 3
- Record: like
{name="Jack Doe", age=34} : {age:int, name:string}
Short-hand notation for records (and tuples)
Since accessing a specific element out of a record is an often used operation, we have a notation to do that.
The syntax is like :: val <ident> = #<field> <record>
, where
<field>
is the field being accessed<record>
is the record (or tuple)<ident>
identifier for the variable being set
This is essentially same as :: val {<field>=<ident>, ...} = <record>
.
val r = {name="Jack", age=34}
val fname = #name r
(*
fname = "Jack" : string
*)
(*
Recall that the following yielded the same result:
val r = {name="Jack", age=34}
val {name=fname, age=num} = r
*)
And since tuples are just a special kind of records, this notation can be used to 'index' the elements of a tuple.
In this case, the 'indexing' starts from (1
and not 0
).
val t = ("one", 2, true)
val first = #1 t
val second = #2 t
val third = #3 t
(*
val first = "one" : string
val second = 2 : int
val third = true : bool
*)
Trying to access out of bounds of the record/tuple would give error though.
Equality
- Any type that an equality test on its values is said to admit equality.
- Every atomic type admit equality.
- Function types don't admit equality.
- For compound types, a type
α -> τ
admits equality iff bothα
andτ
admits equality
Variables
Value identifiers: Another name for variables in sml.
(* Value binding *)
val pi = 3.14 (* 3.14 : real *)
Variables in sml are not mere assignments.
Once an identifier is bound to a value, the value cannot be changed. Though the identifier may be rebound to another value.
val x = 1
val y = x
val x = 2
(* Even if value of [x] changed, [y] is still [1] *)
Binding multiple variables at once
Separate with an and
.
val x = 10 and y = 20 and z = 30;
Notice this, though:
val x = 1
val x = true and y = x
(* Value of [y] is [1]! Not [true].
Because that's the context at the time.
[x] was not yet changed when y is defined as they are in the same line.
*)
Because the bindings made with the and
are evaluated parallelly. The RHS of all expressions (like true
and x
in the above example) are evaluated first and then the resultant values are bound to the variables at once.
An environment is maintained which keeps track of the meaning of all the identifiers.
Sequential composition of variables
Semicolons can be used to create variables one by one (sequentially, instead of doing the same parallelly with and
).
val x = 3; val x = true and z = 5
(*
val x = 3 : int
val x = true : bool
val z = 5 : int
*)
Local variable declarations
Useful to assist in the creation of other variables.
Like:
local
val x = 2
in
val y = x div x
val z = x*y + 3*x
end;
(*
val y = 1 : int
val z = 8 : int
*)
Here, x
is available only during the time of binding of value to the variables y
and z
. ie, x
is local to the binding of y
and z
.
Similarly, we can use let
to make a variable that will be kept local to an expression.
let
val x = 10
in
2*x + 1
x*x + end;
(*
121 : int
*)
The part between the in
and end
is evaluated by the environment augmented by the declarations occurring between let
and in
.
Operators
Some operators
< | <= | > | >= | = | <> |
+ | / | ||||
sin | sqrt | exp | div | mod | real |
floor | size | ^ | |||
Operand1 | Operator | Operand2 | Result |
---|---|---|---|
"hello " | ^ | "world" | "hello world" |
34 | = | 34 | true |
22 | = | 22.0 | Error |
Operator | Operand | Result |
size | "hello" | 5 |
real | 3 | 3.0 : real |
floor | 3.14 | 3 : int |
ceil | 3.14 | 4 : int |
Some boolean operators
Operand1 | Operator | Operand2 | Result |
---|---|---|---|
false | andalso | true | false |
true | andalso | false | false |
true | andalso | true | true |
false | andalso | false | false |
false | oralso | true | false |
true | oralso | false | false |
true | oralso | true | true |
false | oralso | false | false |
Comments
Comments may be nested.
- Starts with:
(*
- Ends with:
*)
Pattern matching
- Patterns
- Made up from variables and constants using value constructors.
- Wildcard pattern
_
used to match something that may then be ignored.- No binding is created.
- A don't care.
val x = (12, true)
(*
(12, true) : int * bool
*)
val (left, right) = x (* Pattern! *)
(*
left = 12 : int
right = true : bool
*)
(* ** Wildcard pattern *)
val hd::_ = [1,2,3,4];
(* hd = 1 : int *)
(* ** Pattern matching on records *)
val r = {name="Jack", age=34}
val {name=fname, age=num} = r
(*
*)
(* ** Patten matching partially on a record *)
val r = {name="Jack", age=34}
val {name=fname, ...} = r (* Use only [name] *)
(*
fname = "Jack" : string
*)
Layered pattern matching
Useful for building complex patterns.
val x = (("left", "mid"), "right")
val (l as (left, mid), right) = x
(*
l = ("left","mid") : string * string (* two birds with one stone! *)
"left" : string
left = "mid" : string
mid = "right" : string
right = *)
Functions
Multiple arguments are passed using tuples.
Functions are also values.
Can't do pattern matching on functions.
We can't check equality of two functions (Is an undecidable problem).
Clausal function definitions :: Function definition with a clause for every constructor of argument.
Higher-order functions :: Functions that take other functions as arguments.
Polymorphic functions
Type of function can be dependent on the arguments.
fun append ([], y) = y
| append (x::xs, y) = x :: append (xs, y)(* append = fn : 'a list -> 'a list -> 'a list *)
1,2], [3,4])
append ([(* [1,2,3,4] : int list *)
1.1,2.2], [3.3,4.4]);
append ([(* [1.1,2.2,3.3,4.4] : real list *)
"We're", "having"], ["some", "good", "weather"]);
append ([(* ["We're"," having","some","good","weather"] : string list *)
true, false], [false, true]);
append ([(* [true,false,false,true] : bool list *)
In the above example, the function append
can operate on arguments of different types because it is polymorphic.
Because the body of append
makes no assumption that is specific to a particular with regard to its arguments.
User-defined functions
- The keyword
fun
is used to define functions.fun
introduces function bindings.
Some examples:
fun twice x = 2 * x (* twice = fn : int -> int *)
10 (* 20 : int *)
twice
fun fact x = if x > 0 then x * (fact (x-1)) else 1
(* fact = fn : int -> int *)
5 (* 120 : int *)
fact
fun plus (x, y) = x + y (* plus = fn : int -> int -> int *)
3, 2) (* 5 : int *)
plus(
(* nil is same as [] *)
fun is_nil (nil) = true
false
| is_nil (_ :: _) = (* is_nil = fn : 'a list -> bool *)
(* true *)
is_nil [] 1, 2] (* false *)
is_nil [
fun reverse (x::xs) = (reverse xs) :: [x]
| reverse [] = []
fun is_perfect x =
let
fun is_factor (n, f) = if (n mod f = 0) then true else false
in
TODO: XXX
Type constraint
Consider a function fun plus (x, y) : int = x + y
.
The : int
part is a type constraint.
Its purpose here is to say that only int
addition is allowed and that real
got to stay away.
(Although this seems to be what is inferred by sml in the case of this example by default. Even when the type constraint was absent here.)
(* Can handle only integers *)
fun plus (x, y) : int = x + y (* Same as [plus (x, y) = x + y] *)
(* fn : int * int -> int *)
3, 4)
plus ((* 7 : int *)
3.1, 4.2)
plus ((*
Error: operator and operand don't agree [tycon mismatch]
operator domain: int * int
operand: real * real
in expression:
plus (3.1, 4.2)
*)
(* Can handle only real numbers *)
fun plusReal (x, y) : real = x + y
(* fn : real * real -> real *)
3.1, 4.2)
plusReal ((* 7.3 : real *)
3, 4)
plusReal ((*
Error: operator and operand don't agree [overload conflict]
operator domain: real * real
operand: [int ty] * [int ty]
in expression:
plusReal (3,4)
*)
Function type
- Is of type
α -> τ
. α
is the domain typeτ
is the range type
Lambda functions
- Can be defined using the
fn
keyword.
val x = fn x => x + 1
(*
x = fn : int -> int
*)
5
x (*
6 : int
*)
Some interesting facts
unit
type consists of one value:()
.- Used when an expression has no interesting value or when a function is not to have arguments.
if
statement:else
clause is mandatory and may not be omitted.- Both
e1
ande2
inif e then e1 else e2
must be of the same type.
- Both
int
: Negative values are denoted using~
instead of-
.real
: Floating point numbers in sml- Functions like
+
and-
can be applied only to operands of same type. Mix and match not allowed.3 + 4
✓3.14 + 5
✗
div
(can use/
instead) andmod
not defined forreal
.- Type variables start with an
'
. Like'a list
- Tuple is actually a special type of record.
- A tuple type
σ * τ
is same as the record type{1: α, 2: τ}
where the field names are just numbers starting from 1. - So a tuple
{5,6}
is effectively same as{1: 5, 2: 6}
.
- A tuple type
- (Informally,) Integers are first-order objects.
Special values
NONE
:'a option
SOME
:fn : 'a -> 'a option
Defining new types
Three forms of type bindings.
Transparent type binding (aka type abbreviation)
Essentially just giving another name for a (usually complex) type.
Can be used to abbreviate a complex type expression, leading to improved readability.
type intpair = int * int
(* There is no difference between [int * int] and [intpair] *)
fun f(x:intpair) = let
val (l, r) = x
in
lend
type 'a pair = 'a * 'a
type boolpair = bool pair
Compound types
Uses the datatype
keyword.
Give a name, some type parameters (optional) and a set of constructors. To extend the type system of ML.
A datatype
declaration creates a new type constructor and a set of value constructors for that type.
(* Create a new type [color] with the 3 constructors *)
datatype color = Red | Green | Blue
fun isRed Red = true
false
| isRed _ =
(* Another example *)
datatype dinero = broke (* no money *)
of int (* denomination *)
| coin of int (* denomination *)
| note of string * int (* bank name, amount *)
| check
fun totalCoins (broke) = 0
| totalCoins (coin(count)) = count100
| totalCoins (note(count)) = count *
| totalCoins (check(bank, count)) = count
(* val totalCoins = fn : dinero -> int *)
totalCoins(broke);(* val it = 0 : int *)
100));
totalCoins(coin((* val it = 100 : int *)
100));
totalCoins(note((* val it = 10000 : int *)
Data types can be recursive. As in the case of a binary tree.
datatype btree = empty
| leaf of btree * btree
| node
fun btreeLen (empty) = 0
1
| btreeLen (leaf) = | btreeLen (lchild, rchild) = btreeLen lchild + btreeLen rchild
Here's a parametric datatype:
(* Typed binary tree *)
datatype 'a tbtree = empty
of 'a
| leaf of 'a btree * 'a btree | node
Abstract data types
Uses the abstype
keyword.
A data type with a set of functions defined on it.
Terms:
- The type itself: implementation type
- The functions: interface of the type
- Programs that use the implementation type: clients
Constructors of the implementation type are hidden from the client programs. Only the interface is visible to the clients. This allows the programmer to change the implementation details without affecting the way the clients use it.
Client is insulated from the implementation details.
Only the functions in the interface have access to the implementation details.
(* The interface comes between the [with] and the last [end] *)
abstype color = blend of int * int * int
with val white = blend (255, 255, 255)
and red = blend (255, 0, 0)
and green = blend (0, 255, 0)
and blue = blend (0, 0, 255)
fun mix(parts: int, blend(r,g,b),
int, blend(r',g',b')) =
parts': if parts<0 orelse parts'<0 then white
else let val t = parts + parts'
and rr = (parts*r + parts'*r') div t
and gg = (parts*g + parts'*g') div t
and bb = (parts*b + parts'*b') div t
in blend(rr, gg, bb)
end
end;
(* Okay, smlnj gives error for this example. But I guess we can get the idea. *)
Mutable values
Description | Code |
---|---|
Create reference cell | ref |
Get val of a ref cell | !var |
Modify value | var := newval |
> ref
'a -> 'a ref
> !
'a ref -> 'a
> op :=
'a ref * 'a -> unit
Sequential composition of statements in sml, as in
exp1; exp2
is syntactic sugar for
let
val _ = exp1
in
exp2end
Note: Functions of type typ -> unit
are also known as procedures as they are executed purely for their effects. Their return value is not used.
Thunks
- An unevaluated expression.
- aka computation
- aka suspension
- For lazy evaluation
Module system (Chapter 3)
A large program organized as relatively independent smaller modules with well-defined interfaces ⇒ easier maintenance
Motivation
- Ability for each module to be compiled separately.
- Way to assemble the modules into the complete program.
structure
ML: structure (stands for environment structure) (Non-ML languages: Modules/packages/clusters)
Because an environment gives meanings to all the identifiers.
When a variable is declared like val x = 3
, the environment remembers that now there is an identifier x
whose value is 3
of type int
.
- Program units
- For making sml programs modular.
- Consists of a collection components, including types and values.
signature
- Spec or description of a program unit.
- Like the type of a structure.
- Describes the structure to the outside world.
- Information that's known about the structure at compile time.
- Can explicitly define a signature to control what all things of the structure will be made visible to the outside world.
:>
are opaque,:
are transparently ascribed.
Functor
functor : structure → structure
- A function from structures to structures.
- Parametrized program units.
- ie, parametrized structures.
- Takes a structure and returns another structure.
- Basis for abstraction, a form of information hiding.
- Non-ML terms: Parametrized module or generalized package
mini-ML
sml can be translated into a minimal subset known as mini-ML.
mini-ML doesn't have the following:
- let expressions
- module system (ie, no structures and functors)
- pattern matching
- abstract types ??
Continuations in smlnj
28-Nov-2022
smlnj's continuation type:
- open SMLofNJ.Cont;
opening SMLofNJ.Cont
type 'a cont = 'a ?.cont
val callcc : ('a cont -> 'a) -> 'a
val throw : 'a cont -> 'a -> 'b
val isolate : ('a -> unit) -> 'a cont
type 'a control_cont = 'a ?.InlineT.control_cont
val capture : ('a control_cont -> 'a) -> 'a
val escape : 'a control_cont -> 'a -> 'b
See:
- https://en.wikipedia.org/wiki/Call-with-current-continuation
- https://www.smlnj.org/doc/SMLofNJ/pages/cont.html#SIG:CONT.callcc:VAL
- https://gitlab.com/piyush-kurur/popl/-/raw/master/examples/sml/continuation/threads.sml
- https://gitlab.com/piyush-kurur/popl/-/raw/master/examples/sml/continuation/demo.sml
CPS
Makes every aspect of data and control flow explicit.1
Tips
Settings
Reference: https://www.smlnj.org/doc/Compiler/pages/printcontrol.html
- Show objects instead of
#
:Control.Print.printDepth := 20
Equality operation
It's =
, not ==
.
1=2;
- val it = false : bool
fst and snd for a tuple
It's like indexing in sml.
1 (1,2,3);
#val it = 1 : int
0 (1,2,3);
- #115.2-115.5 Error: syntax error: deleting INT0 LPAREN
stdIn:115.6-115.9 Error: syntax error: deleting COMMA INT COMMA
stdIn:2 (1,2,3);
- #val it = 2 : int
4 (1,2,3);
- #116.1-116.11 Error: operator and operand don't agree [record labels]
stdIn:4:'Y; 'Z}
operator domain: {int ty] * [int ty] * [int ty]
operand: [in expression:
fn {4=4,...} => 4) (1,2,3)
(
- 3 (1,2,3);
- #val it = 3 : int
op
To use an infix operator as a normal function.
1 + 2;
- val it = 3 : int
op+ (1,2);
- val it = 3 : int
5 mod 6;
- val it = 5 : int
op mod (5, 6);
- val it = 5 : int
See: https://stackoverflow.com/questions/54812817/the-op-operator-in-sml
char
Character literals are denoted with a string of length 1 prefixed by a #
.
Eg:
- #"a";
val it = #"a" : char
- #"\\";
val it = #"\\" : char
- #"\n";
val it = #"\n" : char
- #"3";
val it = #"3" : char
- explode "abc";
val it = [#"a",#"b",#"c"] : char list
- #"aa";
stdIn:19.1-19.5 Error: character constant not length 1
List concatenation
1,2] @ [3,4];
- [val it = [1,2,3,4] : int list
Exercises
2.5.1
fun cirumference r = 2 * 3.14 * r
fun area r = 3.14 * r * r
2.5.2
fun abs x : real = if x < 0 then ~x else x
2.7.2 (incomplete)
abstype 'a set = set of 'a list
with val emptyset: 'a set =
fun singleton (e: 'a): 'a set =
fun union (s1: 'a set, s2: 'a set): 'a set =
fun member (e: 'a, s: 'a set): bool =
| member (e, set (h :: t)) = (e = h)orelse member(e, set t)
fun intersection(s1: 'a set, s2: 'a set): 'a set =
end;
Abstract types
- Kind of like classes in OOP-oriented languages.
- Perhaps the same can be achieved via signatures and structs ??
- Keywords:
abstype
…with
- No pattern matching on values since type is abstract. Has to rely on associated operations
Exceptions
(* exception exception_name of args *)
(* exception empty_list of unit *)
exception empty_list
(*
>>> empty_list;;
val it = empty_list(-) : exn
*)
(* hd: 'a list -> 'a *)
fun head nil = raise empty_list
| head (x::xs) = x
fun append l1 l2 =
(head l1) :: (append (tl l1) l2)handle empty_list => l2
(*
>>> append [1] [2,3];;
val it = [1,2,3] : int list
*)
exceptiontype: is the value returned while raising the exception ??(* exception exceptionname of exceptiontype *)
- value returned can be
handle
-ed exn
: sml's only extensible data type that is built-in- So can there be user defined extensible types ??
See:
Doubts
- Can we declare a variable explicitly to be of a particular type? As in the
intpair
example? - Infinitely large numbers
- orelse, andif
- integer division with
div
(infix)op div
is of typeint * int -> int
Misc
Bitvector
smlnj has a BitArray structure.
It seems have had a BitVector structure as well ??
Matrices
There is an Array2 structure for implementing 2D arrays.
- Could be handy for implmenting matrices.
- Built-in in smlnj
References
- link
- Programming in Standard ML (DRAFT: VERSION 1.2 OF 11.02.11.) by Robert Harper
- Standard ML Family GitHub Project
Anonymous functions and parenthesis
Parenthesis placement can be a source of confusion.
For example:
fun foo nil = fn x => true
fn y => false | foo _ =
would give error.
Error: syntax error: replacing EQUALOP with DARROW
This is because the parser thinks that the second line is continuation of the anonymous function described in the first line.
Make it clear to the parser that they are separate functions and that the second line is not another case of the first anonymous function with explicit parentheses.
fun foo nil = (fn x => true)
fn y => false | foo _ =
↩︎Compiling with continuations - Andrew Appel