Standard ML


Based on some informal notes by Robert Harper.

https://www.cse.iitd.ac.in/~suban/COL100/smltutorial.pdf


ML

Meta-Language

Chapter 2

Types

Classification of types

Atomic types

Instances are constants of the type.

Compound types

Instances are built from value constructors.

Short-hand notation for records (and tuples)

Since accessing a specific element out of a record is an often used operation, we have a notation to do that.

The syntax is like :: val <ident> = #<field> <record>, where

This is essentially same as :: val {<field>=<ident>, ...} = <record>.

val r = {name="Jack", age=34}
val fname = #name r
(*
fname = "Jack" : string
*)

(*
Recall that the following yielded the same result:

val r = {name="Jack", age=34}
val {name=fname, age=num} = r
*)

And since tuples are just a special kind of records, this notation can be used to 'index' the elements of a tuple.

In this case, the 'indexing' starts from (1 and not 0).

val t = ("one", 2, true)
val first = #1 t
val second = #2 t
val third = #3 t
(*
val first = "one" : string
val second = 2 : int
val third = true : bool
*)

Trying to access out of bounds of the record/tuple would give error though.

Equality

Variables

Value identifiers: Another name for variables in sml.

(* Value binding *)
val pi = 3.14  (* 3.14 : real *)

Variables in sml are not mere assignments.

Once an identifier is bound to a value, the value cannot be changed. Though the identifier may be rebound to another value.

val x = 1
val y = x
val x = 2
(* Even if value of [x] changed, [y] is still [1] *)

Binding multiple variables at once

Separate with an and.

val x = 10 and y = 20 and z = 30;

Notice this, though:

val x = 1
val x = true and y = x
(* Value of [y] is [1]! Not [true].
   Because that's the context at the time.
   [x] was not yet changed when y is defined as they are in the same line.
*)

Because the bindings made with the and are evaluated parallelly. The RHS of all expressions (like true and x in the above example) are evaluated first and then the resultant values are bound to the variables at once.

An environment is maintained which keeps track of the meaning of all the identifiers.

Sequential composition of variables

Semicolons can be used to create variables one by one (sequentially, instead of doing the same parallelly with and).

val x = 3; val x = true and z = 5
(*
val x = 3 : int
val x = true : bool
val z = 5 : int
*)

Local variable declarations

Useful to assist in the creation of other variables.

Like:

local
  val x = 2
in
  val y = x div x
  val z = x*y + 3*x
end;
(*
val y = 1 : int
val z = 8 : int
*)

Here, x is available only during the time of binding of value to the variables y and z. ie, x is local to the binding of y and z.

Similarly, we can use let to make a variable that will be kept local to an expression.

let
  val x = 10
in
  x*x + 2*x + 1
end;
(*
121 : int
*)

The part between the in and end is evaluated by the environment augmented by the declarations occurring between let and in.

Operators

Some operators

< <= > >= = <>
+ /
sin sqrt exp div mod real
floor size ^
Operand1 Operator Operand2 Result
"hello " ^ "world" "hello world"
34 = 34 true
22 = 22.0 Error
Operator Operand Result
size "hello" 5
real 3 3.0 : real
floor 3.14 3 : int
ceil 3.14 4 : int

Some boolean operators

Operand1 Operator Operand2 Result
false andalso true false
true andalso false false
true andalso true true
false andalso false false
false oralso true false
true oralso false false
true oralso true true
false oralso false false

Comments

Comments may be nested.

Pattern matching

Patterns
Made up from variables and constants using value constructors.
Wildcard pattern

_ used to match something that may then be ignored.

  • No binding is created.
  • A don't care.
val x = (12, true)
(*
(12, true) : int * bool
*)

val (left, right) = x   (* Pattern! *)
(*
left = 12 : int
right = true : bool
*)

(* ** Wildcard pattern *)
val hd::_ = [1,2,3,4];
(* hd = 1 : int *)

(* ** Pattern matching on records *)
val r = {name="Jack", age=34}
val {name=fname, age=num} = r
(*
*)

(* ** Patten matching partially on a record *)
val r = {name="Jack", age=34}
val {name=fname, ...} = r   (* Use only [name] *)
(*
fname = "Jack" : string
*)

Layered pattern matching

Useful for building complex patterns.

val x = (("left", "mid"), "right")
val (l as (left, mid), right) = x
(*
l = ("left","mid") : string * string    (* two birds with one stone! *)
left = "left" : string
mid = "mid" : string
right = "right" : string
*)

Functions

Polymorphic functions

Type of function can be dependent on the arguments.

fun append ([], y) = y
  | append (x::xs, y) = x :: append (xs, y)
(* append = fn : 'a list -> 'a list -> 'a list *)

append ([1,2], [3,4])
(* [1,2,3,4] : int list *)

append ([1.1,2.2], [3.3,4.4]);
(* [1.1,2.2,3.3,4.4] : real list *)

 append (["We're", "having"], ["some", "good", "weather"]);
(* ["We're"," having","some","good","weather"] : string list *)

append ([true, false], [false, true]);
(* [true,false,false,true] : bool list *)

In the above example, the function append can operate on arguments of different types because it is polymorphic.

Because the body of append makes no assumption that is specific to a particular with regard to its arguments.

User-defined functions

Some examples:

fun twice x = 2 * x      (* twice = fn : int -> int *)
twice 10                 (* 20 : int *)


fun fact x = if x > 0 then x * (fact (x-1)) else 1
                         (* fact = fn : int -> int *)
fact 5                   (* 120 : int *)


fun plus (x, y) = x + y  (* plus = fn : int -> int -> int *)
plus(3, 2)               (* 5 : int *)


(* nil is same as [] *)
fun is_nil (nil) = true
  | is_nil (_ :: _) = false
(* is_nil = fn : 'a list -> bool *)
is_nil []                (* true *)
is_nil [1, 2]            (* false *)


fun reverse (x::xs) = (reverse xs) :: [x]
  | reverse [] = []
fun is_perfect x = 
  let
    fun is_factor (n, f) = if (n mod f = 0) then true else false

  in

TODO: XXX
Type constraint

Consider a function fun plus (x, y) : int = x + y.

The : int part is a type constraint.

Its purpose here is to say that only int addition is allowed and that real got to stay away.

(Although this seems to be what is inferred by sml in the case of this example by default. Even when the type constraint was absent here.)

(* Can handle only integers *)
fun plus (x, y) : int = x + y     (* Same as [plus (x, y) = x + y] *)
(* fn : int * int -> int *)

plus (3, 4)
(* 7 : int *)

plus (3.1, 4.2)
(*
Error: operator and operand don't agree [tycon mismatch]
  operator domain: int * int
  operand:         real * real
  in expression:
    plus (3.1, 4.2)
*)
(* Can handle only real numbers *)
fun plusReal (x, y) : real = x + y
(* fn : real * real -> real *)

plusReal (3.1, 4.2)
(* 7.3 : real *)

plusReal (3, 4)
(*
Error: operator and operand don't agree [overload conflict]
  operator domain: real * real
  operand:         [int ty] * [int ty]
  in expression:
    plusReal (3,4)
*)

Function type

Lambda functions

val x = fn x => x + 1
(*
x = fn : int -> int
*)
x 5
(*
6 : int
*)

Some interesting facts

Special values

Defining new types

Three forms of type bindings.

Transparent type binding (aka type abbreviation)

Essentially just giving another name for a (usually complex) type.

Can be used to abbreviate a complex type expression, leading to improved readability.

type intpair = int * int
(* There is no difference between [int * int] and [intpair] *)

fun f(x:intpair) = let
                     val (l, r) = x
                   in
                     l
                   end

type 'a pair = 'a * 'a
type boolpair = bool pair

Compound types

Uses the datatype keyword.

Give a name, some type parameters (optional) and a set of constructors. To extend the type system of ML.

A datatype declaration creates a new type constructor and a set of value constructors for that type.

(* Create a new type [color] with the 3 constructors *)
datatype color = Red | Green | Blue

fun isRed Red = true
  | isRed _ = false


(* Another example *)
datatype dinero = broke                  (* no money *)
                | coin of int            (* denomination *)
                | note of int            (* denomination *)
                | check of string * int  (* bank name, amount *)

fun totalCoins (broke) = 0
  | totalCoins (coin(count)) = count
  | totalCoins (note(count)) = count * 100
  | totalCoins (check(bank, count)) = count

(* val totalCoins = fn : dinero -> int *)

totalCoins(broke);
(* val it = 0 : int *)

totalCoins(coin(100));
(* val it = 100 : int *)

totalCoins(note(100));
(* val it = 10000 : int *)

Data types can be recursive. As in the case of a binary tree.

datatype btree = empty 
               | leaf 
               | node of btree * btree

fun btreeLen (empty) = 0
  | btreeLen (leaf) = 1
  | btreeLen (lchild, rchild) = btreeLen lchild + btreeLen rchild

Here's a parametric datatype:

(* Typed binary tree *)
datatype 'a tbtree = empty 
                   | leaf of 'a
                   | node of 'a btree * 'a btree

Abstract data types

Uses the abstype keyword.

A data type with a set of functions defined on it.

Terms:

Constructors of the implementation type are hidden from the client programs. Only the interface is visible to the clients. This allows the programmer to change the implementation details without affecting the way the clients use it.

Client is insulated from the implementation details.

Only the functions in the interface have access to the implementation details.

(* The interface comes between the [with] and the last [end] *)
abstype color = blend of int * int * int
with val white = blend (255, 255, 255)
     and red = blend (255, 0, 0)
     and green = blend (0, 255, 0)
     and blue = blend (0, 0, 255)
     fun mix(parts: int, blend(r,g,b),
             parts': int, blend(r',g',b')) =
         if parts<0 orelse parts'<0 then white
         else let val t = parts + parts'
                  and rr = (parts*r + parts'*r') div t
                  and gg = (parts*g + parts'*g') div t
                  and bb = (parts*b + parts'*b') div t
              in blend(rr, gg, bb)
              end
end;
(* Okay, smlnj gives error for this example. But I guess we can get the idea. *)

Mutable values

Description Code
Create reference cell ref
Get val of a ref cell !var
Modify value var := newval
> ref
'a -> 'a ref

> !
'a ref -> 'a

> op :=
'a ref * 'a -> unit

Sequential composition of statements in sml, as in

exp1; exp2

is syntactic sugar for

let
  val _ = exp1
in
  exp2
end

Note: Functions of type typ -> unit are also known as procedures as they are executed purely for their effects. Their return value is not used.

Thunks

Module system (Chapter 3)

A large program organized as relatively independent smaller modules with well-defined interfaces ⇒ easier maintenance

Motivation

structure

ML: structure (stands for environment structure) (Non-ML languages: Modules/packages/clusters)

Because an environment gives meanings to all the identifiers.

When a variable is declared like val x = 3, the environment remembers that now there is an identifier x whose value is 3 of type int.

signature

Functor

functor : structure → structure

mini-ML

sml can be translated into a minimal subset known as mini-ML.

mini-ML doesn't have the following:

Continuations in smlnj

28-Nov-2022

smlnj's continuation type:

- open SMLofNJ.Cont;
opening SMLofNJ.Cont
  type 'a cont = 'a ?.cont
  val callcc : ('a cont -> 'a) -> 'a
  val throw : 'a cont -> 'a -> 'b
  val isolate : ('a -> unit) -> 'a cont
  type 'a control_cont = 'a ?.InlineT.control_cont
  val capture : ('a control_cont -> 'a) -> 'a
  val escape : 'a control_cont -> 'a -> 'b

See:

CPS

Makes every aspect of data and control flow explicit.[1]

Tips

Settings

Reference: https://www.smlnj.org/doc/Compiler/pages/printcontrol.html

Equality operation

It's =, not ==.

- 1=2;
val it = false : bool

fst and snd for a tuple

It's like indexing in sml.

#1 (1,2,3);
val it = 1 : int
- #0 (1,2,3);
stdIn:115.2-115.5 Error: syntax error: deleting  INT0 LPAREN
stdIn:115.6-115.9 Error: syntax error: deleting  COMMA INT COMMA
- #2 (1,2,3);
val it = 2 : int
- #4 (1,2,3);
stdIn:116.1-116.11 Error: operator and operand don't agree [record labels]
  operator domain: {4:'Y; 'Z}
  operand:         [int ty] * [int ty] * [int ty]
  in expression:
    (fn {4=4,...} => 4) (1,2,3)
- 
- #3 (1,2,3);
val it = 3 : int

op

To use an infix operator as a normal function.

- 1 + 2;
val it = 3 : int
- op+ (1,2);
val it = 3 : int


- 5 mod 6;
val it = 5 : int
- op mod (5, 6);
val it = 5 : int

See: https://stackoverflow.com/questions/54812817/the-op-operator-in-sml

char

Character literals are denoted with a string of length 1 prefixed by a #.

Eg:

- #"a";
val it = #"a" : char
- #"\\";
val it = #"\\" : char
- #"\n";
val it = #"\n" : char
- #"3";
val it = #"3" : char


- explode "abc";
val it = [#"a",#"b",#"c"] : char list


- #"aa";
stdIn:19.1-19.5 Error: character constant not length 1

List concatenation

- [1,2] @ [3,4];
val it = [1,2,3,4] : int list

Exercises

2.5.1

fun cirumference r = 2 * 3.14 * r
fun area r = 3.14 * r * r

2.5.2

fun abs x : real = if x < 0 then ~x else x

2.7.2 (incomplete)

abstype 'a set = set of 'a list
  with val emptyset: 'a set = 
  fun singleton (e: 'a): 'a set =
  fun union (s1: 'a set, s2: 'a set): 'a set =
  fun member (e: 'a, s: 'a set): bool =
    | member (e, set (h :: t)) = (e = h)
                                 orelse member(e, set t)
  fun intersection(s1: 'a set, s2: 'a set): 'a set = 
end;

Abstract types

Exceptions

(* exception exception_name of args *)
(* exception empty_list of unit *)

exception empty_list

(*
>>> empty_list;;
val it = empty_list(-) : exn
*)

(* hd: 'a list -> 'a *)
fun head nil = raise empty_list
  | head (x::xs) = x

fun append l1 l2 = 
  (head l1) :: (append (tl l1) l2)
  handle empty_list => l2

(*
>>> append [1] [2,3];;
val it = [1,2,3] : int list
*)

See:

Doubts

Misc

Bitvector

smlnj has a BitArray structure.

It seems have had a BitVector structure as well ??

Matrices

There is an Array2 structure for implementing 2D arrays.

References

Anonymous functions and parenthesis

Parenthesis placement can be a source of confusion.

For example:

fun foo nil = fn x => true
  | foo _ = fn y => false

would give error.

Error: syntax error: replacing  EQUALOP with  DARROW

This is because the parser thinks that the second line is continuation of the anonymous function described in the first line.

Make it clear to the parser that they are separate functions and that the second line is not another case of the first anonymous function with explicit parentheses.

fun foo nil = (fn x => true)
  | foo _ = fn y => false