Streams and Lazy Evaluation in OCaml

November 18, 2016

In this short post we’ll discuss lazy evaluation in OCaml and study a data structure called Stream. It’s based mainly on chapter 4 of Purely Functional Data Structures and it’s part of a series of study notes on that book.

Lazy Evaluation

Lazy evaluation is an evaluation strategy in which an expression is not evaluated immediately (it is suspended), and once it is evaluated for the first time the result is cached (memoized) for subsequent uses. Functional languages like Haskell are lazily evaluated, but not OCaml, which is eagerly evaluated. Because the results are memoized, expressions that are lazily evaluated must always return the same value given the same inputs. In Haskell this is easy to enforce because functions are pure, that is, they do not rely on side effects.

Lazy? Source: Flickr – Brian Gratwicke

In the book the author defines a notation for lazy evaluation:

datatype a susp = $ of a

In OCaml, we can work with lazily evaluated expressions through the Lazy module. The definition of a suspension is similar:

type 'a t = 'a lazy_t

and we can use the lazy construct. Let’s define a simple expensive function, a naive Fibonacci, which runs at O(2^n) time:

let rec fibo n =
  if n <= 1 then 1
  else (fibo (n - 1)) + (fibo (n - 2))
;;

We can create a lazy evaluated version of it:

let lazy_fibo n = lazy (fibo n);;

We can see that assigning it to a variable doesn’t cause the function to be executed:

let r = lazy_fibo 42;;

The author defines a matching operator ($) that causes a lazy expression to be evaluated, but I couldn’t find a corresponding operator in OCaml. Nevertheless, the Lazy module has the force() function, which does exactly that:

Lazy.force r;; (* It might take a while! *)

Note that if we execute the same expression again, the second time it returns much faster, because of the memoization.
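To make the memoization visible, here is a small sketch (mine, not from the book) that times two consecutive forces of the same suspension; the second call returns almost instantly because the result is cached:

let s = lazy_fibo 42;;

let t0 = Sys.time () in
let _ = Lazy.force s in
Printf.printf "first force: %.2fs\n" (Sys.time () -. t0);;

let t1 = Sys.time () in
let _ = Lazy.force s in
Printf.printf "second force: %.2fs\n" (Sys.time () -. t1);;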

We are now ready to introduce the stream data structure.

Streams

A stream is a lazy version of a linked list. Recall that a linked list is composed of nodes, each of which points to the next node or, equivalently, to the remainder of the list. The usual definition of a linked list is:

type 'a node = Nil | Node of 'a * 'a node

If we want to be explicit that a node is actually pointing to a sublist, we could use an intermediate type, listType:

type 'a node = Nil | Node of 'a * 'a listType
and 'a listType = 'a node

Note that node and listType are mutually recursive (they depend on each other), so we have to define them together by using the and construct.

In a stream, the pointer to the remainder of the list is lazily evaluated, so the type is:

type 'a streamCell = Nil | StreamCell of 'a * 'a stream
and
'a stream = ('a streamCell) Lazy.t

With this basic structure we can implement many of the list functions for streams.

Concatenation

Let’s start with the concat operator (++):

let rec (++) (streamA: 'a stream) (streamB: 'a stream): ('a stream) =
  let computedStreamA = Lazy.force streamA in
  match computedStreamA with
    | Nil -> streamB
    | StreamCell (elem, rest) -> lazy (StreamCell (elem, rest ++ streamB))
;;

Note that it never evaluates streamB and it only evaluates the first cell of streamA.

To help with testing, we can define a function to convert from a list:

let rec fromList (l: 'a list): ('a stream) = match l with
  | [] -> lazy Nil
  | x :: xs -> lazy (StreamCell (x, fromList xs))
;;

and a function that forces the evaluation of the entire stream, essentially converting it back to a list:

let rec toList (stream: 'a stream): ('a list) =
  let computedStream = Lazy.force stream in
  match computedStream with
    | Nil -> []
    | StreamCell (elem, rest) -> elem :: (toList rest)
;;
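As a quick sanity check, we can build two streams from lists, concatenate them and force the result back (a usage sketch of the functions above):

let s1 = fromList [1; 2; 3];;
let s2 = fromList [4; 5];;
let result = toList (s1 ++ s2);;  (* [1; 2; 3; 4; 5] *)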

Take

The take(n) function returns the first n elements from a stream. Like the concat function, only the first node of the stream is evaluated. The recursive call is suspended.

let rec take (n: int) (stream: 'a stream) : ('a stream) =
  if n = 0 then lazy Nil
  else
    let computedStream = Lazy.force stream in
    match computedStream with
      | Nil -> lazy Nil
      | StreamCell (elem, rest) -> lazy (StreamCell (elem, (take (n - 1) rest)))
;;

Drop

The drop(n) function removes the first n elements from a stream and returns the result. In this case, we need to evaluate all the n recursive calls:

let rec drop (n: int) (stream: 'a stream): ('a stream) =
  if n = 0 then stream
  else
    let computedStream = Lazy.force stream in
    match computedStream with
      | Nil -> lazy Nil
      | StreamCell (_, rest) -> drop (n - 1) rest
;;

take and drop look very similar but one is lazy while the other is not. That’s because the head of the stream is not suspended, but the tail is. In the drop case we need to find the (n+1)-th element that will be the new head of the stream. In the take case, we’re not changing the head, and since the tail is suspended, it can wait.
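To see this difference in practice, here is a small sketch (my own, not from the book) of a stream whose cells print a message when they are forced. take only forces the first cell, while drop forces every cell it skips over:

let rec naturalsFrom (n: int): int stream =
  lazy (print_endline ("forcing cell " ^ string_of_int n);
        StreamCell (n, naturalsFrom (n + 1)))
;;

let s = naturalsFrom 0;;  (* nothing is printed: the head is suspended *)
let t = take 3 s;;        (* prints "forcing cell 0" only; the rest waits *)
let d = drop 3 s;;        (* prints cells 1 and 2 (cell 0 was memoized above) *)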

Reverse

The reverse function reverses the order of the elements in a stream. In this case it’s more obvious that, since we’re changing the location of the head, it must be eagerly evaluated.

let reverse (stream: 'a stream): ('a stream) =
  let rec reverse' = fun oldStream newStream ->
    let computedStream = Lazy.force oldStream in
    match computedStream with
      | Nil -> newStream
      | StreamCell (elem, rest) -> reverse' rest  (lazy (StreamCell (elem, newStream)))
  in reverse' stream (lazy Nil)
;;

Conclusion

In this post we saw that OCaml is not lazily evaluated by default, but we can rely on the Lazy module to accomplish that. We also learned a new data structure, the stream, which is recursively lazily evaluated; operations like concat and take play well with laziness, while others like drop and reverse do not.

The full implementation with comments is available on github.

References

[1] OCaml Module Lazy
[2] cyocum – Mutually Recursive Types
[3] Implementing lazy from scratch in OCaml


US as an hexagonal map

November 4, 2016

In this post we’ll study a way to visualize maps in a hexagonal grid, in which each entity has uniform area. We’ll then model that as a mathematical problem.

One challenge in displaying data in maps is that larger-area countries or states tend to get more attention than smaller ones, even when, economically or population-wise, the smaller state is more relevant (e.g. New Jersey vs. Alaska). One idea is to normalize the areas of all the states by using symbols such as squares. Recently I ran into an NPR map that used hexagons and it looked very neat, so I decided to try building it in D3 and perform some analysis.

Below is the result of plotting the state populations (log scale):

US Hexmap: Population (log scale)

One important property of visualizing data in maps is familiarity of the location (you can easily find specific states because you remember where they are) and also adjacency patterns can provide insights. For example, if we plot a measure as a choropleth map and see that the West coast is colored differently from the Midwest, then we gain an insight we wouldn’t have by looking at a column chart for example.

Because of this, ideally the homogeneous area maps should preserve adjacencies as much as possible. With that in mind, we can come up with a similarity score. Let X be the set of pairs of states that share a border in the actual US map. Now, let Y be the set of pairs of states that share a border in the hexagonal map (that is, two hexagons sharing a side). The similarity score is the size of their symmetric difference and we can normalize by the size of the original:

(|X - Y| + |Y - X|) / |X|

The lower the score the better. In an ideal case, the border sets would match perfectly, for a score of 0.

The size of the symmetric difference between the two sets seems like a good measure of similarity, but I’m not sure about the normalization factor. I initially picked the size of the union of X and Y, but that wouldn’t let us model this problem as a linear program, as we’ll see next. The problem with using the size of X is that the score could theoretically be larger than 1; however, it’s trivial to place the hexagons in the grid in such a way that none of them are touching, making Y empty and the score exactly 1, so we can assume the optimal score is between 0 and 1.
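As a small worked example (with made-up numbers): suppose the real map has |X| = 10 adjacent pairs and the hexagonal layout preserves 6 of them while introducing 2 adjacencies that don’t exist in the real map. Then the score is (|X - Y| + |Y - X|) / |X| = ((10 - 6) + 2) / 10 = 0.6.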

Hexgrid coordinates convention

The score from the NPR maps is 0.67.

An Optimization Problem

Let’s generalize and formalize the problem above as follows: given a graph G = (V,E), and another graph H = (V_H, E_H) representing our grid, find the induced subgraph of H, I = (V_I, E_I), such that there’s a bijection f: V \rightarrow V_I and the size of the symmetric difference of f(E) and E_I is minimized (f(E) is an abuse of notation, meaning the bijection applied to both endpoints of each edge in E).

To make it clearer, let’s apply the definition above to the original problem. G represents the adjacency of states in the original map: V is the set of states and E is the set of pairs of states that share a border. H is the hexagonal grid: V_H is the set of all hexagons and E_H is the set of pairs of hexagons that are adjacent. We want to find a subset of the hexagons in which to place the states (hence the bijection from states to hexagons), and if two adjacent hexagons are both occupied, we consider the corresponding states adjacent; that’s why we need an induced subgraph, so the adjacency in the grid is preserved.

Is this general case an NP-hard problem? We can reduce the Graph Isomorphism problem to it. Graph Isomorphism consists of deciding whether two graphs A and B are isomorphic. If we set G = A and H = B, then A and B are isomorphic if and only if I = H and the symmetric difference of f(E) and E_I is 0. The caveat is that it’s not known whether Graph Isomorphism is NP-complete.

What if G is planar (which is the case for maps)? I haven’t given much thought to it, but I decided to come up with an integer programming model nevertheless.

An Integer Linear Programming Model

Note: the model uses the original grid analogy instead of the more general problem so that the constraints are easier to understand.

Boolean algebra as linear constraints

Before we start, we need to recall how to model logical constraints (AND, OR and EQUAL) using linear constraints. Let a and b be binary variables. We want x to be the result of logical operators applied to a and b.

For AND, we can do (x = 1 if and only if a = 1 and b = 1)

x \le a
x \le b
x \ge a + b - 1

For OR, we can do (x = 0 if and only if a = 0 and b = 0)

x \ge a
x \ge b
x \le a + b

For EQUAL, we can do (x = 1 if and only if a = b)

x \le 1 - (a - b)
x \le 1 - (b - a)
x \ge a + b - 1
x \ge -(a + b - 1)

We can introduce a notation and assume these constraints can be generated by a function. For example, if we say
x = \mbox{EQ}(a, b), we’re talking about the four constraints we defined above for modeling EQUAL. This is discussed in [2].
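Since these encodings are easy to get wrong, here is a small OCaml sketch (my own; not part of the model itself) that enumerates all 0/1 assignments and checks that the constraints above force x to the intended value:

let feasible_and x a b = x <= a && x <= b && x >= a + b - 1
let feasible_or x a b = x >= a && x >= b && x <= a + b
let feasible_eq x a b =
  x <= 1 - (a - b) && x <= 1 - (b - a) && x >= a + b - 1 && x >= -(a + b - 1)

let () =
  List.iter (fun (a, b) ->
    List.iter (fun x ->
      (* x is feasible exactly when it equals the intended boolean result *)
      assert (feasible_and x a b = (x = a * b));
      assert (feasible_or x a b = (x = min 1 (a + b)));
      assert (feasible_eq x a b = (x = (if a = b then 1 else 0))))
      [0; 1])
    [(0, 0); (0, 1); (1, 0); (1, 1)]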

Constants

Let G be the set of pairs (x,y) representing the grid positions. Let P be the set of pieces p that have to be placed in the grid. Let N(x,y) be the set of pairs (x',y') that are adjacent to (x, y) in the grid.

Let A_{v1, v2} represent whether v1 and v2 are adjacent to each other in the dataset.

“Physical” constraints

Let b_{x,y,s} be a binary variable that equals 1 if and only if state s is placed at position (x, y).

1) A piece has to be placed in exactly one spot in the grid:

\sum_{(x,y) \in G} b_{x,y,p} = 1 for all p \in P

2) A spot can only be occupied by at most one state:

\sum_s b_{x,y,s} \le 1 for all (x,y) \in G

Adjacency constraints

Let a_{p1, p2, x, y} be a binary variable that equals 1 if and only if piece p1 is placed at (x, y) and is adjacent to p2 in the grid.

3) a_{p1, p2, x, y} has to be 0 if p1 is not at (x,y) or p2 is not placed in any of (x,y)'s neighbors:

a_{p1, p2, x, y} = \mbox{AND}(\sum_{(x', y') \in N(x, y)} b_{x', y', p2}, b_{x,y,p1})

This gives us that a_{p1, p2, x, y} is 1 if and only if p1 is at (x,y) and p2 is placed in one of (x,y)'s neighbors.

Finally, we can model the adjacency between two pieces p1 and p2. Let a_{p1, p2} be a binary variable and equals 1 if and only if p1 and p2 are adjacent in the grid:

a_{p1, p2} = \sum_{(x,y) \in G} a_{p1, p2, x, y}

Symmetric difference constraints

Let y_{p1, p2} be a binary variable that equals 1 if and only if a_{p1, p2} \ne A_{p1, p2}.

4) y_{p1, p2} \ge a_{p1, p2} - A_{p1, p2}
5) y_{p1, p2} \ge A_{p1, p2} - a_{p1, p2}

Objective function

The sum of all the y variables is the size of the symmetric difference:

\min \sum_{p1, p2 \in P} y_{p1, p2}.

Practical concerns

This model can be quite big. For our US map example, we have |P| = 50 and we need to estimate the size of the grid. A 50×50 grid is enough for any type of arrangement. The problem is that the number of variables a_{p1, p2, x, y} is |P|^2|G| = 50^4 which is not practical.

We can also solve the problem for the individual connected components of the original graph; it’s trivial to construct the optimal solution from each optimal sub-solution. This doesn’t help much in our example, since only Hawaii and Alaska are disconnected, so we have |P| = 48. The grid can also be reduced: it’s very unlikely that an optimal solution would be a straight line. In the NPR map the grid is 8×12, and doubling these dimensions should give the optimal solution enough room, so |G| = 8*12*4 = 384.

We can also assume states are ordered and only have variables a_{p1, p2, x, y} for p1 < p2, so the number of a_{p1, p2, x, y} variables is about 450K. Still too large, unfortunately.

Another important optimization we can do in our case because we're working with a grid is to define the adjacency for x and y independently and combine them afterwards.

Refined adjacency constraints

Instead of working with b_{x,y,s}, we use X_{x, s}, which equals 1 if and only if state s is placed at position (x, y) for some y, and Y_{y, s}, which equals 1 if and only if state s is placed at position (x, y) for some x. The physical constraints are analogous to the previous model:

6) A piece has to be placed in exactly one spot in the grid:

\sum_{x \in G} X_{x,p} = 1 for all p \in P
\sum_{y \in G} Y_{y,p} = 1 for all p \in P

7) A spot can only be occupied by at most one state:

\sum_s X_{x,s} \le 1 for all x \in G
\sum_s Y_{y,s} \le 1 for all y \in G

In a hexagonal grid, if we have the piece p1 in position (x,y), it will be adjacent to another piece p2 if and only if p2 is in one of these six positions: 1: (x-1, y), 2: (x+1, y), 3: (x-1, y-1), 4: (x, y-1), 5: (x-1, y+1) or 6: (x, y+1). We can define two adjacency categories: Type I, which happens when p1.y - p2.y = 0 and |p1.x - p2.x| = 1 (cases 1 and 2); and Type II, which happens when |p1.y - p2.y| = 1 and p1.x - p2.x \ge 0 (cases 3, 4, 5 and 6).

Let’s define Y_{d=0, p1, p2, y} = 1 iff p1.y - p2.y = 0 for a given y. Similarly we define X_{|d|=1, p1, p2, x} = 1 iff |p1.x - p2.x| = 1, Y_{|d|=1, p1, p2, y} = 1 iff |p1.y - p2.y| = 1 and finally X_{d \ge 0, p1, p2, x} = 1 iff p1.x - p2.x \ge 0.

8) We can use the following constraints to model the variables we just defined:

Y_{d=0, p1, p2, y} = \mbox{EQ}(Y_{y, p_1}, Y_{y, p2})
X_{|d|=1, p1, p2, x} = \mbox{EQ}(X_{x, p1}, X_{x-1, p2} + X_{x+1, p2})
Y_{|d|=1, p1, p2, y} = \mbox{EQ}(Y_{y, p1}, Y_{y-1, p2} + Y_{y+1, p2})
X_{d \ge 0, p1, p2, x} = \mbox{EQ}(X_{x, p1}, X_{x, p2} + X_{x+1, p2})

9) Let Y_{d=0, p1, p2} = 1 iff p1.y - p2.y = 0 for some y. We can define analogous variables for the other cases:

Y_{d=0, p1, p2} = \sum_{y} Y_{d=0, p1, p2, y}
X_{|d|=1, p1, p2} = \sum_{x} X_{|d|=1, p1, p2, x}
Y_{|d|=1, p1, p2} = \sum_{y} Y_{|d|=1, p1, p2, y}
X_{d \ge 0, p1, p2} = \sum_{x} X_{d \ge 0, p1, p2, x}

10) Let T'_{p1, p2} = 1 iff p1 and p2 have the Type I adjacency and T''_{p1, p2} = 1 iff p1 and p2 have Type II adjacency:

T'_{p1, p2} = \mbox{AND}(Y_{d=0, p1, p2}, X_{|d|=1, p1, p2})
T''_{p1, p2} = \mbox{AND}(Y_{|d|=1, p1, p2}, X_{d \ge 0, p1, p2})

11) Finally, we say that p1 and p2 are adjacent iff either Type I or Type II adjacency occurs:

a_{p1, p2} = \mbox{OR}(T'_{p1, p2}, T''_{p1, p2})

The model for adjacency became much more complicated, but we were able to reduce the number of adjacency variables to roughly O(|P|^2 \sqrt{|G|}). The number of non-zero entries in the constraint matrix (which determines the size of its sparse representation) is roughly 11M, dominated by the type (8) constraints. I’m still not confident this model will run, so I’ll punt on implementing it for now.

Conclusion

In this post we explored a different way to visualize the US states map. I was mainly exploring the idea of how good of an approximation this layout is and a natural question was how to model this as an optimization problem. Turns out that if we model it using graphs, the problem definition is pretty simple and happens to be a more general version of the Graph Isomorphism problem.

I struggled with coming up with an integer programming model and couldn’t find one with a manageable size, but it was a fun exercise!

World Map?

One cannot help wondering if we can display the countries of the world in a hexagonal map. I’m planning to explore this idea in a future post. The main challenge is that the US states’ areas are more uniform than the countries’. For example, the largest state (Alaska) is 430 times larger than the smallest (Rhode Island), while the largest country (Russia) is almost 40,000,000 times bigger than the smallest (Vatican City).

Also, the layout of the US map was devised by someone from NPR and they did a manual process. I’m wondering if we can come up with a simple heuristic to place the hexagons and then perform manual adjustments.

References

[1] NPR Visuals Team – Let’s Tesselate: Hexagons For Tile Grid Maps
[2] Computer Science: Express boolean logic operations in zero-one integer linear programming (ILP)
[3] SOM – Creating hexagonal heatmaps with D3.js
[4] Github – d3/d3-hexbin

Data sources

[5] US State Borders
[6] Wikipedia – Population of US states and territories
[7] Tool to download Wikipedia tables as CSV
[8] List of US states and territories by area


Persistent Data Structures

October 22, 2016


In this post we’ll discuss some basic data structures in functional programming and their implementation in OCaml. Our main reference is the first two chapters of Purely Functional Data Structures by Chris Okasaki.

The first chapter explains persistent data structures and the second chapter provides examples of three common data structures in imperative programming and how to implement them in ML (a functional language).

Persistent Data Structures

The core idea behind designing data structures in functional languages is that they’re immutable, that is, if you want to modify it, a new copy has to be made.

If we modified a data structure in the naive way, we would need to clone the entire structure along with the modifications. Instead, this is done in a smart way so that different versions of the data structure can share parts of it. Two examples are given to illustrate this idea: lists and binary trees.

Immutable Linked Lists

Lists are a central native data structure in functional languages. In OCaml, they’re implemented as linked lists. From our data structures class, we know that inserting an element at the beginning of a list is an O(1) operation, but it requires modifying pointers in the data structure. Since data structures are immutable, we need to make a copy to perform this operation.

Note that we only need to make the new node point to the beginning of the original list, which means we don’t have to clone this list, we just reuse it.

Inserting in the middle of the list, on the other hand, requires copying the existing nodes until we reach the newly inserted element, which can then point to the remainder of the original list. Consider the example below, where we insert the element 4 in the second position of the list.

Inserting a new element (4) in a linked list

Note that inserting an element at the beginning of a linked list is much more efficient than at the end, both in terms of complexity and space. If we were to insert it at the end, we would require full copies and insertion would be O(length of the list). As an experiment, we can write a function that generates a list representing a range of integers in two ways.

In example 1, we create the list by inserting each element at the beginning of the list,

let rec list_range_fast start_range end_range =
  if start_range = end_range then []
  else start_range :: (list_range_fast (start_range + 1) end_range)
;;

In example 2 we do it at the end:

let rec list_range_slow start_range end_range =
  if start_range = end_range then []
  else (list_range_slow start_range (end_range - 1)) @ [end_range]
;;

If I run the slow version with start_range = 0 and end_range = 50000, it takes over a minute to run on my computer, while the fast version runs in a few milliseconds.

Immutable Binary Trees

Binary trees are a generalized version of a linked list and it works the same way. The key point to notice is that when inserting an element in a (binary) tree only the nodes in the path from the root to the inserted node need to have their pointers modified, so we need to clone a number of nodes that are at most the height of the tree. See the example below:

Inserting an element in a binary tree. Green nodes are the new ones

With this basic concept understood for simple constructs like linked lists and binary trees, we can construct basic data structures on top of them: heaps and balanced binary trees.

Leftist Heaps

A (minimum) leftist heap is a binary tree with the following properties:

1. The value of a node is smaller or equal the value of its children.

Let the spine of a node be the path defined by following the right children until a leaf is found (i.e. the rightmost path), and the rank of a node the length of its spine.

2. The rank of the left child is always greater or equal than the right child.

It’s possible to show that the rank of a leftist heap with n nodes is at most floor(log(n + 1)).

Insertion. Before talking about inserting an element in a heap, let’s define the more general merge() function, which merges two leftist heaps, A and B, into one. We compare the root of each heap, say A.val and B.val. If A.val <= B.val, we make A.val the root of the new heap, A.left the left of the new heap and the right of the new heap will be the merge of A.right and B. If A.val >= B.val we do an analogous operation.

Note that the result of the merge might violate property 2, since we’re adding a new node to the right subtree. This is easy to fix: we just swap the left and right subtrees when coming back from the recursion, if necessary. This is what makeTree() does.

Note that we always merge along the right spines, which means the number of merge steps is proportional to the largest rank of the original heaps, so we can do it in O(log(A.size + B.size)).

In the context of immutable structures, this implementation of heap is efficient because the only nodes whose pointers we need to mutate (even when a swap is needed) are on the spine of the trees, so it only needs to create O(log n) new nodes.

The insertion process consists in creating a heap with one element and merging into the target heap.

Returning the minimum element is trivial: we just need to return the element at the root. Removing the minimum is easy as well: just merge the left and right subtree using merge().
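To make the discussion concrete, here is a minimal sketch of a leftist heap for ints (names and details are mine; the github version is more complete):

type heap = Leaf | Node of int (* rank *) * int (* value *) * heap * heap

let rank = function Leaf -> 0 | Node (r, _, _, _) -> r

(* Rebuild a node, swapping children if needed so the child with larger rank stays on the left *)
let makeTree value a b =
  if rank a >= rank b then Node (rank b + 1, value, a, b)
  else Node (rank a + 1, value, b, a)

let rec merge heapA heapB = match (heapA, heapB) with
  | (Leaf, h) | (h, Leaf) -> h
  | (Node (_, x, leftA, rightA), Node (_, y, _, _)) when x <= y ->
    makeTree x leftA (merge rightA heapB)
  | _ -> merge heapB heapA  (* heapB has the smaller root: swap the arguments *)

let insert x heap = merge (Node (1, x, Leaf, Leaf)) heap
let findMin = function Leaf -> None | Node (_, x, _, _) -> Some x
let deleteMin = function Leaf -> Leaf | Node (_, _, left, right) -> merge left right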

My implementation for the leftist heap is on github.

Binomial Heaps

Binomial heaps are an alternative heap implementation. Before defining a binomial heap, let’s introduce the binomial tree. A binomial tree can be defined recursively based on a property called rank. A single node is a binomial tree of rank 0. A tree of rank r > 0 is formed by combining two trees of rank r-1, making one tree the leftmost child of the other. We denote this operation as linking.

Examples of binomial trees of different ranks

A binomial heap is a list of binomial trees ordered by increasing rank, with no trees with the same rank.

Insertion. Before talking about insertion of an element into a heap, let’s see how to insert a binomial tree of rank r into the heap. To keep the order of the rank of the tree list, we need to traverse through it to find the right place to insert it. Since we can’t have two trees with the same rank, if we run into this case we need to merge those two trees into one with rank r + 1. This can cascade so we might need to repeat this process with the new tree.

Linking a tree is a direct application of its definition. We just need to decide which of the trees will become child of the other. Since we want a minimum heap, we need to guarantee that the minimum element is always on top, so we always make the tree with smallest root to be on top.

Linking is a constant time operation, since we only need to update the pointer of the root of the top tree, which also means we only need to clone the root node to generate a new immutable binomial tree.

For the complexity of traversing the heap list, we can first note that a binomial tree of rank r has 2^r nodes, which means that a heap with n elements has at most log(n) trees. In fact, a neat way to represent a binomial heap of size n is by the binary representation of n. For example, 10(dec) = 1010(bin), so for a heap of size 10 we have a list of trees with ranks 1 and 3. This shows that we can traverse the list in O(log n), and in the worst case we might need to clone that many nodes.
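A rough sketch of linking and tree insertion for ints (my own naming; the github implementation differs in details):

type tree = Tree of int (* rank *) * int (* value *) * tree list
type binomialHeap = tree list  (* trees in increasing order of rank *)

let rank (Tree (r, _, _)) = r

(* Linking: combine two trees of the same rank, keeping the smaller root on top *)
let link (Tree (r, v1, children1) as t1) (Tree (_, v2, children2) as t2) =
  if v1 <= v2 then Tree (r + 1, v1, t2 :: children1)
  else Tree (r + 1, v2, t1 :: children2)

(* Insert a tree into the heap, cascading links while ranks collide *)
let rec insertTree t heap = match heap with
  | [] -> [t]
  | t' :: rest ->
    if rank t < rank t' then t :: heap
    else insertTree (link t t') rest

let insert x heap = insertTree (Tree (0, x, [])) heap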

Returning the minimum element requires us to scan the list of trees, because even though we know the minimum element is the root of a tree, we don’t know which tree it is, so this operation is O(log n). For removing this element, we can define an auxiliary function, removeMinTree(), which removes the tree with the minimum element from the tree list. Since we only want to remove the root of this tree, we need to re-insert the subtrees back to the heap.

One key observation is that, in a binomial tree of rank r, the children of the root are also binomial trees of ranks from 0 to r-1, which form a binomial heap. We can then define a merge() function that merges two heaps using an idea similar to a merge sort. If we refer back to the analogy of the binary representation of the heap size, the merge operation is analogous to adding two binary numbers!

My implementation for the binomial heap is on github.

Red Black Trees

A red-black tree is a binary search tree in which every node is either Red or Black. It respects the following invariants:

1. No red node has a red child
2. Every path from the root to an empty node has the same number of black nodes

Property. the height of a red-black tree of n nodes is at most 2*floor(log(n + 1))

Proof sketch. Suppose the tree has a path from the root to an empty node of length greater than 2*floor(log(n + 1)). This path must have more than floor(log(n + 1)) black nodes, because we cannot have two consecutive red nodes (invariant 1). By invariant 2, every path from the root to an empty node then has more than floor(log(n + 1)) black nodes, so if we remove all the red nodes we get a tree in which every such path is longer than floor(log(n + 1)), and hence with at least 2^(floor(log(n + 1)) + 1) - 1 nodes, which is greater than n, a contradiction.

Membership. Since a Red-Black tree is a binary search tree, search can be done in O(height of the tree) which is O(log n) by the property above.

Insertion. inserting an element in a Red Black tree is similar to inserting it into a binary search tree. The challenge is that it may violate one of the invariants. To avoid that we must perform re-balancing along the path we follow to insert the node in the tree.

We always color the inserted node Red. This doesn’t violate the Black nodes constraint, but it might violate the Red nodes one, in case its parent is also Red. If that happens, the grandparent of the inserted node is necessarily Black. We now have 4 possible scenarios, depicted as the top, right, bottom and left trees:

Unbalanced Red-Black trees and the result of the balancing operation

We assume that the subtrees a, b, c and d are balanced. For all these 4 cases we can perform a “rotation” to achieve the tree at the center.

The Black nodes constraint is not violated, because the path from the root still has the same number of Black nodes as before, and we fixed the consecutive Reds. Notice that y might violate the Red node constraints with its parent, so we need to do it recursively.

In terms of complexity, the insertion can be done in O(log n) and rebalancing takes a constant amount of time. Regarding the immutable structure, notice we only need to change nodes from the insertion path, so only O(log n) nodes have to be cloned.
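Here is a sketch of insertion with the balancing operation for integer keys (essentially Okasaki's textbook version; the github code may differ in details). Note the "or matching" pattern covering the four cases in a single rule:

type color = Red | Black
type tree = Empty | Node of color * tree * int * tree

(* The four red-red cases from the figure collapse into a single or-pattern *)
let balance = function
  | (Black, Node (Red, Node (Red, a, x, b), y, c), z, d)
  | (Black, Node (Red, a, x, Node (Red, b, y, c)), z, d)
  | (Black, a, x, Node (Red, Node (Red, b, y, c), z, d))
  | (Black, a, x, Node (Red, b, y, Node (Red, c, z, d))) ->
    Node (Red, Node (Black, a, x, b), y, Node (Black, c, z, d))
  | (color, left, value, right) -> Node (color, left, value, right)

let insert x tree =
  let rec ins = function
    | Empty -> Node (Red, Empty, x, Empty)
    | Node (color, left, y, right) as node ->
      if x < y then balance (color, ins left, y, right)
      else if x > y then balance (color, left, y, ins right)
      else node
  in
  match ins tree with  (* the root is always recolored Black *)
  | Node (_, left, y, right) -> Node (Black, left, y, right)
  | Empty -> Empty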

My implementation for the Red Black tree in Ocaml is on github.

Conclusion

The first two chapters from this book are really interesting. I have seen the binomial heap and Red-Black trees before in data structure classes and also implemented some data structures such as AVL trees in functional programming in the past (Lisp) during college.

I wasn’t aware of the immutability of data in functional programming until much later, when I was learning Haskell. Okasaki introduces this concept early, in Chapter 2, which lets us keep it in mind when studying the implementation of functional data structures.

He doesn’t make it explicit in Chapter 3 that the data structures presented are efficient in terms of the extra memory necessary to clone them, but it is easy to see.

The ML syntax is very similar to OCaml, but it was a good exercise implementing the code on my own. I tried making it more readable with comments and longer variable names. This also led me to learn a few OCaml constructs and libraries, including:

* How to perform assertions (See the binomial heap implementation)
* Unit testing (using OUnit2 library)
* The “or matching” pattern (see the balance() function in red_black_tree.ml)


Review: The Design of Everyday Things

September 5, 2016


Don Norman is the director of The Design Lab at University of California, San Diego.

In The Design of Everyday Things, he discusses human psychology, introduces many concepts around design and provides suggestions to improve the usability of products. He takes into account practical real world challenges, such as time and budgets constraints during development.

The book is divided into seven chapters which we’ll summarize in this short post.

1. The psychopathology of everyday things

This first chapter focuses on attributes of products that influence their usability. It introduces concepts such as affordances, mapping and feedback that improve usability. Affordances help people figure out what actions are possible without the need for labels or instructions. These are relationships (between human and object), not (object) properties.

Affordances make it obvious that this side is to be pushed. The asymmetric bar suggests which side of the door to press.

Sometimes it’s not possible to make actions obvious, in which case we need signifiers to help with it. Signifiers include messages, symbols and legends.

Signifiers, labels in this case, aid users in deciding whether to pull or push.

Mapping is useful when the controls the human interacts with are not in the same place as the object being controlled. A common example is a switch and a light. When there are many lamps to control, a physical correspondence between the switches and the lamps makes it easier to find out which switch controls each light.

The physical distribution of the switches maps to the actual location of the lights.

My first reaction on the switch above is that it looks ugly and cluttered. One message I got from the book is that good design is not necessarily beautiful and minimal – sometimes they’re conflicting even, because they might hide affordances and signifiers.

Feedback is communicating the result of an action immediately. This includes turning on the light of an elevator button when it has been pressed or, in web design, depressing a button and disabling it temporarily (if the result cannot be returned immediately).

Conceptual model is the ability for the user to keep a simplified version of the system in their mind, often relating to an existing product. One example is the use of terms like Desktop, Folders and Files in the GUI of an operating system, relying on the existing model of organization from an office.

One example of a bad conceptual model is an oven regulated by a thermostat. If you want to pre-heat the oven quicker, one natural idea is to set the temperature to the maximum and then lower it when it’s ready. The problem is that this is not how thermostat ovens work. They have a heater providing a constant flow of heat, and they control the temperature by turning it on and off. The longer you leave it on, the higher the temperature gets, but setting a higher target does not make the oven reach a given temperature any faster.

2. The psychology of everyday actions

This chapter focuses on the user side, more specifically, what goes on in their heads when interacting with a product. He proposes breaking down an action into stages.

Stages of an action

He discusses levels of processing: visceral (instinct), behavioral (habit) and reflective (conscious). In the picture above, the stages are aligned by these levels. Intention and evaluation are both at the conscious level, plan and interpretation at the behavioral, and finally execution and perception are visceral.

Users blame themselves. Humans are usually eager to blame other people in day-to-day life, but when interacting with machines they often blame themselves, even when the confusion is caused by bad design.

3. Knowledge in the head and in the world

This chapter focuses on how we use knowledge to interact with a product. He categorizes knowledge into two: knowledge in the head (memory) and knowledge in the world (conventions, standards).

Delving into the workings of memory, he talks about short-term vs. long-term memory and how short-term memory can only keep a few items “on cache” (using a computer analogy). The author mentions how constraints help remembering things, for instance why it’s easier to remember poems than prose: poems have a more rigid structure. He brings back ideas from Chapter 1, like conceptual models and mapping, which reduce the amount of things to remember.

Regarding knowledge in the world, a lot of conventions vary according to culture or country (e.g. which side of the road to drive on), which must be taken into account especially when developing systems available internationally.

Systems should rely more on knowledge in the world than in the head. Some systems rely on knowledge in the head on purpose, often for security reasons, for example reliance on passwords.

4. Knowing what to do: constraints, discoverability and feedback

This chapter focuses on how the product can help users to interact with it by limiting the universe of possible actions (constraints), making it easy to discover the right way to use it (discoverability) and providing feedback information along the way to tell users whether they’re using it correctly.

He categorizes constraints into four types: physical, cultural, semantic (derived from the purpose of the action) and logical (for example: there’s only one logical way to perform an action).

For discoverability the author analyzes the design of faucets, which have to make it easy for users to control water flow and temperature.

For feedback, he discusses the pros (does not require focused attention) and cons (annoyance, surrounding noise) of using sound as feedback.

5. Human error? No, bad design

In this chapter, the author focuses on user errors. He categorizes them into slips (execution errors) and mistakes (planning errors). Slips are easier to detect because they are a deviation from the expected plan, while mistakes may be executed correctly but follow the wrong plan.

He suggests designing for errors. This includes preventing errors in the first place (constraints), sanity checks (e.g. input validation), the option to undo actions, and making errors obvious and easy to correct.

6. Design thinking

This chapter provides a framework for the process of designing. It includes the double diamond: the first diamond tries to find/define the problem, while the second is to find the solution.

The analogy with the diamond shape is that in both phases it starts by expanding the range of ideas and then narrowing down to specific ones. More technically, he defines four phases in each of the diamonds:

1. Observation
2. Idea generation
3. Prototyping
4. Testing

Observation requires a deep understanding of a small set of customers (as opposed to other forms of observation such as large-scale general A/B testing).

Idea generation is basically brainstorming. This, together with prototyping and testing, should be an iterative process.

In the rest of the chapter the author discusses related topics of designing, how external factors influence the design process (budget and time constraints), the fact that the buyer might not be the end user (e.g. appliances for a rental place) and how making something harder to use might be desirable (such as to improve security and provide access control).

7. Design in the world of business

In this final chapter, the author focuses on real-world design. Besides budget and time constraints, one source of bloated design is the featuritis that arises from competition: if a competitor’s product adds a new feature, the product has to follow suit and add it too.

Another challenge with design arises from the fact that people don’t like changes. Improving the design or introducing a new technology sometimes doesn’t take off until much later, when people start getting used to it and adopting it. Around this theme, he discusses the tradeoffs of incremental and radical innovations, and argues that both are important for the development of products.

Conclusion


My impressions: I did like that the book uses consistent terminology to explain concepts and that the author provides a lot of examples. I also like the fact that he comes up with conceptual models, defining relationships between different concepts, such as the stages of an action.

I didn’t think the book was very organized. He does mention the book doesn’t have to be consumed linearly, but I did feel that the book was a collection of topics around a theme instead of a cohesive text. I’m used to technical books where you look at the table of contents and how the small parts (chapters) usually have well defined boundaries and how they assemble together to form the big picture.

Most of my work consists in developing Web interfaces for people to do their jobs better. Usability is a very important concept in this field, so I’m eager to learn more about this subject.

Thoughts: Usability of code

In light of a recent read, Code Complete 2, I’ve been constantly aware of the usability (readability) of source code. If we think about it, it shares similar challenges with end products and maybe it’s possible to leverage ideas from this book and apply them to coding.

Some analogies: good function names are affordances on how to use a function, sticking to code conventions is a good way to move knowledge from the head to the world (Chapter 3), comments can act as signifiers, and invariants and unit tests can act as constraints that convey the expected behavior of a function. Conceptual models are achieved by using good abstractions that map intuitively to the business rules the code is meant to implement.

As emphasized in the book, we write code for people, not for machines, so there’s no reason to not strive to make them as useful as products we interface with every day.


Web Workers

August 4, 2016

In this post we study Web Workers, a technology that allows JavaScript code to run in separate threads. We’ll start by exploring the API with some toy examples and at the end discuss some applications.

Introduction

By default JavaScript runs in a single thread (the main thread), which can be a problem for user experience: if expensive operations are performed there, they affect the responsiveness of the UI.

The thread that runs the web worker code has some environment limitations, which includes no access to the DOM or global objects such as window. [3] contains a list of functions and methods available to a Web Worker.

Besides that, memory is not shared between the threads, having to be explicitly serialized (so it can be cloned) and passed via a method (postMessage()). This can lead to performance issues if the amount of data to be copied is large. In Transferable Objects we’ll discuss an alternative.

Workers might spawn their own workers.

For some reason, the term “web workers” reminds me of these creatures from Spirited Away :)

Let’s work through a simple example. Imagine we have two files, main.js and worker.js (in the same directory):

main.js:

// Initializes the worker with the
// JavaScript code in worker.js
var myWorker = new Worker("worker.js");

// Post a message to the worker
myWorker.postMessage("Hello worker!");

// Callback that gets called when a 
// message is received from the worker.
myWorker.onmessage = function(/*MessageEvent*/ message) {
  console.log(
    'Message received from worker script: ', 
    message.data
  );
}

worker.js

onmessage = function(/*MessageEvent*/ message) {
  console.log(
    'Message received from main script: ', 
    message.data
  );
  postMessage("Hello main!");
}

Transferable Objects

By default data is copied when sending information back and forth between the main thread and the worker, using a process called structured cloning.

The serialization/deserialization can be expensive, in which case there is an alternative: transferable objects. More specifically, we can work with ArrayBuffers, which can be “transferred” instead of cloned, a more performant, O(1), operation.

We’ll first cover ArrayBuffers and then see how to apply it in a context of Web Workers.

ArrayBuffer

According to [5]:

The ArrayBuffer is a data type that is used to represent a generic, fixed-length binary data buffer. You can’t directly manipulate the contents of an ArrayBuffer; instead, you create a typed array view or a DataView which represents the buffer in a specific format, and use that to read and write the contents of the buffer.

ArrayBuffer basically represents an unstructured array of bits, which, to have any meaning/interpretation, needs a view, which can be an array of 32-bit unsigned ints or 16-bit unsigned ints. In the example below we create an array buffer of 100 bytes.


// Number of bytes or number of elements
var buffer = new ArrayBuffer(100);

// A 32-bit unsigned int array of length 10 (i.e. 40 bytes), starting 
// from byte 0 at the array buffer
var int32View = new Uint32Array(buffer, 0, 10);
// A 16-bit unsigned int array of length 20 (i.e. 40 bytes), starting 
// from byte 0 at the array buffer
var int16View = new Uint16Array(buffer, 0, 20);

// Fill in the 16-bit array with 0, 1, 2...
for (var i = 0; i < int16View.length; i++) {
  int16View[i] = i;
}

// The memory is shared because we're reading from the same chunk of 
// the byte array.
for (var i = 0; i < int32View.length; i++) {
  console.log("Entry " + i + ": " + int32View[i]);
}

This is a very interesting model. With ArrayBuffers one explicitly works with the serialized form of the data and creates views on top of it. I’m used to working views-first, that is, creating a class representing some data and eventually adding serialization/deserialization methods. One advantage of working with serialized data is that we don’t need to write the serialization methods, only the deserialization. The major disadvantage is that we need to know upfront how much memory we’ll have to use.

Transferable objects

We can extend the example above to be used between a worker and the main thread.

worker.js:

var buffer = new ArrayBuffer(100);
var int16View = new Uint16Array(buffer, 0, 20);

for (var i = 0; i < int16View.length; i++) {
  int16View[i] = i;
}

console.log('array buffer size', buffer.byteLength); // 100
postMessage(buffer, [buffer]);
console.log('array buffer size?', buffer.byteLength); // 0

and in the main.js:

...
myWorker.onmessage = function(e) {
  var buffer = e.data;
  // Number of bytes or number of elements
  var int32View = new Uint32Array(buffer, 0, 10);

  for (var i = 0; i < int32View.length; i++) {
    console.log("Entry " + i + ": " + int32View[i]);
  }
}

By logging the output to the console, we can see the main thread received the values written to the array buffer by the worker and after the worker transferred the buffer data, it was emptied out.

Note that in the postMessage() API we provide buffer as the first parameter and then it also appears in the list indicating it should be transferred, not copied. Having to pass it twice is a bit confusing in my opinion, but it allows the example below, in which the objects transferred are nested inside another structure (in this case an object) and we want to transfer both buffer1 and buffer2 but not the top-level object. I’m not sure which use case the API designers had in mind, though.

postMessage(
  {'key1': buffer1, 'key2': buffer2}, 
  [buffer1, buffer2]
);

Error Handling

If any errors are uncaught by the worker, they can be caught from the main thread through the onerror callback:

myWorker.onerror = function(e) {
  console.log('an error occurred:', e);
  e.preventDefault();
}

Where e is an instance of ErrorEvent. We can simulate an error on the worker.js code:

throw new Error("Some error occurred");

Terminating

The main thread can terminate the worker

worker.terminate();

or the worker can terminate itself:

close();

Applications

A lot of the examples using Web Workers involve doing some fake expensive calculation in the worker thread, but I haven’t found any real-world application.

StackOverflow offers some ideas, including ones that are dismissed as bad uses of Web Workers (polling) or that come from projects that are long defunct (Mozilla Skywriter). The main issue is that most of the time heavy processing is done on the server.

One idea that came to mind is to use web workers in React. React defers a lot of DOM work to the end by working with the concept of a virtual DOM. Web workers don’t have access to the DOM, but they do have access to a virtual DOM. Turns out this idea has been explored already [7, 8], but there were some technical difficulties in implementing events.

Conclusion

In this post we studied Web Workers and some examples utilizing them. I learned about a few other related topics like ArrayBuffers and Compositor Workers. I was a bit disappointed with the lack of compelling applications using Web Workers. I’ll try them out in some of my projects and see if I can get any benefits.

References

Some of the code presented in this post is available on Github.

[1] MDN – Using Web Workers
[2] HTML5 Rocks – The Basics of Web Workers
[3] MDN – Functions and classes available to Web Workers
[4] Google Developers – Transferable Objects: Lightning Fast!
[5] MDN – JavaScript typed arrays
[6] StackOverflow – What are the use-cases for Web Workers?
[7] React Custom Renderer using Web Workers
[8] GitHub React Issues: Webworkers #3092
[9] Compositor Workers Proposal


OCaml Modules

July 17, 2016


One prevalent piece of syntax in the OCaml code I’ve encountered is around modules, so I decided to study them a little bit to be able to better understand the code I’ve been reading.

This post is based on Chapters 4 and 9 from Real World OCaml [1, 2], which we started in the last post.

Defining modules

By default, a file defines a module. Module names are derived from filenames and are always capitalized (even if the filename is not). Let’s define a toy example:

myModule.ml:

let a = 1;;

main.ml:

Printf.printf "%d\n" MyModule.a;;

We can now compile using the built-in ocamlc (make sure to follow the setup here):

ocamlc myModule.ml main.ml

Note that the order is important. Since main.ml depends on myModule.ml, myModule.ml has to be included first. In case the files do not live in the same directory, we can use the -I option. For example, if myModule was in another_dir, we could compile using:

ocamlc -I another_dir/ another_dir/myModule.ml main.ml

A module has two parts: the definition and the signature. A module defined by a file filename.ml can be constrained by a signature in a file called filename.mli. The signature (.mli) should be included in the compilation step and appear before the definition. For example, we could have

myModule.mli:

val a : int;;

and compile using:

ocamlc myModule.mli myModule.ml main.ml

Submodules

It’s possible to define multiple modules inside a file through submodules. We can add this to myModule.ml:

module MySubmodule : sig
  val inc : int -> int
end = struct
  let inc x = x + 1
end

and in main.ml:

Printf.printf "%d\n" (MyModule.MySubmodule.inc 1);;

Note that we still need to provide the module name defined by the filename. The first part of the definition is the type signature of the module and the second the definition. The general syntax is:

module <module name> : sig
  type <abstract type name>
  val <name> : <type signature>
end = struct
  type <abstract type name> = <concrete type>
  let <name> = <definition>
end

Alternatively, we can separate the type definition from the implementation. In that case, we create a separate file with extension .mli containing the interface

myModule.mli:

module MySubmodule : sig
  val inc : int -> int
end

and in myModule.ml:

module MySubmodule = struct
  let inc x = x + 1
end

In general, the syntax for the signature is:

module type <module type name> = sig
  type <abstract type name>
  val <name> : <type signature>
end

and for the module definition is:

module <module name> : <module signature> = struct
  type <abstract type name> = <concrete type>
  let <name> = <definition>
end

Including modules

Modules are made available when linking during compile time, but if we want to use a function from a module, we still need to qualify it. We can alternatively open them explicitly to avoid qualifying module names (similar to using namespace in C++):

open Module
foo

Instead of:

Module.foo

We can also invoke open inline:

# let average x y =
    let open Int64 in
    x + y / of_int 2;;
val average : int64 -> int64 -> int64 = <fun>

or alias the module to a shorter name with the let module construct:

let print_median m =
  let module C = Counter in
  match m with
  | C.Median string -> printf "True median:\n   %s\n" string
  | C.Before_and_after (before, after) ->
    printf "Before and after median:\n   %s\n   %s\n" before after

Functors

Functors are functions that transform modules, so a functor is a function from a module to a module. A basic example is provided in [2]. First we define a toy signature:

module type X_int = sig 
  val x : int 
end;;

Then, we define a functor, which we call Increment:

module Increment (M : X_int) = struct
    let y = M.x + 1
end;;

What tells us this is a functor is the extra parameter the module takes (M : X_int). Here, M is the name we give to the module and X_int is its interface. Since the interface tells us M has the value x, we can access it within our functor. Note that Increment acts like a function, taking M (a module) as a parameter and returning another module, defined by the struct block. In this case, the type signature is different because the returned module has y instead of x. We can force the return type of a functor by adding a constraint:

module Increment (M : X_int) : X_int = struct
    let x = M.x + 1
end;;

Now, if we try to use y, we get a compilation error. To fix it, we just change y to x.

Functors cannot be used by themselves. They’re useful for creating modules out of existing modules. For an example, imagine we have a module implementing X_int:

module Three = struct
  let x = 3
end;;

We can create a new module Four, by transforming our Three module:

module Four = Increment(Three);;

(* Testing the modules *)
Printf.printf "%d\n" Three.x;; (* 3 *)
Printf.printf "%d\n" Four.x;;  (* 4 *)

“Abstract” functors

In [2], the authors provide an example of a MakeInterval module, in which there’s a dependent type. First they create a Comparable signature:

module type Comparable = sig
  type t
  val compare : t -> t -> int
end;;

to make it shorter (and less “real world”), I’ve created a simpler version here, MakeElement:

module MakeElement (InputModule : Comparable) = struct
  type t = Element of InputModule.t
  let create x = Element x
end;;

we can then create a module:

module IntElement = MakeElement(Int);;

The above works because Int satisfies the constraint defined by the Comparable module signature. The authors make a point here that sticking to standard conventions can improve reuse in cases like this. We can finally use it:

# let e = IntElement.create 10;;
val e : IntElement.t = IntElement.Element 10

The authors argue that the IntElement exposes implementation details because Element is “exposed”:

# let e2 = IntElement.Element 10;;
val e2 : IntElement.t = IntElement.Element 10

One solution is to constrain the return type of the functor and not expose Element in the signature. The signature would look like:

module type ElementInterface = sig
  type t
  type element
  val create : element -> t
end;;

and the functor would look like:

module MakeElement (InputModule : Comparable): ElementInterface = struct
  type t = Element of InputModule.t
  let create x = Element x
end;;

The problem is that the type element is not bound to anything, so we have to bind it explicitly when defining the functor. The construct is the following:

module MakeElement (InputModule : Comparable): 
  (ElementInterface with type element = InputModule.t) = struct
  type t = Element of InputModule.t
  type element = InputModule.t
  let create x = Element x
end;;

now MakeElement returns a module with interface ElementInterface, which doesn’t expose Element.

Destructive Substitution

One problem with the approach above is the redundant binding of type element. One slightly different syntax removes that requirement:

module MakeElement (InputModule : Comparable): 
  (ElementInterface with type element := InputModule.t) = 
struct
  type t = Element of InputModule.t
  let create x = Element x
end;;

We basically changed from = to :=, which is called destructive substitution.
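A quick usage check (my own sketch): rebuilding IntElement with this constrained functor, create still works but the Element constructor is no longer reachable from outside:

module IntElement = MakeElement(Int);;

let e = IntElement.create 10;;
(* val e : IntElement.t *)

(* let e2 = IntElement.Element 10;; *)
(* Error: Unbound constructor IntElement.Element *)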

Multiple Interfaces

It’s possible to “extend” more than one module signature when creating a new one. For example:

module type MyFunctor = sig
   include MyModuleInterface
   include MyModuleInterface2 with type t := t
end;;

Conclusion

Modules seem to be a very powerful concept in OCaml. Besides organizing files, modules can act as functions and can model concepts from Object Oriented Programming like classes and interfaces.

I’ve been following a different study strategy while learning OCaml. I try to read some real-world code and, when I get stuck understanding a piece of syntax or a pattern, I search for it in a book. This makes it more interesting than reading a book cover to cover.

References

[1] Real World OCaml – Chapter 4. Files, Modules, and Programs
[2] Real World OCaml – Chapter 9. Functors
[3] Compiling OCaml projects


Exploring OCaml

June 26, 2016


I’ve been learning the basics of OCaml in order to be able to follow the examples from Purely Functional Data Structures, from Chris Okasaki.

My background is that I have zero knowledge about OCaml, but having studied a bit of Haskell, I’m familiar with the main concepts in functional programming, so I’ll try to focus on syntax + difference between the two languages. I’ll probably skim through concepts from functional programming I’ve already covered previously on past Haskell posts.

Setup

I used to skip the setup from posts, but it might be useful to someone, especially with a similar environment, so I decided to include it here. This setup assumes MacOS and emacs.

Installing

Easily available on Brew:

$ brew install ocaml
$ brew install opam
$ ocaml

For other OS’es: https://ocaml.org/docs/install.html

Emacs syntax highlighting

Looks like tuareg is a popular mode for developing OCaml in emacs:

opam init
opam install tuareg

At the configuration file (~/.emacs.d/init.el) we can add:

(load "~/.opam/system/share/emacs/site-lisp/tuareg-site-file")

OCaml CLI

Typing ocaml in the terminal brings up the CLI, but it’s not very interactive: there’s no good way to go back to previous commands or edit text in the middle of a line. I’ve learned about a command line tool called rlwrap that adds this functionality. We can easily install it on macOS:

brew install rlwrap

And invoke ocaml like this:

rlwrap ocaml

We can also add a simple alias to have these features as default:

alias ocaml='rlwrap ocaml'

Language Basics

Statements

Statement boundaries are defined by ;;. Example:

# 1 + 1;;
- : int = 2

This lets us define multi-line expressions:

# 1 +
# 1
# ;;
- : int = 2

Comments

(* This is a single-line comment. *)

(* This is a
 * multi-line
 * comment.
 *)

Load code from file

It can become tedious to edit functions in the CLI. It’s possible to execute the contents of a file:

$ ocaml
# #use "my-file.ml";;

Basic Types

* int – 31 or 63 bits, depending on the platform; one of the bits is reserved for internal memory management

When writing literal number values, the underscore character is ignored (as long as it’s not the leading character). For example:

# 10_1___02__  + 1;;
- : int = 10103

This can be useful to define large numbers in a more user-friendly way:

# let x = 12_434_934;;
val x : int = 12434934

* float – IEEE double-precision floating point

OCaml doesn’t do implicit casts, in particular between ints and floats. We have to convert explicitly, using functions like int_of_float or float_of_int. Examples:

# 1 + 1;;
- : int = 2
# 1.3 +. 1.2;;
- : float = 2.5
# 1 + 1.0;;
Error: This expression has type float but an expression was expected of type int
# 1 +. 1.;;
Error: This expression has type int but an expression was expected of type float
# 1 + (int_of_float 1.0);;
- : int = 2
# (float_of_int 1) +. 1.;;
- : float = 2.

Note that the sum operator for floats has an extra . character (+.)

* bool – true/false
* char – 8-bit ascii character
* string – more than a list of chars; it has an efficient internal representation

Variables

We can define variables using let:

# let a = 3 in
  let b = 4 in
  a + b;;
- : int = 7

This looks like imperative code at first glance, but it’s slightly different. The form is let <expr1> in <expr2>: expr1 is only made available inside the scope of expr2. For example:

# let a = 3 in
  let b = 4 in
  a + b;;
- : int = 7
# a;;
Error: Unbound value a

Here the variable a was defined only for the expression:

  let b = 4 in
  a + b;;

When we terminated it with ;;, a went out of scope. We can also bind multiple variables in the same expression, for example:

# let a = 3 and b = 4 in
  a + b;;
- : int = 7

Functions

Defining Functions

Example: A simple sum function

# let sum a b = a + b;;
val sum : int -> int -> int = <fun>
# sum 1 2;;
- : int = 3

Note how the type signature syntax is very similar to Haskell.

Explicit function type signature

Sometimes to avoid ambiguity, we might want to provide the types for the inputs/outputs of the function. For example, we might want to define a sum function only intended to be used by ints.

let intSum (a: int) (b: int) : int = a + b;;
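
With the annotations in place, the compiler rejects anything other than ints. We’d see something like:

# intSum 1 2;;
- : int = 3
# intSum 1.5 2.5;;
Error: This expression has type float but an expression was expected of type int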

Lambda functions

Denoted by the fun construct, it’s useful for passing simple functions as parameters (for example, to a map over lists).

# let doubleValues ls = 
  List.map (fun x -> 2*x) ls
;;
val doubleValues : int list -> int list = <fun>
# doubleValues [1; 2; 3];;
- : int list = [2; 4; 6]

Recursive functions

A function must be explicitly marked as recursive by adding rec which, according to [2], is due to technical reasons, mainly related to type inference.

Example: Computing the factorial of a natural number n:

# let rec factorial n =
  if n = 0 then 1
  else n * (factorial (n - 1))
;;
val factorial : int -> int = <fun>
# factorial 10;;
- : int = 3628800

Matching patterns

Similar to Haskell, OCaml has pattern matching, which we can use to decide which body of a function to apply. For example, to reverse a list we can do:

# let rec reverseList xs =
  match xs with
  | [] -> []
  | x :: xs -> (reverseList xs) @ [x]
;;
# reverseList [1; 2; 3];;
- : int list = [3; 2; 1]

The _ pattern matches all the remaining cases. For example, we could rewrite the reverseList function as:

# let rec reverseList xs =
  match xs with
  | x :: xs -> (reverseList xs) @ [x]
  | _ -> []
;;
# reverseList [1; 2; 3];;
- : int list = [3; 2; 1]

Labeled arguments

We can prefix a function parameter with ~ to indicate it’s a labeled (named) argument. Example:

# let div ~a ~b = float a /. float b;;
val div : a:int -> b:int -> float = <fun>
# div 10 2;;
- : float = 5.
# div ~b:10 ~a:2;;
- : float = 0.2

Note how the argument labels show up in the function’s type signature. This matters because we may pass a function with labeled arguments to another function that relies on those labels (it’s also useful as documentation).
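
For instance, here’s a small made-up example of a higher-order function whose parameter is expected to have the labels a and b (apply_10_2 is just an illustrative name):

# let apply_10_2 (f : a:int -> b:int -> float) = f ~a:10 ~b:2;;
val apply_10_2 : (a:int -> b:int -> float) -> float = <fun>
# apply_10_2 div;;
- : float = 5.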

If the variable name matches the named parameter, we don’t need to repeat ourselves:

# let a = 10;;
val a : int = 10
# let b = 2;;
val b : int = 2
# div ~b ~a;;
- : float = 5.

We can also curry using named arguments. For example, if we want to generate a new function with the value of b “bound”, we can do:

# let b = 2;;
val b : int = 2
# let half = div ~b;;
val half : a:int -> float = <fun>
# half ~a;;
- : float = 5.

When currying, a positional (non-labeled) argument is bound to the first non-labeled parameter, even if labeled parameters come before it in the signature. For example:

# let makeList3 ~a ~b c = [a; b; c];;
val makeList3 : a:'a -> b:'a -> 'a -> 'a list = <fun>
# let makeList2 = makeList3 1;;
val makeList2 : a:int -> b:int -> int list = <fun>
# makeList2 2 3;;
- : int list = [2; 3; 1]

Optional parameters

Optional parameters are prefixed with ? like sep in the example below:

# let concat ?sep x y =
  let sep = match sep with None -> "" | Some x -> x in
    x ^ sep ^ y
;;
val concat : ?sep:string -> string -> string -> string = <fun>

# concat "ab" "cd";;
- : string = "abbc"
# concat "ab" "bc" ~sep:",";;
- : string = "ab,cd"

The value coming from an optional parameter is either None or Some x. An optional parameter is also a labeled parameter.
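
It’s also possible to pass the option value itself (instead of the unwrapped value) by prefixing the label with ? at the call site. A quick sketch, using the concat defined above:

# let maybe_sep = Some "-";;
val maybe_sep : string option = Some "-"
# concat ?sep:maybe_sep "ab" "cd";;
- : string = "ab-cd"
# concat ?sep:None "ab" "cd";;
- : string = "abcd"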

In the example above, we use pattern matching to provide a default value to sep. There’s a shorter syntax for this:

let concat ?(sep=" ") x y =
  x ^ sep ^ y
;;
val concat : ?sep:string -> string -> string -> string = <fun>

By providing a default value, the sep variable is no longer wrapped in None/Some; it has the actual expected type, in this case string.
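
For instance, with this version the default separator is a space, so we’d expect:

# concat "ab" "cd";;
- : string = "ab cd"
# concat "ab" "cd" ~sep:",";;
- : string = "ab,cd"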

It can be tricky to curry in the presence of optional arguments. [2] discusses in detail the heuristics applied by the compiler in different scenarios.

Conclusion

In this post we covered the very minimum to get our hands dirty with some OCaml code and learn the basic syntax. I don’t plan to study [2] for now. I’m mostly interested in learning enough to follow along with Purely Functional Data Structures.

Some first impressions: the OCaml CLI is pretty limited, but thanks to rlwrap it becomes manageable. Haskell is more elegant, OCaml is more practical.

For the next post in the series I plan to study the most basic and simple data structure in functional programming: lists.

References

[1] OCaml.org – The Basics
[2] Real World OCaml – Chapter 2. Variables and Functions