In this post we are going to talk about Zippers and Comonads in Haskell. First we’ll present a basic definition of Zippers and show an application for trees. We’ll then talk about Comonads and how Zippers applied to lists can be seen as one.

The main reference for it was the corresponding chapter in Learn You A Haskell For Good [2] and a blog post relating Zippers and Comonads [5].

Definition

Zipper is a data structure first published by Gérard Huet [1] and it was designed to enable traversing and updating trees efficiently. It is called Zipper in an analogy to the movement of the zipper from clothes, because it was first applied to traverse trees and it can move up and down efficiently, like the zipper.

When traversing a tree using a zipper, we have the concept of focus which is the subtree rooted in the current node and an information that tell us from which direction we came.

We can define a simple binary tree (which we’ll call `Tree`) through the definition of a node as follows:

```data Tree a = Empty
| Node a (Tree a) (Tree a)
deriving (Show)
```

The node can either be empty or have some content and a left and right children. When we are traversing the tree we want to keep contextual information that allow us to traverse back in the tree.

In [2], this context is called a bread crumb, in a reference to the tale Hansel and Gretel in which the kids use bread crumbs to find the way back home.

To be able to return to the previous node, we need to know whether we came taking the right or the left child and also the subtree we decided not to take. This idea is structure as the following datatype that we name `Move`:

```data Move a = LeftMove a (Tree a)
| RightMove a (Tree a)
deriving (Show)
```

A move only allows us to go one step back, but in our case we want to be able to go back to the root of the tree, so we keep a list of moves:

```type Moves a = [Move a]
```

Given this, we can define function to go down in the tree, taking the left child (`goLeft`) or the right one (`goRight`):

```goLeft :: (Tree a, Moves a) -> (Tree a, Moves a)
goLeft (Node x l r, bs) = (l, LeftMove x r:bs)
```

In `goLeft`, we pattern match a node to get the current element, the left subtree (`l`) and the right subtree (`r`). We also need the list of movements `bs`. What we do is to move to the left node at the same time that we add a `LeftMove` to our list of moves.

Note that the `:` operator has lower priority than the function application, so

`LeftMove x r:bs` is equivalent to `(LeftMove x r):bs`.

We do an analogous operation for the right move:

```goRight :: (Tree a, Moves a) -> (Tree a, Moves a)
goRight (Node x l r, bs) = (r, RightMove x l:bs)
```

Given the current node and a list of moves performed to get there from the root, we can go easily up in the tree:

```goUp :: (Tree a, Moves a) -> (Tree a, Moves a)
goUp (t, LeftMove x r:bs) = (Node x t r, bs)
goUp (t, RightMove x l:bs) = (Node x l t, bs)
```

Through pattern matching, we can decide whether we came from a left or a right movement, retrieve the parent node and the other subtree that we didn’t take. With that we can reconstruct the subtree in the level above.

We then conveniently call this tree enhanced with the “breadcrumbs” as the `Zipper`:

```type Zipper a = (Tree a, Moves a)
```

While researching about Zippers, I found a blog post from Dan Piponi, relating Zippers to Comonads. It’s really nice because he writing a simple 1-D game of life and Zippers and Comonads are used to implement that in an elegant way.

A Comonad is a structure from Category Theory that represents the dual of a Monad. But for our purposes, we don’t need to know about it.

For his game, the author starts by defining an universe:

```data Universe x = Universe [x] x [x]
```

It’s basically a Zipper over a list, where the current element is represented by the second parameter, the first and third parameters represent the elements to the left and to the right of the element in focus, respectively.

One nice instance of this data type is representing the set of integers focusing in one particular element, say 0 for example:

```let Z = Universe [-1,-2..] 0 [1,2..]
```

In an analogy to the tree zipper, we can define functions to change the focus back and forth, in this case left and right. The implementation of these moves is straightforward:

```goRight (Universe ls x (r:rs)) = Universe (x:ls) r rs
goLeft  (Universe (l:ls) x rs) = Universe ls l (x:rs)
```

The author then defines a new typeclass called `Comonad` which is a special case of a `Functor`. This structure is available at `Control.Comonad` but it belongs to the package `comonad` which is not installed by default, so we need to get it through `cabal`:

```cabal install comonad
```

The documentation [6] for the Comonad says we need to implement the following methods:

```extract :: w a -> a
duplicate :: w a -> w (w a)
extend :: (w a -> b) -> w a -> w b
```

In the original post [5], `extract` is called `coreturn`, `duplicate` is called `cojoin`. The `=>>` still exists and corresponds to `extend` which has the default implementation:

```extend f == fmap f . duplicate
```

So in order to make the type `Universe` comonadic, we must make sure it implements `Functor`. Thus we can do:

```import Control.Comonad

instance Functor Universe where
fmap f (Universe ls x rs) = Universe (map f ls) (f x) (map f rs)
```

`fmap()` basically applies the function `f` to the entire list that this zipper represents. Now we can provide an implementation for the Comonad typeclass:

```instance Comonad Universe where
extract (Universe _ x _) = x
duplicate x = Universe (tail \$ iterate goLeft x) x (tail \$ iterate goRight x)
```

If we analyze the type description of `duplicate` and using the wrap analogy for Monads, we see that it’s wrapping an already wrapped element again.

The focus of this instance of `Universe` is the universe `x` we received as a parameter. The left list is an infinite list of all universes in which we go to the left of the current state of universe `x`. The right list is analogous. This forms a set of “parallel” universes.

Parallels Universes – Through the Looking-Glass

The `extract()` function extracts the element in focus of the universe.

With that definition in hand, we can write a rule to act upon a universe. For the game of life we can work with a universe of boolean values, representing dead or alive. A rule determines the next value of the element in focus `x` based on its surroundings. We can define the following rule:

```rule :: Universe Bool -> Bool
rule (Universe (l:_) x (r:_)) = not (l && x && not r || (l == x))
```

Before applying that, let’s write a printing function for a Universe. Since it is infinite, we can only print a small sample of it. First we define a function to go n positions to the right (if `n` is positive) or to the left (if `n` is negative):

```-- Move n positions to the right or i position to the left (if i is negative)
shift :: Universe a -> Int -> Universe a
shift u n = (iterate (if n < 0 then left else right) u) !! abs n
```

and then a function to get a sample of `len` elements to the left of `x` and `len` to the right:

```-- Return the array [-len, len] surrounding x
sample :: Int -> Universe a -> [a]
sample len u = take (2*len) \$ half \$ shift (-len) u
where half (Universe _ x ls) = [x] ++ ls
```

and finally a function to convert an array of booleans to a string:

```boolsToString :: [Bool] -> String
boolsToString = map (\x -> if x then '#' else ' ')
```

Combining these functions yields simple way to print `Universe` with an window of size 20:

```toString :: Universe Bool -> String
toString = boolsToString . sample 20
```

We can print a sample universe:

```toString  (Universe (repeat False) True (repeat False))
```

Notice that the rule applies only to the focused object. If we want to apply to ‘all’ elements, we can use the `extend` or the operator `(=>>)`

```toString \$ (=>> rule)  (Universe (repeat False) True (repeat False))
```

To run some iterations and print the whole process we can add some boilerplate code:

```putStr . unlines . (take 20) . (map toString) \$ iterate (=>> rule) example
where example = (Universe (repeat False) True (repeat False))
```

This will print out 20 lines of a Sierpinski Triangle!

Sierpinsk Triangle Construction

Conclusion

This subject was the last chapter of the Learn You a Haskell for Great Good! book which is excellent for newbies likes me. I’m still halfway in the Real World Haskell, which is more heavy-weight, but I also plan to finish it.

Zippers is the first non-trivial functional data structure I’ve learned. There is a famous book by Chris Okasaki, Purely Functional Data Structures, which I’m pretty excited to read.

Once again, going a step further and reading more about the subject, led me to learn a bit about Comonads. I’ve heard about that before and it sounded very complicated, but the its application to the example above is not very difficult.

References

[1] Functional Pearl: The Zipper – Gérard Huet – Journal of Functional Programming.
[2] Learn you a good haskell – Zippers
[3] Hackage: AvlTree
[5] A Neighborhood of Infinity – Evaluating Cellular Automata

The writer monad is useful when we want to attach some kind of logging to our value. The monadic type we’re going to work with takes a value of type `a` and the “logging” type `w`.

```newtype Writer w a = Writer {runWriter :: (a, w)}
```

The definition of the Writer Monad is:

```import Data.Monoid

instance (Monoid w) => Monad (Writer w) where
return x = Writer (x, mempty)
(Writer (x, v)) >>= f = let (Writer (y, v')) = f x
in Writer (y, v `mappend` v')
```

`Monoid` is a typeclass with a set of common functions that a monoid type must implement, which are, `mempty`, `mappend` and `mconcat`.

```class Monoid m where
mempty :: m

mappend :: m -> m -> m

mconcat :: [m] -> m
mconcat = foldr mappend mempty
```

The `mempty` function returns an empty instance of the type `m`, `mappend` is any binary function over 2 instances of type `m` and `mconcat` is a generalization of `mappend`, a function over a list of instance of type `m`. The `Monoid` typeclass provides a default implementation of this function consisting of a `foldr`.

For example, the list type `[]` is can be seen as a monoid with the following implementation:

```instance Monoid [a] where
mempty = []
mappend = (++)
```

So going back to the Writer monad definition, we have that the “logging” type must be a monoid type. And we define the monad over the type `(Writer w)` which can be seem as wapper on the type `a`.

The `return` function consists in wrapping an instance of the type a into the type `(Writer w)`, so the simplest one is to just have an empty instance of type `w`. Since it’s a monoid, we can use the functiion `mempty`.

The chain operator `(>>=)` takes an instance of type a wrapped into the `(Wrapper w)` and pass it to a function that operates on the type a and wraps the resulting type into `(Wrapper w)` again. In our implementation of the monad for `(Writer w)`, we extract the element `x`, apply `f` on it and compose a new log message combining the incoming log message and the one returned by function `f`, using the `mappend` function.

This implementation is available in the `Control.Monad.Writer` module. One simple example that uses the Writer Monad is a function that operates over numbers and log them:

```import Control.Monad.Writer

logNumber :: Int -> Writer [String] Int
logNumber x = Writer (x, ["Got number: " ++ show x])

multWithLog :: Writer [String] Int
multWithLog = do
a <- logNumber 3
b <- logNumber 5
return (a*b)
```

Which will return

```ghci> runWriter multWithLog
(15,["Got number: 3","Got number: 5"])
```

The state monad when we want to carry some internal state along our computation. One example is a parser code presented in [2], where the authors use the partial parsed code as an internal state.

In the same fashion as the writer monadic type, we have a type that depends on two type parameters, one of them representing the wrapped value and the other the type of the state.

The difference here is that instead of a pair of the two types, our new type is actually a function that receives a value of type s and wraps it into a tuple together with an instance of type a.

```newtype State s a = State { runState :: s -> (a,s) }
```

The definition of the State Monad is given by:

```
return x = State \$ \s -> (x,s)
(State h) >>= f = State \$ \s -> let (a, newState) = h s
(State g) = f a
in  g newState
```

The implementation of the `return` function is straightforward, it just returns a function that receives a state and returns a pair of `x` with this state. This function is then wrapped into the type `State`.

The chain operator is more involved. To extract the value of type a from a `State` type, we first need to extract the function `h` from it (by pattern matching `(State h)`), and then extracting the value a by providing an state to it. Note that we can assume this state `s` because this is done within a lambda that receives a state.

After extracting the value a from the first parameter of `(>>=)`, we can finally apply the function `f` to it. This will return an instance of `State`, from which we extract yet another function `g` that receives an state and returns a pair. We use the state that resulted after applying the state `s` to function h and return the pair resulting from applying the new state to this function.

The State Monad is also useful when we’re dealing with random number generation. In Haskell you have to keep a reference to the random number generator otherwise it will always generate the same number. The following example uses the random genetor as a state, which makes it simple to generate multiple random numbers:

```import System.Random

randomSt :: (RandomGen g, Random a) => State g a
randomSt = State random

threeCoins :: State StdGen (Bool,Bool,Bool)
threeCoins = do
a <- randomSt
b <- randomSt
c <- randomSt
return (a,b,c)
```

The Reader Monad [3] is similar to the State monad but in this case we have a state that is immutable, which we can see as an environment. The Reader type is a wrapper of a function that receives the environment `e` and returns an instance of type `a`. Since the environment is immutable, there’s no need to return it along with our value.

```newtype Reader e a = Reader { runReader :: e -> a }
```

```instance Monad (Reader e) where
return a = Reader \$ \_ -> a
in h e
```

For the `return` function, we have an analogous function as for the State Monad, but since in this case the function doesn’t need to return the environment, we return a function that doesn’t care about the parameter.

The chain operator is also a special case of the State Monad chain operator. We first extract the function `g` via pattern matching, within the lambda function we provide the environment `e` to `g` in order to retrieve our value of type `a` and apply the function `f`.

The function `f` will return another function `h` wrapped inside a Reader instance which we can extract using pattern matching again. This function wrapped back into a Reader instance is essentially the result of the chain operator.

References

[1] Learn You a Haskell for Great Good! – For a Few Monads More
[2] Real World Haskell – Chapter 10. Code case study: parsing a binary data format

Introduction

In this post we write some notes about monads and describe the Maybe and List Monad types. My main reference was Chapter 14 from Real World Haskell and A Fistful of Monads from Learn You a Haskell for Great Good!

I’ve written about typeclasses in an old post (in Portuguese). Haskell defines a typeclass called `Monad`:

```class Monad m where

-- inject
return :: a -> m a

-- chain or bind
(>>=)  :: m a -> (a -> m b) -> m b

(>>) :: m a -> m b -> m b
x >> y = x >>= \_ -> y

fail :: String -> m a
fail msg = error msg
```

When a type `m` implements this typeclass it is considered a monadic type. Note that `>>` and `fail` have default implementation, but it’s possible to override them.

One simplistic way to get a grasp of Monads is to think that the type `m` is a kind of a box. Then the `return` function puts the type `a` inside the box `m`. Also, the chain operator `(>>=)` receives a box containing `a` and a function that takes `a` and return the type `b` inside a box.

It’s easier to understand those functions with an example. Let’s consider the simplest monadic type, the Maybe Monad.

The `Maybe` type can be define as an Algebraic Data Type as follows:

```data Maybe a = Nothing | Just a
deriving (Show)
```

The standard implementation for the `Monad` typeclass for the type `Maybe` is the following

```instance Monad Maybe where
-- chain
Just x >>= k  =  k x
Nothing >>= _ =  Nothing

-- inject
return x      =  Just x

fail _ = Nothing
```

In the first implementation for the chain operator, `k` is a function that receives the value x wrapped inside `Just` and returns another value wrapped inside `Maybe`.

Note that `Maybe` overrides the default implementation for `fail`.

Chaining

The chain operator has this name because we can concatenate several functions in a chain. Consider the following example using the Maybe monad:

```f1 a | a >= 0 = Just (sqrt a)
| otherwise = Nothing

f2 b | b /= 0 = Just (1 / b)
| otherwise = Nothing

f3 c = Just (round c)

-- Chaining f1, f2 and f3
f x = f1 x >>= f2 >>= f3
```

This chaining of Maybe monads is useful when we need to execute several functions such that if an error occurs, we stop further processing.

If we look at the default definition of the `(>>)` operator, it basically doesn’t pass the value from the previous function forward, so the function to the right of `(>>)` doesn’t have an input. For example, we can define a new function `f4`:

```f4 = Just 100.1
-- Chaining f1, f2, f4 and f3
f x = f1 x >>= f2 >> f4 >>= f3
```

The ‘do’ notation

The chaining of the `(>>=)` operator has an alternative syntax using the keyword do. In this case we need to explicitly deal with the returned values and function parameters and pass to the following function. For the example above we would have:

```g_alt x = do
y <- f1 x
f2 y
z <- f4
f3 z
```

It’s more verbose, but on the other hand the variables might be both used in the same scope. In the next example, `x` and `y` are available to be used in the last function:

```foo = do
x <- Just 3
y <- Just 4
Just (x * y)
```

If we go with the `(>>=)` operators, we would have a less elegant solution with nested functions to keep both variables in the same scope:

```foo = Just 3 >>= (\x ->
Just 4 >>= (\y ->
Just (x * y)))
```

Lists also implement the `Monad` typeclass. The standard implementation is the following:

```instance Monad [] where
return x = [x]
xs >>= f = concat (map f xs)
fail _ = []
```

In the analogy of boxes, we may think that the type `[]` can hold more than one item (of the same type). The return function inserts a single element in the box.

The bind operator receives a list of elements and a function that applies to elements ans return another list (its elements can have a different type).

If we have a function that receives an element and returns a list of one element, we have just a kind of map. For example:

```[1, 2, 3] >>= \x -> [x^2] // [1, 4, 9]
```

If it returns a list with two or more element, we can identify it’s performing a cartesian product:

```[1, 2, 3] >>= \x -> [2*x-1, 2*x]
```

Let’s consider an example using with nested functions

```f = [1, 2, 3] >>= (\n -> ['a', 'b'] >>= \m -> return(n, m))
```

For which we get `[(1,'a'),(1,'b'),(2,'a'),(2,'b'),(3,'a'),(3,'b')]`, or using the `do` syntax:

```f = do
n <- [1, 2, 3]
m <- ['a', 'b']
return (n, m)
```

If we compare with the list comprehension syntax that gets the same output, we can see how similar they are:

```[(n, m) | n <- [1, 2, 3], m <- ['a', 'b']]
```

When we implement the Monad typeclass for a given type, Haskell doesn’t have means to check the properties that actually makes the type a Monad. So we have to guarantee it ourselves when declaring our type monadic by verifying the following 3 properties:

Left identity. `return x >>= f` is equivalent to `f x`

In our analogy, it means that if we put our element in the box (`return`) and apply the operator (`>>=`), it must extract this element and apply `f`, which should be the as applying it directly.

Right identity. `m >>= return` is equivalent to `m`

It means that we are sending the element inside a box `m` and applying the operator `(>>=)`, which will extract the element and just put it again inside the box (`return`), so the same thing that entered must come out, in this case, `m`.

Associativity. states that `(m >>= f) >>= g` is equivalent to `m >>= (\x -> f x >>= g)`

In a expression, the associativity property means that we can execute the operations in a chain in any order (e.g. `(a + b) + c == a + (b + c)`).

This is partially true here, because if we try to have `(m >>= f) >>= g` equal to `m >>= (f >>= g)`, which is not correct because the operator is not symmetric (it requires a monadic type and not a function that returns one).

To solve this problem, we can curry the function applying the first parameter. Since f has type `(Monad m) => a -> m b`, then `(f x)` for `x` of type `a`, we have the type `m b`.

In [2], the authors define a new symmetric operator `(<=<)` that makes it easy to spot the associative law:

```(<=<) :: (Monad m) => (b -> m c) -> (a -> m b) -> (a -> m c)
f <=< g = (\x -> g x >>= f)
```

Now we can say that `f <=< (g <=< h)` should be the same as `(f <=< g) <=< h`.

We must get x from somewhere though and we can do this by wrapping it inside a function, thus `(f >>= g)` becomes `(\x -> f x >>= g)`.

Note however that we’re not actually executing the functions in the chain in different order, because we lifted the operation to another function that will only the executed after it has the element from the left of the operator `(>>=)`

References

[2] Learn You a Haskell for Great Good! – A Fistful of Monads

Functors

In Chapter 10 of Real World Haskell, the authors explain how to write a parser for a PGM file.

They start with a very direct and verbose solution. Along the chapter they perform abstractions that allows code reuse. Then, they end up with a much concise and legible solution.

Part of the abstraction makes use of functors, and it’s about them that we’ll talk briefly in this post. Since functors also exists in C++, we’ll talk about them too, even though they represent different concepts in each of these languages.

You’re probably acquainted with maps. Nevertheless, let’s see an example in Haskell:

`map (+1) [1,2,3]`

A map receives a function that transform an element of type `a` in a element in `a` type `b` and a list of elements of type `a`. We can see this by checking the its type on ghci:

```>:t map
map :: (a -> b) -> [a] -> [b]
```

What if we want to apply a function to the elements of other structures beside lists?

Let’s start with the `Maybe` datatype already defined on Prelude:

```data Maybe a = Nothing | Just a
```

In this case, the example `map (+1) (Just 1)` will not work. So, let’s write a function to apply the function only to the “internal” type of a `Maybe`:

```maybeMap:: (a -> b) -> Maybe a -> Maybe b
maybeMap f (Just x) = Just (f x)
maybeMap _ Nothing  = Nothing
```

A classical example is performing a map over the elements a tree. Let’s define a datatype `Tree` representing a binary tree:

```data Tree a =   Node (Tree a) (Tree a)
| Leaf a
deriving (Show)
```

To apply a function to its internal elements we can do:

```treeMap :: (a -> b) -> Tree a -> Tree b
treeMap f (Leaf a)   = Leaf (f a)
treeMap f (Node l r) = Node (treeMap f l) (treeMap f r)
```

We can test it by doing:

```> let t = Node (Leaf 1) (Node (Leaf 2) (Leaf 3))
> treeMap (+1) t
Node (Leaf 2) (Node (Leaf 3) (Leaf 4))
```

The Prelude allow us to implement a common interface using the typeclass `Functor`:

```class Functor f where
fmap :: (a -> b) -> f a -> f b
```

Thus, we can specify an implementation for `Maybe`:

```instance Functor Maybe where
fmap = maybeMap
```

… for our `Tree` datatype:

```instance Functor Maybe where
fmap = treeMap
```

… and for lists:

```instance Functor [] where
fmap = map
```

We can then use now `fmap` to perform mapping on lists, `Maybe` or `Tree`:

```fmap (+1) [1, 2, 3] // [2, 3, 4]
fmap (+1) (Just 10) // Just 11
fmap (+1) (Leaf 0) // (Leaf 1)
```

Note that we can also apply this function recursively to perform mappings into inner types, for example:

```fmap (fmap (+1)) [[1,2], [3]] // [[2,3], [4]]
fmap (fmap (*2)) [Just 4] // [Just 8]
```

The idea of functors is to perform mappings over the internal elements of datatype, but without affecting its structure.

For instance, a function that receives a `(Just 1)` and returns `Nothing`, is not preserving the `Maybe` structure. The same can be said about any function that changes the topology of a `Tree`.

So, functors are expected to satisfy two properties: identity and composability.

The identity property says that `(fmap id)` is equivalent to `id`. Since we’re not changing the internal elements neither the structure of a datatype, we are not changing the datatype at all.

The composability means that `(fmap f) . (fmap g)` is equivalent to `fmap (f . g)`

For further details, this subject is pretty well explained in [2] and [3].

Functors in C++

In C++ we also have the concept of functors, but it has a different meaning. Here a functor is an object that acts as a function.

Basically, C++ functors are classes that define the `operator()`. Here’s an example:

```struct Functor {
void operator() (){
cout << "Functor\n";
}
};
...
Functor f;
f();
```

What’s the difference between overloading the `operator()` and using any other method instead? For instance, one could define a method `apply()` and use it like this:

```struct Functor {
void apply(){
cout << "Functor\n";
}
};
...
Functor f;
f.apply();
```

The answer to the question is that, besides the syntax sugar (i.e. calling `f()` instead of `f.apply()`), overloading the `operator()` allows functors and function pointers to be used alike with templates. Consider the following example,

```struct Functor {
void operator() (){
cout << "Functor\n";
}
};
void f(){
cout << "Function\n";
}

template<typename F>
void callFunction(F func){
func();
}

int main(){
callFunction(f); // Function
callFunction(Functor()); // Functor
return 0;
}
```

Here, both `f()` and an instance of `Functor` can be passed as parameter to `callFunction()`. This is pretty much like the STL `sort` function.

The advantage of functors over regular functions is that they can have internal state, like the example below.

```struct Functor {
int x;

Functor(int _x){
x = _x;
}
int operator() (int y){
return x + y;
}
};

int main (){