In Chapter 10 of Real World Haskell, the authors explain how to write a parser for a PGM file.

They start with a very direct and verbose solution. Along the chapter they perform abstractions that allows code reuse. Then, they end up with a much concise and legible solution.

Part of the abstraction makes use of functors, and it’s about them that we’ll talk briefly in this post. Since functors also exists in C++, we’ll talk about them too, even though they represent different concepts in each of these languages.

### Functors in Haskell

You’re probably acquainted with maps. Nevertheless, let’s see an example in Haskell:

`map (+1) [1,2,3]`

A map receives a function that transform an element of type `a`

in a element in `a`

type `b`

and a list of elements of type `a`

. We can see this by checking the its type on ghci:

>:t map map :: (a -> b) -> [a] -> [b]

What if we want to apply a function to the elements of other structures beside lists?

Let’s start with the `Maybe`

datatype already defined on Prelude:

data Maybe a = Nothing | Just a

In this case, the example `map (+1) (Just 1)`

will not work. So, let’s write a function to apply the function only to the “internal” type of a `Maybe`

:

maybeMap:: (a -> b) -> Maybe a -> Maybe b maybeMap f (Just x) = Just (f x) maybeMap _ Nothing = Nothing

A classical example is performing a map over the elements a tree. Let’s define a datatype `Tree`

representing a binary tree:

data Tree a = Node (Tree a) (Tree a) | Leaf a deriving (Show)

To apply a function to its internal elements we can do:

treeMap :: (a -> b) -> Tree a -> Tree b treeMap f (Leaf a) = Leaf (f a) treeMap f (Node l r) = Node (treeMap f l) (treeMap f r)

We can test it by doing:

> let t = Node (Leaf 1) (Node (Leaf 2) (Leaf 3)) > treeMap (+1) t Node (Leaf 2) (Node (Leaf 3) (Leaf 4))

The Prelude allow us to implement a common interface using the typeclass `Functor`

:

class Functor f where fmap :: (a -> b) -> f a -> f b

Thus, we can specify an implementation for `Maybe`

:

instance Functor Maybe where fmap = maybeMap

… for our `Tree`

datatype:

instance Functor Maybe where fmap = treeMap

… and for lists:

instance Functor [] where fmap = map

We can then use now `fmap`

to perform mapping on lists, `Maybe`

or `Tree`

:

fmap (+1) [1, 2, 3] // [2, 3, 4] fmap (+1) (Just 10) // Just 11 fmap (+1) (Leaf 0) // (Leaf 1)

Note that we can also apply this function recursively to perform mappings into inner types, for example:

fmap (fmap (+1)) [[1,2], [3]] // [[2,3], [4]] fmap (fmap (*2)) [Just 4] // [Just 8]

The idea of functors is to perform mappings over the internal elements of datatype, but without affecting its structure.

For instance, a function that receives a `(Just 1)`

and returns `Nothing`

, is not preserving the `Maybe`

structure. The same can be said about any function that changes the topology of a `Tree`

.

So, functors are expected to satisfy two properties: identity and composability.

The **identity** property says that `(fmap id)`

is equivalent to `id`

. Since we’re not changing the internal elements neither the structure of a datatype, we are not changing the datatype at all.

The **composability** means that `(fmap f) . (fmap g)`

is equivalent to `fmap (f . g)`

For further details, this subject is pretty well explained in [2] and [3].

### Functors in C++

In C++ we also have the concept of functors, but it has a different meaning. Here a functor is an object that acts as a function.

Basically, C++ functors are classes that define the `operator()`

. Here’s an example:

struct Functor { void operator() (){ cout << "Functor\n"; } }; ... Functor f; f();

What’s the difference between overloading the `operator()`

and using any other method instead? For instance, one could define a method `apply()`

and use it like this:

struct Functor { void apply(){ cout << "Functor\n"; } }; ... Functor f; f.apply();

The answer to the question is that, besides the syntax sugar (i.e. calling `f()`

instead of `f.apply()`

), overloading the `operator()`

allows functors and function pointers to be used alike with templates. Consider the following example,

struct Functor { void operator() (){ cout << "Functor\n"; } }; void f(){ cout << "Function\n"; } template<typename F> void callFunction(F func){ func(); } int main(){ callFunction(f); // Function callFunction(Functor()); // Functor return 0; }

Here, both `f()`

and an instance of `Functor`

can be passed as parameter to `callFunction()`

. This is pretty much like the STL `sort`

function.

The advantage of functors over regular functions is that they can have internal state, like the example below.

struct Functor { int x; Functor(int _x){ x = _x; } int operator() (int y){ return x + y; } }; int main (){ Functor addTwo(2); cout << addTwo(13) << endl; return 0; }

Sure, one could achieve the same behavior with functions using global variables, but with functors we get more encapsulation.

### References

[1] Stackoverflow – C++ Functors and their uses

[2] Real World Haskell – Chapter 10. Code case study: parsing a binary data format

[3] Learn You a Haskell – Chapter 8. The Functor typeclass

It seems typo on the Functor instance of Tree. You write it `instance Functor Maybe where` instead of `instance Functor Tree where`.