An Introduction to the Parsec Library

Introduction

The process of compiling or interpreting a program requires, as one of its steps, parsing the source code and giving it structure. More specifically, there is a step that converts a string into a sequence of tokens, called lexical analysis, which is carried out by a lexer or tokenizer.

After that, we structure those tokens in such a way that encodes their meaning, for example as an abstract syntax tree (AST). This step is called parsing and is carried out by a parser.

The parsec library

Parser combinators are higher-order functions that combine smaller parsers to build more complex ones. This resembles the idea of context-free grammars (CFG), which we talked about in a previous post, where we have productions of the form

S \rightarrow A | B

Here, S can be formed either from A or from B, which in turn can be other productions or terminals.

The parsec library is an implementation of parser combinators in Haskell. We talked about combinators in Haskell previously (in Portuguese).

Parsec vs. Bison/Yacc/Antlr

Bison, Yacc and Antlr are not actually parsers, but rather parser generators. They take a grammar file and generate parsers for the languages described by those grammars.

Parsec, on the other hand, is a library with which you write the parser yourself.

In this post we’ll go through several basic concepts of the parsec library, using as the main reference the book Real World Haskell, Chapter 16.

The source code for all the examples to follow can be found on the blog’s github repository.

Basic combinators

The basic idea of a parser is that it takes an input (for example, a string), consumes characters from that string until it’s done with what it’s supposed to parse, and then passes the remaining (unparsed) string along, which might be used as input to subsequent parsers.
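
To make this idea concrete, here’s a minimal hand-rolled sketch (my own illustration, not Parsec’s actual internal types) of a parser as a function that returns the parsed value together with the leftover input:

type SimpleParser a = String -> Either String (a, String)

-- A toy parser that consumes a single 'a' and passes the rest of the
-- input along; anything else is a parse error.
simpleCharA :: SimpleParser Char
simpleCharA ('a':rest) = Right ('a', rest)
simpleCharA _          = Left "expecting \"a\""

Parsec’s parsers follow the same principle, except that the threading of the leftover input is hidden from us.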

One of the simplest parsers is one that consumes a single specific character. There’s a parser named char in Text.Parsec.Char (re-exported by Text.Parsec), so let’s write one that parses the letter a:

import Text.Parsec (char)
charAParser = (char 'a')

To test our parser with an input, we use the parse function from Text.Parsec:

import Text.Parsec
-- parsedInput = parse someParser "source name" "some input"

This function takes as input a parser, a source name and the actual input. The source name parameter is not important for us now, so we can pass the empty string. Let’s write a simple wrapper to avoid boilerplate:

test p = parse p ""

We can now test our parser with some sample inputs:

> test charAParser "a"
Right 'a'
> test charAParser "ab"
Right 'a'
> test charAParser "ba"
Left (line 1, column 1):
unexpected "b"
expecting "a"

It extracts the first character of the input string if it’s the character 'a'; otherwise it fails with an error. If we want to match any character, there’s also the anyChar parser. Running it on the same examples:

> test anyChar "a"
Right 'a'
> test anyChar "ab"
Right 'a'
> test anyChar "ba"
Right 'b'

Note that it doesn’t fail for strings starting with 'b'. So far our parsers only match one character, so for the string "ab", for example, only the first character is returned.

We can write a parser for whole strings too. There’s the string combinator, but let’s develop our own to show how we can combine combinators to form new ones. The many combinator applies the combinator passed as an argument repeatedly until it fails.

Thus, we can write a string parser as many anyChar:

stringParser:: Parsec String st String
stringParser = many anyChar

Now let’s try it with the string "ab":

> test stringParser "ab"
Right "ab"

More useful than matching all characters is matching all except some, so we know when to stop parsing. For that, we can use noneOf instead of anyChar. It takes a list of characters as parameter and matches any character that is not on that list.

Let’s now write a wordParser, which keeps parsing characters until it finds a whitespace:

wordParser:: Parsec String st String
wordParser = many $ noneOf [' ']

Let’s try it on the classic example string:

> test wordParser "hello world"
Right "hello"

Note that our parsers are throwing away the unparsed part of the input. How can we parse the remaining, unparsed string?

The two flavors of the Parsec library: Monads and Applicative Functors

We’ve talked about Functors and Monads before, but not about applicative functors.

Intuitively, they are a structure in between Functors and Monads: they’re more complex and general than Functors, but less so than Monads. We can also make the analogy of wrappers that we did for monads.

Originally, the Parsec library was written with Monads in mind, but applicative functors were introduced later, and using them to write parsers usually leads to clearer syntax. So, in this post, we’ll use the applicative flavor to write our parsers.

Here, we’ll only provide an overview of some of the main applicative operators. For further details, the book Learn You a Haskell for Great Good has a nice introduction to Applicatives.
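
As a quick reminder, Applicative sits between Functor and Monad in the standard library; its core methods are:

class Functor f => Applicative f where
  pure  :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b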

Operators Cheat Sheet. We can use the Maybe type to illustrate the main applicative operators, since it has an Applicative instance.

(<*>) Unwrap the contents of both sides, combine them (apply the function on the left to the value on the right) and wrap the result again

> Just (+3) <*> Just 9
Just 12

(*>) Unwrap the contents of both sides, but discard the result on the left

> Just (+3) *> Just 9
Just 9
> Just 7 *> Just 8
Just 8

(<*) Unwrap the contents of both sides, but discard the result on the right.

> Just 7 <* Just 8
Just 7

(<$>) Unwrap the contents of the right, apply the function on the left to it and wrap the result

> (+3) <$> Just 9
Just 12

(<$) Discard the contents of the right and wrap the value on the left instead

> 3 <$ Just 9
Just 3

This analogy of wrappers applied to parsers is not as natural, though. In this case, we can think of unwrapping as executing the parser by consuming the input, and of the wrapped result as the parsed token. The unparsed string is always carried over from parser to parser.

Hopefully with the next examples this will become clearer:

Parsing the second word

If we are to get the token from the second parser instead of the first, we need to execute both parsers but ignore the result of the first. Thus, we can use the operator (*>) to obtain something like

wordParser *> wordParser

This won’t quite work because the first parser doesn’t consume the whitespace, so the second parser will stop before consuming anything. We can fix that by consuming the whitespace:

wordParser *> (char ' ') *> wordParser

So let’s write:

secondWordParser:: Parsec String st String
secondWordParser = wordParser *> (char ' ')  *> wordParser

and now we can test:

> test secondWordParser "ab cd"
Right "cd"

Parsing two words

We can also return both tokens if we use the operator (<*>), combining them into a list:

twoWordsParser:: Parsec String st [String]
twoWordsParser = listfy <$> wordParser <*> ((char ' ') *> wordParser)
                   where listfy a b = [a, b]

Parsing multiple words

Generalizing, we can parse multiple words with the aid of the many combinator:

wordsParser:: Parsec String st [String]
wordsParser = (:) <$> wordParser <*> many ((char ' ') *> wordParser)

We could actually write this using the sepBy1 parser, which parses a list of tokens separated by a separator and requires the list to have at least one element:

wordsParserAlt:: Parsec String st [String]
wordsParserAlt = wordParser `sepBy1` (char ' ')
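
Running it on a sample input should give something along these lines (the same result as wordsParser):

> test wordsParserAlt "hello brave new world"
Right ["hello","brave","new","world"]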

Simple CSV parser

With what we’ve seen so far, we can write a very basic CSV parser in 4 lines of code.

csvParser:: Parsec String st [[String]]
csvParser = lineParser `endBy` newline <* eof
              where lineParser = cellParser `sepBy` (char ',')
                    cellParser = many $ noneOf ",\n"

Note that it doesn't handle some corner cases like escaped commas within cells. For a full example, refer to either [1] or [2].
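
As a quick sanity check, running it on a tiny input should produce something along these lines:

> test csvParser "a,b\nc,d\n"
Right [["a","b"],["c","d"]]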

Choosing combinators

Recall that in Context Free Grammars, we can have production rules of the type:

S \rightarrow A | B

which means that S can be generated either from A or from B. In Parsec, we can express this choice using the (<|>) operator. Let’s write a simple parser that parses either the string "cat" or "dog":

dogCatParser:: Parsec String st String
dogCatParser = (string "dog")  (string "cat")

Testing on some inputs:

> test dogCatParser "dog"
Right "dog"
> test dogCatParser "cat"
Right "cat"
> test dogCatParser "elephant"
Left (line 1, column 1):
unexpected "e"
expecting "cat" or "dog"

Let’s write another example with different animal names:

camelCatParser:: Parsec String st String
camelCatParser = (string "camel")  (string "cat")

and try again with the input "cat":

> test camelCatParser "cat"
Left (line 1, column 1):
unexpected "t"
expecting "camel"

The parser failed because the strings have a common prefix. It started matching against the camel parser, which consumed the "ca" characters before failing, so the cat parser no longer had a chance to match.

The try combinator

To avoid this problem, there’s the try combinator, which makes a parser not consume any input if it fails to match:

camelCatTryParser:: Parsec String st String
camelCatTryParser = try (string "camel") <|> (string "cat")

which works as expected:

> test camelCatTryParser "cat"
Right "cat"

We can see that it’s straightforward to convert a standard context-free grammar into a Haskell program using parsec.
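
For instance, a made-up grammar with productions S \rightarrow aS | b translates almost literally (sParser and the grammar here are just for illustration):

sParser:: Parsec String st String
sParser = ((:) <$> char 'a' <*> sParser)
          <|> ((:[]) <$> char 'b')

Each alternative of the production becomes one side of (<|>), and recursion in the grammar becomes recursion in the parser.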

Simple Expression Parser

So far, our parsers have only returned strings and lists of strings. We can use data types to structure the parsed data in a way that is easier to evaluate later.

For our example, we’ll build a very simple parser for expressions that only contain the + and - binary operators, where terminals are all integers and all binary expressions are surrounded by parentheses, so we don’t have to handle precedence. Examples of valid expressions are "12", "(1+2)" and "((3+4)-5)", whereas "1+2" is invalid (no parentheses).

The first thing we want to do is to define our data types. Our number type, TNumber, is just an alias to Int. Our operator type, TOperator can be one of addition (TAdd) or subtraction (TSubtract). Finally, the expression is either binary (TNode) or a number (TTerminal).

type TNumber = Int

data TOperator = TAdd
               | TSubtract
                 deriving (Eq, Ord, Show)

data TExpression = TNode (TExpression) TOperator (TExpression)
                 | TTerminal TNumber
                   deriving (Show)

From what we’ve seen so far, it’s not very complicated to write parsers for TNumber and TOperator:

numberParser:: Parsec String st TNumber
numberParser = read <$> (many1 $ oneOf "0123456789")

operatorParser:: Parsec String st TOperator
operatorParser = chooseOp <$> (oneOf "+-")
                   where chooseOp '+' = TAdd
                         chooseOp '-' = TSubtract

For the expression we have two choices. Either we parse another expression enclosed in parentheses or we parse a terminal. In the first case, we call binaryExpressionParser, which looks for the left expression, the operator and then the right expression.

expressionParser:: Parsec String st TExpression
expressionParser = (between (char '(') (char ')') binaryExpressionParser) <|>
                   (TTerminal <$> numberParser)

binaryExpressionParser:: Parsec String st TExpression
binaryExpressionParser = TNode <$> expressionParser <*> operatorParser <*> expressionParser

And that’s it! We can now run an example with a valid expression:

> test expressionParser "(123+(324-456))"
Right (TNode (TTerminal 123) TAdd (TNode (TTerminal 324) TSubtract (TTerminal 456)))

The advantage of having this AST is that it’s now very simple to evaluate:

evaluate:: TExpression -> TNumber
evaluate (TNode exp1 TAdd exp2)      = (evaluate exp1) + (evaluate exp2)
evaluate (TNode exp1 TSubtract exp2) = (evaluate exp1) - (evaluate exp2)
evaluate (TTerminal v)               = v

And the final test:

> let Right x = test expressionParser "(123+(324-456))"
> evaluate x
-9

It works! We implemented a simple parser and interpreter for a very limited class of arithmetic expressions. There are much better tools for expression parsing (see [5] for a tutorial), but they’re out of the scope of this post.

Conclusion

We’ve learned the basics of the Parsec library and built some non-trivial parsers by gluing together basic parsers using combinators. We even started scratching the surface of programming language parsing by writing a parser for arithmetic expressions.

The Parsec applications presented in the Real World Haskell book are great. I felt that the content was a bit hard to follow at times, but writing this post helped me get a better understanding of the subject.

References

[1] Real World Haskell – Chapter 16
[2] A gentle introduction to Parsec
[3] StackOverflow – Parsec vs Yacc/Bison/Antlr: Why and when to use Parsec?
[4] Wikipedia – Parser Combinator
[5] Haskell Wiki – Parsing expressions and statements

Turing Machines and Undecidability


Jeffrey Ullman is a professor of Computer Science at Stanford. He is famous for his book Introduction to Automata Theory, Languages, and Computation. He teaches the Automata course on Coursera, and I’ve just finished its most recent edition.

Ullman’s research interests include database theory, data integration, data mining, and education using the information infrastructure. He was Sergey Brin’s PhD advisor and won the Knuth Prize in 2000 [1].

We divide this post in two parts. In Part 1, we’ll provide a brief overview of automata theory, including basic concepts and terminology. In Part 2, we’ll focus on the limitations of automata by discussing undecidability.


Part 1: Automata Theory

In the following sections, we’ll go from the most limited automata (finite automata), to push-down automata, and then to Turing machines.

Finite automata

We can think of a finite automaton as a state machine. It’s represented by a directed graph where the nodes represent states and each edge (u, v) connecting two states has a label, which is a symbol from a given alphabet \Sigma. Consider an initial state q_0, a set of final states F and a string w over the alphabet \Sigma.

To be more generic, we can describe the edges of the graph as a transition function. It takes as input a state p and a symbol c and outputs a new state q. We denote such function as \delta, so we have \delta(p, c) \rightarrow q.


Figure 1: Sample DFA

We can then decide whether the input w is accepted by simulating it on the automaton. That consists in consuming one character c of w at a time and moving from one state to another by following the edge labeled c. If there’s no such edge, we can assume it goes to a dead-end state where it gets stuck and is rejected. If, after consuming all the input, it is in one of the final states, we say the input is accepted; otherwise it’s rejected.
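
As a rough illustration (my own sketch in Haskell, not something from the course), the simulation is just a fold over the input, where Nothing plays the role of the dead-end state:

import qualified Data.Map as Map

type DFAState = Int
-- (transition table, initial state, final states)
type DFA = (Map.Map (DFAState, Char) DFAState, DFAState, [DFAState])

accepts :: DFA -> String -> Bool
accepts (delta, q0, finals) w = case foldl step (Just q0) w of
    Just q  -> q `elem` finals  -- consumed all input; accept iff in a final state
    Nothing -> False            -- got stuck in a dead-end state
  where
    step Nothing  _ = Nothing
    step (Just q) c = Map.lookup (q, c) delta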

If there’s always at most one edge to take at any point of the simulation, we have a deterministic way to decide whether w is accepted, so we call this class of automata deterministic finite automata, or DFA for short. If there can be multiple choices, we have a non-deterministic finite automaton, or NFA.

Surprisingly, a DFA is as powerful as an NFA, because we can simulate an NFA with a DFA, though such a DFA might have an exponential number of states. DFAs are also as powerful as regular expressions, which are out of the scope of this post.

We define a set of strings as a language. We say that an automaton defines a given language if it accepts all and only the strings contained in this language.

In the case of finite automata, the class of languages that they define is called regular languages.

Push-down automata

A push-down automaton is essentially a finite automaton equipped with a stack. The symbols that can go onto the stack are those from the input plus some additional ones. In this automaton, the transition function takes as input the state p, the input symbol c and the element at the top of the stack X. It outputs the new state q and the symbols that will replace the top X, which can be more than one element or even the empty symbol, meaning we didn’t put any element on top.

This type of automaton can be shown to be equivalent to a context-free grammar (CFG), which is out of the scope of this post as well, but it’s very common for describing programming languages.

The set of languages for which there is a push-down automaton that defines them is called the context-free languages, a class which contains the regular languages.

Turing machines

What could we accomplish with a finite automaton with 2 stacks? Quite a bit. Such automata are called two-push down stack automata. It’s possible to show that a finite automaton with 2 stacks is equivalent to a Turing machine.

So, what exactly is a Turing machine? It’s a device with a state, an infinite tape and a needle that points to a given position on this tape. The input w is initially written on the tape and the needle points to the first character of w. All the other positions on the tape contain the blank symbol.

Figure 2: A concept of the Turing Machine

The transition function takes as input the state p and the input symbol c pointed to by the needle, and outputs the new state q, the new symbol that will be written in place of c and the direction the needle moves afterwards (either right or left). In case there’s no transition defined for a given state and input symbol, we say the machine halts.
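
In Haskell-ish terms (again just a sketch of mine, to fix ideas), the transition function has this shape:

data Direction = MoveLeft | MoveRight

-- Nothing means no transition is defined for (state, symbol): the machine halts.
type Transition state symbol = (state, symbol) -> Maybe (state, symbol, Direction)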

There are two main flavors of TM regarding when an input is considered accepted. One option is that the TM has a set of final states, and whenever we get to one of those states, we accept. The other option is to accept when the TM halts.

It’s possible to show that in the general case both versions are equivalent, in the sense that both define the same class of languages, which was historically named the recursively enumerable languages.

An algorithm for a decision problem is a special type of TM that accepts by final state and is always guaranteed to halt (whether or not it accepts). The class of languages defined by algorithms is called the recursive languages.

Extending Turing machines

We can simulate more general TMs using our basic definition of a TM. For example, we can model a TM with multiple tracks: instead of holding a single symbol, each position on the tape contains a tuple, where each element of the tuple represents the symbol on the corresponding track at that position.

We can even have multiple needles, by using a dedicated tape for representing the current position of each additional needle.


Part 2: Undecidability

In this second part of the post we’ll discuss undecidability. A problem is said to be undecidable if there’s no algorithm to solve it, that is, no TM that is guaranteed to halt on all inputs.

The Turing Machine Paradox

Question: Are Turing machines powerful enough so that for any language there exists a corresponding TM that defines it?

Not really. We will next define a language that creates a paradox if a TM exists for it. Before that, we first discuss how to encode TMs as strings, so that a TM can be used as input to other TMs.

Encoding Turing machines. Given a description of a Turing machine (its states, alphabet and transition function), it’s possible to encode it as a binary string. Each possible Turing machine will have a distinct binary representation. We can then convert these strings to integers. Since two strings can map to the same integer because of leading zeroes (e.g. both 010 and 10 map to 2), we prepend a 1 to all the strings (e.g. now 1010 maps to 10 and 110 to 6).

We can now order all Turing machines by their corresponding integer representation. Note that some integers might not be encodings of an actual TM, but for simplicity, we can assume it’s a TM that accepts the empty language.

Since a TM can be seen as a string, we can say that a TM accepts itself if it accepts the corresponding encoding of itself as a binary string. We can even define a language as the set of TM encodings that don’t accept themselves:

L_d = \{w | w \mbox{ is the encoding of a TM that doesn't accept itself} \}

So we want to know whether there is a TM T_d for L_d. Suppose there is. Then we ask: does T_d accept itself?

This self-referencing property is the origin of Russell’s Paradox. It essentially defines a set S that contains all elements (which can also be sets) that do not contain themselves. The paradox is that if S contains itself, then S contains a set that does contain itself, which is a contradiction. On the other hand, if it doesn’t contain itself, then S has left out a set that doesn’t contain itself, which also contradicts the definition of S.

We can use the same argument for TMs. If T_d accepts itself, then it is accepting a TM that doesn’t satisfy the defining constraint of L_d. If it does not, it has left out the encoding of a TM that doesn’t accept itself.

The only assumption we made so far is that there is such a TM for the language L_d, so we conclude this assumption was wrong and thus L_d is not a recursively enumerable language.

The Universal Turing Machine

Question: Is there an algorithm to tell whether a Turing Machine accepts a given input w?

To answer this question, we’ll first define a special type of Turing machine. The Universal Turing Machine (UTM) is a TM that takes as input a TM M encoded as a string and an input w, as a pair (M, w), and accepts it if and only if M accepts w. We denote by L_u the language of the UTM.

We can design the UTM as a TM with 3 tapes (see the Extending Turing machines section):

* Tape 1 holds the M description
* Tape 2 represents the tape of M
* Tape 3 represents the state of M

The high-level steps we need to carry out (and for which we would have to define the proper transition functions, which we’re not doing here) are:

1. Verify that M is a valid TM
2. Decide how many “bits” of the tape it needs to represent one symbol from M
3. Initialize Tape 2 with input w and Tape 3 with the initial state of M
4. Simulate M, by reading from Tape 1 the possible transitions given the state in Tape 3 and the input at Tape 2.
5. If the simulation of M halts accepting the string w, then the UTM accepts (M,w).

We can now answer the question in the beginning of this section by the following proposition:

Proposition: L_u is recursively enumerable, but not recursive.

Proof: It’s recursively enumerable by definition, since it’s defined by a TM (in this case the UTM). So we need to show that it’s not recursive, that is, that no algorithm (a TM guaranteed to always halt) defines L_u.

Suppose there is such an algorithm. Then we have an algorithm to decide whether a string w is in L_d. First we check whether w is a valid encoding of a TM. If it’s not, then we assume it represents a TM that defines the empty language, so it obviously doesn’t accept itself and thus should go into L_d.

Otherwise, we ask our hypothetical algorithm whether it accepts (w, w). If it does, then the TM corresponding to w accepts itself and should not go into L_d. On the other hand, if our algorithm doesn’t accept it, then w should go into L_d. We have just described an algorithm that defines L_d, but in the previous section we proved that L_d has no TM that defines it. This contradiction means that our assumption that an algorithm exists for L_u is incorrect.

\square

This is equivalent to the halting problem, in which we want to know whether a program (a Turing machine) will halt for a given input or run forever.

Post Correspondence Problem

Question: What other problems are undecidable?

We can show a problem is undecidable by reducing an already known undecidable problem to it. In particular, if we can show that a particular problem can be used to simulate a universal Turing machine, we automatically prove it undecidable. We’ll now do this with a problem that is fairly simple to describe and has no apparent connection to Turing machines.

The Post correspondence problem (PCP) can be stated as follows. Given n pairs of strings of the form (a_i, b_i), 1 \le i \le n, the problem consists in finding a list of indexes R = \{r_1, r_2, \cdots, r_k\} such that concatenating the pairs in this order leads to the same string, that is, a_{r_1}a_{r_2} \cdots a_{r_k} = b_{r_1}b_{r_2} \cdots b_{r_k} (note that an index can be used more than once in the solution).
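
For instance (a small made-up example), given the two pairs (a_1, b_1) = (ab, a) and (a_2, b_2) = (b, bb), the list R = \{1, 2\} is a solution, since a_1 a_2 = abb = b_1 b_2.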

There’s a stricter version of this problem in which the first pair of the input must be the first pair of the solution, which we’ll refer to as MPCP. This restriction doesn’t make the problem harder, because it’s possible to reduce MPCP to the original PCP.

It’s possible to simulate a TM by solving the MPCP, in such a way that there’s a solution to the MPCP instance if, and only if, the TM accepts an input w. We provide the detailed reduction in the Appendix for the interested reader. This means that if there existed an algorithm for PCP (and thus for MPCP), we would have an algorithm for any TM, including the UTM, which is a contradiction; thus we conclude that PCP is undecidable.

Many other problems can be proved undecidable by reducing the PCP to them. The list includes some decision problems regarding context-free grammars: for example, given a CFG, telling whether a string can be generated in more than one way (whether the CFG is ambiguous), whether a CFG generates all the strings over the input alphabet, or whether a CFL is regular.

Conclusion

This is the second course I’ve finished on Coursera (the other one was Probabilistic Graphical Models). I’m very glad that classes from top-tier universities are freely available on the web.

Automata theory was one of the gaps in my computer science theory base, especially Turing machines. I’ve done an introductory course on theoretical computer science, but it was mostly focused on intractable problems and cryptography.

References

[1] Wikipedia – Jeffrey Ullman
[2] Slides from Automata Course – Coursera

Appendix

Proposition. The problem of deciding whether a Turing machine accepts a string w can be reduced to the MPCP, that is, we can build an MPCP instance that has a solution if, and only if, the TM accepts w.

Sketch of proof. We won’t give a formal proof of the reduction, but the idea is pretty neat and after understanding it, the reduction becomes more intuitive.

First, let’s define a snapshot of a TM simulation. A snapshot is a compact description of the TM at a particular simulation step: the tape contents with the current state written immediately to the left of the position the needle points to. We can omit the infinite runs of blanks at both ends of the tape.

Thus, the first snapshot is represented by q_0 w_1\cdots w_n. If we move the needle to the right, writing X to the current position and changing the state to p, the snapshot becomes X p w_2 \cdots w_n. If instead, from the first snapshot, we moved to the left, writing X and changing the state to p, we would have p \square X w_2 \cdots w_n, where \square is the blank symbol, which we now need to represent because it’s no longer part of the contiguous infinite block of blanks.

In each incomplete solution of the MPCP, the first part is one snapshot behind the second part. With that, we can “tie” two consecutive snapshots together with a pair, which is how we encode the transition function.

The instance we’ll build for the MPCP will consist of the following classes of pairs:

First pair, representing the initial snapshot:

(1) (\#, \#q_0w\#)

Delimiter pair, to separate adjacent snapshots:

(2) (\#, \#)

Copy pair, to copy tape symbols that are not involved in the transition function. For each tape symbol X we create the pair

(3) (X, X)

Transition function pairs

For a move to the right, that is, for each \delta(q, X) \rightarrow (p, Y, R) we create the pair

(4) (qX, Yp)

For a move to the left, that is, for each \delta(q, X) \rightarrow (p, Y, L), we create, for every tape symbol Z, the pair

(5) (ZqX, pZY)

In case X is the blank character, the right move, that is \delta(q, B) \rightarrow (p, Y, R), becomes:

(6) (q\#, Yp\#)

The left movement, that is \delta(q, B) \rightarrow (p, Y, L) will become:

(7) (Zq\#, pZY\#)

Final state pairs

For a final state f and for all tape symbols X, Y, we add the auxiliary pairs (8) to (10) below, plus the closing pair (11):

(8) (XfY, f)

(9) (fY, f)

(10) (Xf, f)

(11) (f\#\#, \#)

Simulation – Example

The solution will have to start with pair (1), because it’s the first pair and we’re solving the MPCP.

\#
\#q_0 w_1 \cdots w_n

We can now simulate a move. For example, say we have a right move defined by \delta(q_0, w_1) \rightarrow (p, x, R). Applying the corresponding pair (4), (q_0 w_1, x p), leads us to the partial solution

\#q_0 w_1
\#q_0 w_1 \cdots w_n \# x p

For now, we can only use the copy pairs (3), because we need to match w_2, w_3, \cdots, w_n, so we apply those pairs until we get

\#q_0 w_1 \cdots w_n
\#q_0 w_1 \cdots w_n \# x p w_2 \cdots w_n

Now, the only pair that can follow is (2), that is, (\#, \#), which will finish the snapshot:

\#q_0 w_1 \cdots w_n \#
\#q_0 w_1 \cdots w_n \# x p w_2 \cdots w_n \#

Note that the pairs of form (3) are only used to copy the remaining part of the tape. That’s because the transition function itself is local, in the sense that it doesn’t know about the tape values that are not involved in its definition.

This movement is analogous for the other transition functions.

The remaining case is when we reach a final state, in which case we want to accept by making the first part of the partial solution catch up with the second part.

Say we are in this stage:

\cdots \#
\cdots \#ABfCDE\#

where f is a final state. We first apply the copy pair (3) on A:

\cdots \#A
\cdots \#ABfCDE\#A

and then use pair (8) as (BfC, f) to get

\cdots \#ABfC
\cdots \#ABfCDE\#Af

we can only copy and finish the snapshot, using (3) and (2):

\cdots \#ABfCDE\#
\cdots \#ABfCDE\#AfDE\#

we need the pair (8) again, now as (AfD, f) to get

\cdots \#ABfCDE\#AfD
\cdots \#ABfCDE\#AfDE\#f

again, by copying and finishing the snapshot we have,

\cdots \#ABfCDE\#AfDE\#
\cdots \#ABfCDE\#AfDE\#fE\#

now we use the pair (9) as (fE, f) to obtain

\cdots \#ABfCDE\#AfDE\#fE
\cdots \#ABfCDE\#AfDE\#fE\#f

ending another snapshot

\cdots \#ABfCDE\#AfDE\#fE\#
\cdots \#ABfCDE\#AfDE\#fE\#f\#

finally we use the closing pair (11)

\cdots \#ABfCDE\#AfDE\#fE\#f\#\#
\cdots \#ABfCDE\#AfDE\#fE\#f\#\#

Note that the final closing is carried out using several auxiliary snapshots that don’t correspond to the actual simulation, but it’s easy to recover the right snapshots from a solution to the MPCP.

\square