Eulerian Circuits

111px-Leonhard_Euler

Leonhard Euler was a Swiss mathematician in the 18th century. His paper on a problem known as the Seven Bridges of Königsberg is regarded as the first in the history in Graph Theory.

The history goes that in the city of Königsberg, in Prussia, there were seven bridges connecting different mass of lands along the Pregel river (see Figure 1). The challenge was to find a path through the city that crossed the bridges exactly once. Euler showed that no such solution existed.

Interesting unrelated fact: Today Königsberg called Kaliningrad in Russia, and Kaliningrad is actually separated from Russia geographically, lying between Lithuania and Poland.

Konigsberg_bridges.png

Figure 1: Map of Königsberg and the seven bridges. Source: Wikipedia

The solution to the Seven Bridges of Königsberg problem eventually led to a branch of Mathematics known as Graph Theory. In this post we’ll be talking about the theoretical framework that can be used to solve problems like the Seven Bridges of Königsberg, which is known as Eulerian Circuits.

We’ll provide a general definition to the problem, discuss a solution and implementation, and finally present some extensions and variations to the problem.

Definition

Let G(V, E) be a connected undirected graph, where V is the set of vertices and E the set of directed edges, and where (v, w) denotes an edge between vertices v and w. The Eulerian circuit problem consists in finding a circuit that traverses every edge of this graph exactly once or deciding no such circuit exists.

An Eulerian graph is a graph for which an Eulerian circuit exists.

Solution

We’ll first focus on the problem of deciding whether a connected graph has an Eulerian circuit. We claim that an Eulerian circuit exists if and only if every vertex in the graph has an even number of edges.

We can see this is a necessary condition. Let v be a node with an odd number of edges. Any circuit traversing all edges will have to traverse v. Moreover, it will have to use one edge to “enter” v and one edge to “leave” v. Since this circuit can traverse each edge no more than one time, it will have to use different edges each time, meaning it needs 2 edges every time it crosses v. If there are an odd number of edges, one edge will be left unvisited.

To show this is sufficient, we can provide an algorithm that always finds an Eulerian circuit in a graph satisfying these conditions. Start from any vertex v and keep traversing edges, deleting them from the graph afterwards. We can’t get stuck on any vertex besides v, because whenever we enter an edge there must be an exit edge since every node has an even number of edges. Thus eventually we’ll come back to v, and this path form a circuit.

This circuit doesn’t necessarily cover all the edges in the graph though, nor it means that are other circuits starting from v in the remaining graph. It must be however, that some node w in the circuit we just found has another circuit starting from it. We can repeat the search for every such node and we’ll always find another sub-circuit (this is a recursive procedure, and we might find sub-sub-circuits). Note that after we remove the edges from a circuit, the resulting graph might be disconnected, but each individual component is still Eulerian.

Once we have all the circuits, we can assemble them into a single circuit by starting the circuit from v. When we encounter a node w that has a sub-circuit, we take a “detour” though that sub-circuit which will lead us back to w, and we can continue on the main circuit.

Implementation

We’ll use the algorithm first described by Hierholzer to efficiently solve the Eulerian circuit problem, based on the proof sketched in the previous session.

The basic idea is that given a graph and a starting vertex v, we traverse edges until we find a circuit. As we’re traversing the edges, we delete them from the graph.

Once we have the circuit, we traverse it once more to look for any vertices that still have edges, which means these vertices will have sub-circuits. For each of these vertices we merge the sub-circuit into the main one. Assume the main circuit is given by a list  of vertices (v, p_2, ... , p_k-1, w, p_k+1, ..., p_n-1, v) and w is a vertex with a sub-circuit. Let (w, q_1, ..., q_m-1, w) be the sub-circuit starting from w. We can construct a new circuit (v, p_2, ..., p_k-1, w, q_1, ..., q_m-1, w, p_k+1, ..., p_n-1, v).

Let’s look at a specific implementation using JavaScript (with Flow). The core of the algorithm implements the ideas discussed above:

The complete code is on Github.

Analysis

We’ll now demonstrate that the algorithm described above runs in linear time of the size of the edges (i.e. O(|E|)).

Note that find_circuit() is a recursive function, but we claim that the number of times the while() loop executes across all function calls is bounded by the number of edges. The key is in the function:

graph.getNextEdgeForVertex(vertex);

graph is a convenience abstraction to an adjacency list, where for each vertex we keep a pointer to the last edge visited. Because of this,  getNextEdgeForVertex() will visit each edge of the graph at most once and we never “go back”. Since the graph object is shared across all function calls (global), we can see that the number of calls to getNextEdgeForVertex() is bounded by O(|E|), so is the number of times all while() loops execute.

Now we just need to prove that every other operation in the while loop is O(1). The only non-obvious one is:

graph.deleteEdge(edge);

This is a lazy deletion, meaning that we just set a flag in edge saying it’s deleted and it will later be taken into account by callers like graph.getNextEdgeForVertex() and graph.getDegree(). Hence, this is an O(1) operation.

For getNextEdgeForVertex(), we must skip edges that have been deleted, so we might need to iterate over a few edges before we find an undeleted one (or none if the graph is not Eulerian – in which case we terminate the algorithm). Since we’re still always processing at least one edge in every call to getNextEdgeForVertex() the argument about the total calls being bounded by O(|E|) holds.

In order for getDegree() to be an O(1) operation, we need to keep a non-lazy count of the degree of a vertex, but we can do it in O(1) when deleting an edge.

Finally, let’s analyze the second loop. The number of iterations is proportional to the length of the circuit. Since every possible circuit found (including the ones found recursively) are disjoint, the total number of times we loop over the vertices from circuits (across all function calls) is also bounded by the number of edges.

We already saw getDegree() is O(1) even with lazy deletion. The remaining operation is

path.insertAtVertex(vertex, subPath);

if we store the paths as a linked list of vertices, inserting subPath at a given node can be done in O(1) if we keep a reference from each vertex to its last (any actually) occurrence in the path.

Directed Graphs

We can extend the definition of Eulerian graphs to directed graphs. Let G(V, A) be a strongly connected graph, where V is the set of vertices and A the set of directed edges, and where (v, w) indicate a directed edge from v to w. The Eulerian circuit problem for a directed graph consists in finding a directed circuit that traverses every edge of this graph exactly once or deciding no such circuit exists.

It’s possible to show that such a circuit exists if and only if the strongly connected directed graph has, for each vertex v, the same in-degree and out-degree. The algorithm is essentially the same.

Counting Eulerian Circuits in directed graphs

It’s possible to count the number of different Eulerian circuits in a directed graph. According to the BEST theorem (named after de Bruijn, van Aardenne-Ehrenfest, Smith and Tutte) [3], the number of Eulerian circuits in a directed graph can be given by [4]:

ec(G) = t_w(G) \prod_{v \in V}(deg(v) - 1)! (1)

Where deg(v) represents the in-degree (or out-degree) or a vertex v and t_w(G) is the number of arborescences rooted in a vertex w (simply put, an arborescence is analogous to a spanning tree for a directed graph – but we can only include edges that are directed away from the root).

It’s possible to show that t_w(G) is the same for any vertex w if G is Eulerian. We can compute t_w(G) via the Matrix-Tree theorem [2], which says t_w(G) is equal to the determinant of the Laplacian of G without vertex w. Let’s try to understand the idea behind this equation.

The mapping from an arborescence to an Eulerian path can be made by the following. Let r be the root of a possible arborescence of G. Now, let r be the reference starting point for an Eulerian path in G (note this is just for reference, since there’s no starting point in a circuit).

We say that an Eulerian path is associated with a given arborescence if for each vertex v, the last edge passing through v, say (v, v’), belongs to the arborescence. This is more clear with an example. Consider the digraph from Figure 2. Here we’ll consider the arborescences rooted in A.

graph.png

Figure 2: Directed Graph

This graph has 2 possible arborescences depicted on the left in Figures 3 and 4. In Figure 3, we can see that the edge (B, D) has to be visited before (B, C) because (B, C) is in the arborescence.

arborescence1.png

Figure 3: One of the arborescences of G and a corresponding Eulerian circuit

Now, in Figure 4, because it’s (B, D) that’s in the arborescence, it has to be visited after we visit (B, C).

arborescence2.png

Figure 4: Another of the arborescence of G and a corresponding Eulerian circuit

Note that there can be more than one Eulerian path to a given arborescence. If B had more out-edges, we’d have multiple choices, since the arborescence only specifies the last edge to be taken, not the intermediate ones. More specifically, imagine B had k out-edges. Then we could traverse the first k-1 in any combination of orders, which leads to a total of (k – 1)! ways of doing so.

The same applies to all other nodes. Due to properties of Eulerian circuits, the choice of the out-edge at a given node can be seen as independent of the choice at other nodes, so the total possible Eulerian circuits corresponding to any arborescence is given by the product of the degrees from equation (1), namely:

\prod_{v \in V}(deg(v) - 1)! (2)

The key property of categorizing Eulerian circuits into arborescence classes is that they’re disjoint, that is, a Eulerian circuit corresponds to exactly one arborescence. This, in conjunction with the fact that the vertices degrees in Equation (2) are from the original graph, and hence independent of a arborescence, lead us to the two independent factors in equation (1).

Counting Eulerian Circuits in undirected graphs

Counting Eulerian circuits in undirected graphs is a much harder problem. It belongs to a complexity class known as #P-complete. This means that:

  1. It belongs to the #P class, which can informally be seen as the counting version of NP problems. For example: deciding whether a given graph has an Hamiltonian circuit (path that traverses all vertices exactly once) is a problem in the NP class. Counting how many Hamiltonian circuits existing in that graph is the corresponding problem in the #P class.
  2. It belongs to the #P-hard class, which means that any problem in #P can be reduced to it via a polynomial-time transformation.

Valiant proved the first condition in [5] while Brightwell and Winkler proved the second in [6] by reducing another #P-complete problem (counting Eulerian orientations) to it.

Note that a problem in the #P class is as hard as the equivalent class in NP, because we can reduce a problem in NP to #P. For example, we can decide whether a graph has an Hamiltonian circuit (NP problem) by counting the number of circuits it has (#P problem). The answer will be “yes” if it the #P version returns a number greater than 0 and “no” otherwise.

Because the problem of counting Eulerian circuits in an undirected graph being in #P, we can conclude that there’s no efficient (polynomial time) algorithm to solve it unless P = NP.

Conclusion

In this post we covered Eulerian circuits in an informal way and provided an implementation for it in JavaScript. I spend quite some time to setup the JavaScript environment to my taste. I strongly prefer using typed JavaScript (with Flow) and using ES6 syntax. I decided to write it in JavaScript with the potential to create a step-by-step interactive tool to demonstrate how the algorithm works.

I was familiar with the concept of Eulerian circuits, but I didn’t remember the algorithms to solve it, even though I was exposed to one of them in the past. It was a good learning experience to write the code from scratch to really understand what I was doing.

This is the first time I see the #P complexity class. It’s always nice to learn about new theories when digging further on a specific topic.

References

[1] Bioinformatics Algorithms: An Active Learning Approach – Compeau, P. and Pevzner P.
[2] Matrix-Tree Theorem for Directed Graphs – Margoliash, J.
[3] Circuits and trees in oriented linear graphs – Aardenne-Ehrenfest, van T., Bruijn, de N.G.
[4] Wikipedia – BEST Theorem
[5] The complexity of computing the permanent – L. G. Valiant
[6] Counting Eulerian circuits is #P-complete – Brightwell, G. and Winkler, P.

Advertisements

Writing JavaScript using OCaml

output

In this post we’ll explore BuckleScript, a framework that enables developers to write JavaScript applications using OCaml syntax.

Note that BuckleScript is not a way to convert general OCaml code into JavaScript, but rather write JavaScript applications using OCaml’s syntax and type system. The end runtime and libraries are still JavaScript.

For example, if you want to work with dates, you wouldn’t look for the corresponding OCaml library, but instead include the JavaScript Date module in your OCaml code.

Let’s first compare BuckleScript with similar frameworks to understand better why it exists and when to use it. Then we’ll follow with several examples to get a better grasp of BuckleScript and how it translates to JavaScript.

It’s better to have some knowledge of OCaml, and we have written extensively about it, but it’s also easy enough to get the gist of the benefits from our simple examples.

Comparisons

BuckleScript vs js_of_ocaml

js_of_ocaml is another framework that connects the OCaml and JavaScript worlds.

According to [4] both projects aim to compile OCaml code to JavaScript. The differences pointed are:

* js_of_ocaml takes low-level bytecode from OCaml compiler, BuckleScript takes the high-level rawlambda representation from OCaml compiler
* js_of_ocaml focuses more on existing OCaml ecosystem(opam) while BuckleScript’s major goal is to target npm

A simplistic way to see the differences is that BuckleScript is for JavaScript developers to start using better language features from OCaml, while js_of_ocaml is for OCaml developers to be able to run their code in the browser.

BuckleScript vs ReasonML

ReasonML is often mentioned together with BuckleScript, which makes it a bit confusing to understand their differences at first.

Compared do OCaml, ReasonML has a friendlier syntax (for people coming from JS) and better support for JSX (inline XML tags). The difference in syntax is significant that we are really talking about some dialect of OCaml.

Note that BuckleScript and ReasonML are complementary. BuckleScript can compile either OCaml or ReasonML to JavaScript. ReasonML in mostly about the syntax.

BuckleScript vs TypeScript

Typescript is a framework for adding types to a JavaScript codebase, similar to Flow.

At first glance, TypeScript and BuckleScript seem to serve different purpose, but one of the main advantages of using OCaml to write JavaScript applications is to provide type safety.

In [2], Hongbo provides a comparison between the two systems. Some of the pros and cons raised are:

TypeScript

  •  Pros:
    • Designed for JS, easy inter-operate with JS
  • Cons:
    • Compiler slow (not scalable for FB/Google scale)
    • Verbose, very limited type inference
    • Start up time slow (very hard to build traditional build tools)
    • Types are only used for tooling – soundness is not the design goal, not very reliable
    • No code optimizations

BuckleScript

  • Pros:
    • Compiles much faster, scales better
    • Sound type system, global type inference
    • Types are used in code optimization, optimizing compiler
    • Outputs to JavaScript but can also generate native backend for X86, ARM
  • Cons:
    • Learning curve is higher compared with TypeScript

The author also suggests that OCaml is a better language than JavaScript, without providing more details.

Setup

The easiest way to try out examples is through this BuckleScript playground.

To try it locally, we can follow [5]. It generates an initial bsconfig.json file and the compilation process can be done via npm. The process involve converting a OCaml file (.ml) to a JavaScript file (.bs.js). The latter should be included in your application, but both should be considered part of your codebase and hence committed.

Examples

Because most people learning BuckleScript are familiar with JavaScript and less familiar with OCaml, I’ll provide the reverse examples: how to do implement a given JavaScript snippet of code in OCaml?

console.log(‘Hello World’)

How to print to stdout using BucketScript? The most basic program is simply printing a Hello World. In OCaml we would do “Print.printf” to print to stdout, but OCaml modules are not readily available for BuckleScript. We can use the Js.log() function for this purpose:

This maps to:

Note that Js log is a library provided by BuckleScript to make the integration with JavaScript more seamless.

Looping over an Array

We can use an imperative-style code in OCaml:

Note that arrays have a different syntax than in JavaScript: [| 10; 20; 30 |]. If we generate this code we get:

The interesting thing here is that we do have access to some basic OCaml libraries, for example Array.

Since we are dealing with a functional language, we might as well use some more idiomatic OCaml code. If we are to translate the code to native JavaScript structures, we have to use functions from the Js module:

Which maps to

Looping over a List

So Array is the basic structure to represent a sequence of items in JavaScript but in OCaml it’s List. What happens if we use List instead?

We simply dropped the | to use a List instead of Array and now we can use the more standard fold_left instead of reduce. This translates to:

This is very interesting! We studied functional data structures OCaml extensively in the past and the key concept is that by default data structures are persistent. If we look closely what is being passed to sum(), we see it’s a linked-list like structure: [10, [20, [30, 0]]], the 0 being used to indicate the end.

Modules

Another common pattern in JavaScript is to group a bunch of functions inside an Object. The Object name serves as namespace or a module name that can be referenced elsewhere. A natural mapping here are the OCaml modules:

which maps to:

Note that the signature is only used for compilation/transpilation checks and is erased from the final JavaScript code. Another curious thing is that the functions are exported as arrays to MyModule. To me it would make more sense to export them as Object.

Currying Functions

A nice feature from functional languages is the concept of currying, which allow us to perform partial application of functions. For example, if we have a function that adds two numbers, we can derive an increment function that partially applies sum by binding the first value to 1:

The resulting JavaScript code is:

Note that we can already perform partial application of function in vanilla JavaScript via bind(), but the syntax is not as neat:

Chaining Functions

Another neat syntax from OCaml is the chaining operator. One can chain functions via the |> operator: the result of the lefthand function is passed as argument to the righthand function.

A common use for this is when we want to apply a series of function in sequence without assigning to a variable.

For example, if we have a function to normalize a given string by converting it to lower case, trimming the leading and trailing whitespaces and also converting intermediate spaces to underscores, we could write, in JavaScript:

An alternative would be to nest the calls so we don’t have to repeat the variable, but that would hurt legibility. In OCaml, we could chain these calls:

Note the [%bs.re] tag. It is a macro that allows us to write regexes using JavaScript syntax. We can avoid repeating the module names if they are all the same:

Using JavaScript libraries

One of the main selling points of BuckleScript is that you can adopt it gradually, module by module. This is possible because we can require JavaScript modules inside our OCaml code. For example, if we were to convert the following code that reads a file asynchronously in JavaScript:

We could do:

Here, the OCaml code is more verbose but we provided a stronger contract by typing the function readFile(). The syntax for importing modules is

Note: if is the same as , the latter can be omitted.

Objects as Maps

In JavaScript Objects are often used either as maps (key-value) or records (entries of distinct fields). In OCaml we can rely on types to enforce the specific use we want via types. In the example below, we declare a map with type string to int. If we try to set a value with a different type we get a compilation error:

Objects as Records

To represent an Object as a record, we can use a OCaml record type syntax:

We added the [@bs.optional] to indicate that a field is optional. We also added the [@@bs.deriving abstract] attribute to indicate it should not be directly instantiated like

Instead, it generates a “constructor” function. In the same way, the properties of a record are not directly available. They also need to go through intermediate auto-generated accessors:

The generated JavaScript code translates to an Object:

The interesting thing is that the generated JavaScript Object is mutable, but within the OCaml code, the record cannot be modified. It’s possible to mark it mutable, but the default immutability makes it easier to reason about code.

Downsides

The benefits being stated, there are two main potential drawbacks of using BuckleScript.

Mixed languages. Adopting BuckleScript will cause the codebase to have a mix of different languages, which makes it harder for new developers to ramp up. Of course this can be mitigated by converting the entire codebase to use OCaml.

Debugging. We’ll be writing code in on language but it’s another language that will end up being executed. If a problem happens in the underlying JavaScript code, how to figure out which OCaml code is generating the faulty code?

BuckleScript tries to solve this issue by preserving the structure of the code as much as possible so that it’s easier to understand what parts maps to what. This works well if we are using the Js wrappers that resembles the JavaScript code patterns, but it’s unclear how easy the structure is preserved if we use more of OCaml persistent data structures or functional programming patterns like currying.

One possible improvement would be to add some traceability to the generated JavaScript code such that you won’t need to look at the JavaScript code most of the time, in the same way that one doesn’t usually need to inspect Assembly code when their C++ application crashes.

Conclusion

In this post we did a comparison of BuckleScript with different frameworks and libraries to understand why and when to use it. Following that, we studied a few basic examples which one might encounter in a day-to-day JavaScript programming and how to express that in OCaml.

Through these examples, we saw the OCaml type system in use, as well as some neat syntax and immutable data structures, which can lead to more readable, succinct and reliable code.

References

[1] BuckleScript Examples
[2] BuckleScript: An OCaml to JavaScript compiler (Hacker News discussion)
[3] BuckleScript playground
[4] BuckleScript/bucklescript – Comparisons
[5] BuckleScript – New Project

Monitoring System using Raspberry Pi

A few months ago, I described how to use a Raspberry Pi to setup a domestic server. In this post we’ll see how to take snapshots using the Pi and a camera, so combining it with the server, we can create a simple monitoring system accessible from a phone.

DSC_0491

Node.js server

First thing we have to do is to run another server, separate from nginx to handle the requests. nginx is a very basic server that can’t do much else, but it’s capable of forwarding requests/responses.

I chose Node.js because I’m familiar with JavaScript and haven’t used much Node.js, such it was a good learning opportunity.

To install it, first we need to find out which architecture our Pi’s processor is. The Raspbian has the command arch which tells returned armv6l in my case. This tells us which version to pick from the Node.js site. We can install it manually by downloading and unpacking it:

cd ~/Downloads
wget https://nodejs.org/dist/v6.9.2/node-v6.9.2-linux-armv6l.tar.xz
tar xf node-v6.9.2-linux-armv6l.tar.xz

We can put it in a local directory (no need for sudo). We move it to a convenient place:

mv ~/Downloads/node-v6.9.2-linux-armv6l workspace/node

and then make sure that path is being looked for when running executables:

export PATH="$HOME/workspace/node/bin:$PATH"

Now we need to tell nginx to forward requests to another IP address, which our Node.js server will be listening to, by adding this to `/etc/nginx/sites-available/default`:

Finally we create a very simple Node.js application that listens to localhost (127.0.0.1), port 3000, and servers an HTML file as response:

Camera

I got the Raspberry PI 5MP Camera Board Module to take the still photos. It’s an official product from Raspberry pi, so it comes with some nice integrations.

In particular, I was interested in the raspistill utility.

It has a few parameters and image transformation options, and we can specify an output file to which an image will be saved in JPEG format.

raspistill -w 600 -h 400 -rot 180 -o images/image.jpg

Putting things together

The basic idea of this system is that whenever we hit our webpage, it will take a snapshot using the camera and display the image.

Because it takes some time for a photo to be taken, we do it asynchronously. In the meantime it shows the latest snapshot it had on file.

To avoid overloading the Pi we debounce snapshot commands and only take a new snapshot after at least a minute since the previous one. I wrote a bash script to handle that:

Now we just need out Node.js app to call this script asynchronously:

Conclusion

This simple project was inspired by a friend, who used a Raspberry Pi + camera to monitor his apartment during his vacation. I copied the idea with the intention to learn about the process and put the Pi I bought a couple of years back to use.

I haven’t actively used this system, but it was fun to work on it and I’m now looking forward on working on other “home” projects.

Notes on JavaScript Interpreters

In a previous post, Notes on how browsers work, we studied the high-level architecture of a browser, specifically the Rendering Engine. We used the following diagram,

In this post we’ll focus on the JavaScript interpreter part. Different browsers use different implementations of the interpreters. Firefox uses SpiderMonkey, Chrome uses V8, Safari uses JavaScriptCore, Microsoft Edge uses Chakra, to name a few. V8 is also used as a standalone interpreter, most notably by Node.js.

These interpreters usually comply to one of the versions of the ECMAScript, which is a standardization effort of the JavaScript language. ECMA-262 is the document with the specification. As it happens with other languages, from their first inception, design flaws are identified, new development needs arise, so the language spec is always evolving. For that reason, there are a few versions of ECMAScript. Most browsers support the 5th edition, also known as ES5.

There’s already the 7th edition (as of 2016), but it takes time for browsers to catch up. Because of that, JavaScript programs that are capable of translating newer specs into ES5 were created, such as Babel. The advantage of using this technique, also known as transpiling, is that developers can use newer versions of the step, such as ES6, without depending on browser support. Disadvantages include the extra complexity by adding a new step in the deployment process and it makes harder to debug since it’s hard to map errors that happen in the transformed code back to the original source.

Section 8.4 of the ECMA-262 describes the execution model of JavaScript:

A Job is an abstract operation that initiates an ECMAScript computation when no other ECMAScript computation is currently in progress. A Job abstract operation may be defined to accept an arbitrary set of job parameters.

Execution of a Job can be initiated only when there is no running execution context and the execution context stack is empty. A PendingJob is a request for the future execution of a Job. A PendingJob is an internal Record whose fields are specified in Table 25. Once execution of a Job is initiated, the Job always executes to completion. No other Job may be initiated until the currently running Job completes. However, the currently running Job or external events may cause the enqueuing of additional PendingJobs that may be initiated sometime after completion of the currently running Job.

MDN’s Concurrency model and Event Loop [2] describes this spec in a more friendly way. As in other programming environments such as C and Java, we have two types of memory available: the heap and the stack. The heap is the general purpose memory and the stack is where we keep track of scopes for example when doing function calls.

In JavaScript we also have the message queue, and for each message in the queue there is an associated function to be executed.

JavaScript Execution Model

The event loop

The execution model is also called the event loop because of the high-level working of this model:

When the stack is empty, the runtime processes the first message from the queue. While executing the corresponding function, it adds an initial scope to the stack. The function is guaranteed to execute “atomically”, that is, until it returns. The execution might cause new messages to be enqueued, for example if we call

The callback passed as the first argument to setTimeout() will be enqueued, not stacked. On the other hand, if we have a recursive function, such as a factorial:

The recursive calls should all go to the same stack, so they will be executed to completion, that is until the stack is empty. The calls provided to setTimeout() will get enqueued and be executed only after the value of the factorial gets printed.

In fact, using setTimeout() is a common trick to control the execution order of functions. For example, say that function A called B and B calls another function C. If B calls C normally, then C will have to finish before B returns, as the sample code below:

In case we want to finish executing A first, then B can call C using setTimeout(C, 0), because then it will be enqueued, and then both A and B will finish until a new message is processed from the queue, as the code below:

Web Workers

We discussed Web Workers in a previous post, in which we mentioned that it’s a separate thread that shares no memory with the main thread. In fact, according to [2], it has its own stack, heap and queue. It doesn’t violate the execution model of the main thread because communications must be done via a publisher/subscriber API, which means communicating between two threads is subject to the queueing.

V8

Beyond the spec, each JavaScript engine is free to implement feature the way they want. V8’s architecture is described in their wiki page [3], so we can study some of its key components.

Fast Property Access

In JavaScript, structures like Object can be mutated (added and removed properties) in runtime. Implementing them as hashtables can lead to performance problems when accessing properties of these structures. Compare that to compiled languages like Java in which instances of a class can be allocated with all its members in a single chunk of memory and accessing properties of objects consists in adding an offset to the object’s pointer.

V8 optimizes the Object allocation by creating hidden classes. It makes use of the fact that properties are mutated in the same pattern. For example in

In this case, we always insert the property x and then y. V8 starts with an empty class C0 when the object is first created. When x is assigned, it creates another class C1 with property x and that points to C0. When y is assigned, it creates yet another class C2 with property y that point to C1.

When the Point constructor is finished, it will be an instance of C2. It has to instantiate 3 objects one from C0, then C1, then C2.

Accessing property y is an adding an offset operation, while accessing property x is two offset operations, but still fast than a table lookup. Other instances of Point will share the same class C2. If for some reason we have a point with only x set, then it will be an instance of class C1. This sharing of structures resembles the persistent data structures that we studied previously.

Another interesting property of this method is that it doesn’t need to know in advance whether the code has indeed a pattern in mutating objects. It’s sort of a JIT compilation.

Dynamic Machine Code Generation

According to [3]:

V8 compiles JavaScript source code directly into machine code when it is first executed. There are no intermediate byte codes, no interpreter.

Efficient Garbage Collection

We wrote about V8’s memory management in a previous post.

Conclusion

In this post we revisited a bit of the history of JavaScript and ECMAScript. We also studied the event loop model that is part of the spec and finally saw some aspects of one of the JavaScript engines, V8.

References

  • [1] ECMAScript 2016 – Language Specification
  • [2] MDN: Concurrency model and Event Loop
  • [3] V8 – Design Elements

Web Workers

In this post we study Web Workers, a technology that allows JavaScript code to run in separate threads. We’ll start by exploring the API with some toy examples and at the end discuss some applications.

Introduction

By default JavaScript runs in a single (the main thread), which can be a problem for user experience if expensive operations need to be performed in code which would affect responsiveness of the UI.

The thread that runs the web worker code has some environment limitations, which includes no access to the DOM or global objects such as window. [3] contains a list of functions and methods available to a Web Worker.

Besides that, memory is not shared between the threads, having to be explicitly serialized (so it can be cloned) and passed via a method (postMessage()). This can lead to performance issues if the amount of data to be copied is large. In Transferable Objects we’ll discuss an alternative.

Workers might spawn their own workers.

For some reason, the term "web workers" reminds me of these creatures from Spirited Away

For some reason, the term “web workers” reminds me of these creatures from Spirited Away :)

Let’s work a simple example. Imagine we have two files, main.js and worker.js (in the same directory):

main.js:

// Initializes the worker with the
// JavaScript code in worker.js
var myWorker = new Worker("worker.js");

// Post a message to the worker
myWorker.postMessage("Hello worker!");

// Callback that gets called when a 
// message is received from the worker.
myWorker.onmessage = function(/*MessageEvent*/ message) {
  console.log(
    'Message received from worker script: ', 
    message.data
  );
}

worker.js

onmessage = function(/*MessageEvent*/ message) {
  console.log(
    'Message received from main script: ', 
    message.data
  );
  postMessage("Hello main!");
}

Transferable Objects

By default data is copied when sending information back and forth between the main thread and the worker, using a process called structural cloning.

The serialization/deserialization can be expensive, for which case there is an alternative: transferrable objects. More specifically, we can work with ArrayBuffers, which can be “transferred” instead of cloned, which is a more performant operation, more precisely, O(1).

We’ll first cover ArrayBuffers and then see how to apply it in a context of Web Workers.

ArrayBuffer

According to [5]:

The ArrayBuffer is a data type that is used to represent a generic, fixed-length binary data buffer. You can’t directly manipulate the contents of an ArrayBuffer; instead, you create a typed array view or a DataView which represents the buffer in a specific format, and use that to read and write the contents of the buffer.

ArrayBuffer basically represents an unstructured array of bits, which, to have any meaning/interpretation, needs a view, which can be an array of 32-bit unsigned ints or 16-bit unsigned ints. In the example below we create an array buffer of 100 bytes.


// Number of bytes or number of elements
var buffer = new ArrayBuffer(100);

// A 32-bit unsigned int array of length 10 (i.e. 40 bytes), starting 
// from byte 0 at the array buffer
var int32View = new Uint32Array(buffer, 0, 10);
// A 16-bit unsigned int array of length 20 (i.e. 40 bytes), starting 
// from byte 0 at the array buffer
var int16View = new Uint16Array(buffer, 0, 20);

// Fill in the 16-bit array with 0, 1, 2...
for (var i = 0; i < int16View.length; i++) {
  int16View[i] = i;
}

// The memory is shared because we're reading from the same chunk of 
// the byte array.
for (var i = 0; i < int32View.length; i++) {
  console.log("Entry " + i + ": " + int32View[i]);
}

This is a very interesting model. In ArrayBuffers one explicitly work with the serialized form of the data and create views on top of them. I’m used to work with the views-first, that is, create a class representing some data and eventually add serialization/deserialization methods. One advantage of working with serialized data is that we don’t need to write the serialization methods, only the deserialization. The major disadvantage is that you need to know upfront how much memory you’ll have to use.

Transferrable objects

We can extend the example above to be used between a worker and the main thread.

worker.js:

var buffer = new ArrayBuffer(100);
var int16View = new Uint16Array(buffer, 0, 20);

for (var i = 0; i < int16View.length; i++) {
  int16View[i] = i;
}

console.log('array buffer size', buffer.byteLength); // 100
postMessage(buffer, [buffer]);
console.log('array buffer size?', buffer.byteLength); // 0

and in the main.js:

...
myWorker.onmessage = function(e) {
  buffer = e.data;
  // Number of bytes or number of elements
  var int32View = new Uint32Array(buffer, 0, 10);

  for (var i = 0; i < int32View.length; i++) {
    console.log("Entry " + i + ": " + int32View[i]);
  }
}

By logging the output to the console, we can see the main thread received the values written to the array buffer by the worker and after the worker transferred the buffer data, it was emptied out.

Note in the postMessage() API, we provide buffer as a the first parameter and then it also appears in the list indicating it should be transferred, not copied. Having to pass it twice is a bit confusing in my opinion, but this is to allow the example below, in which the objects transferred are nested inside another structure (in this case an object) and we want to transfer both buffer1 and buffer2 but not the top-level object. I’m not sure which use case the API designers had in mind, though.

postMessage(
  {'key1': buffer1, 'key2': buffer2}, 
  [buffer1, buffer2]
);

Error Handling

If any errors are uncaught by the worker, it can be caught from the main thread through the onerror callback:

myWorker.onerror = function(e) {
  console.log('an error occurred:', e);
  e.preventDefault();
}

Where e is an instance of ErrorEvent. We can simulate an error on the worker.js code:

throw new Error("Some error occurred");

Terminating

The main thread can terminate the worker

worker.terminate();

or the worker can terminate itself:

close();

Applications

A lot of the examples using Web Workers involve doing some fake expensive calculation in the worker thread, but I haven’t found any real-world application.

StackOverflow offers some ideas, including one that is dimissed as bad uses of Web Workers (polling) or from projects that are long defunct (Mozilla Skywriter). The main issue is that most of time heavy processing is done on the server.

One idea that came to mind is to use web-workers in React. React defers a lot of DOM work to the end by working with the concept of a virtual DOM. Web-workers don’t have access to the DOM but they do have it for virtual DOMs. Turns out this idea has been explored already [7, 8] but there were some technical difficulties in implementing events.

Conclusion

In this post we studied Web Workers and some examples utilizing it. I learned a few other related topics like ArrayBuffers, and Compositor Workers. I was a bit disappointed with the lack of compelling applications using Web Workers. I’ll try it out in some of my projects and see if I can any benefits from it.

References

Some of the code presented in this post is available on Github.

[1] MDN – Using Web Workers
[2] HTML5 Rocks – The Basics of Web Workers
[3] MDN – Functions and classes available to Web Workers
[4] Google Developers – Transferable Objects: Lightning Fast!
[5] MDN – JavaScript typed arrays
[6] StackOverflow – What are the use-cases for Web Workers?
[7] React Custom Renderer using Web Workers
[8] GibHub React Issues: Webworkers #3092
[9] Compositor Workers Proposal

Developing a project in Javascript

I’ve worked with several small Javascript side-projects. The amount of Javascript libraries and frameworks is overwhelming, especially in recent times.

In the past, I would write a couple of stand-alone Javascript files from scratch. As applications get bigger and more complex, new libraries for improving project development have been created.

I decided to look around for best practices to develop open source JavaScript applications these days. This post is a set of notes describing my findings.

We’ll discuss libraries that solves different needs for software projects including libraries, modularization, automated building, linter and finally testing frameworks.

Packages/Libraries

npm-logo

Javascript doesn’t have an official package management. There has been an effort to standartize how Javascript libraries are distributed. With Node.js, came its package manager that npm (node package manager), that was initially indented for Node.js packages, but can also be used for general libraries, independent of Node.js itself.

To work with npm, we need to write a configuration file, called package.json. In this file, which is a JSON, we can define metadata when building a library, including title, version and the dependencies of other libraries. A sample configuration looks like this:

{
  "name": "my_project",
  "version": "0.0.1",
  "description": "Description of my project",
  "devDependencies": {
    "browserify": "~5.11.0",
    "uglifyify": "^2.5.0"
  }
}

Dependencies

In the dependencies, we have to specify the versions. A version (more specifically semantic version or semver) consists of three parts numbers separated by '.'. The last number should be bumped on small changes, like bug fixes, without change on functionality. The middle number, aka minor version, should be bumped whenever new features are added, but that are back-compatible. Finally, the first number, aka major version, should be bumped whenever back-incompatible changes are made [1].

In package.json, you can specify a hard-coded version number or be more relaxed. If we use the '~' in front of the version, for example ~5.11.0, it means we accept the most recent version of form 5.11.x. On the other hand, if we use the '^', for example ^2.5.0, we accept the most recent version of the form 2.x.x.

The dependencies of a package can be either production or development dependencies. In our case, browserify and uglify are only used for building our package and not a dependency our code has, so it doesn’t make sense to ship those to the user of the library.

To parse the configuration in package.json, we can run:

npm install --save-dev

This will download the dependencies listed under devDependencies locally in the directory node_modules (created in the same directory the package.json is). To run the production dependencies, we can do:

npm install --save

Modules

Modules are useful for splitting code in related units and enables reuse. JavaScript doesn’t have a native module system, so some libraries were built to address the modularization problem. There are three main types of module systems around: AMD (Asynchronous Module Definition), CommonJS and the ES6 loader. Addy Osmani discusses the differences between those in [2].

There are several implementations for modules, including RequireJS (AMD), browserify (uses the node.js module system, which uses CommonJS). SystemJS is able to work with all these different types.

I had been working with browserify, but it seems better to adopt the ES6 standards, so I’ve switched to SystemJS. Another advantage of SystemJS is that is also allows ES6 syntax by transpiling the code using BabelJS.

To use SystemJS we need to define a configuration file (analogous to package.json), named config.js (don’t worry about it for now).

Exporting

Named exports. We can have multiple export statements within a single file or provide all exports within a single statement [3]. Example:

/** 
 * my_module.js
 */
function foo() {
  console.log('foo');
}
function bar() {
  console.log('bar');
}
// Nested export
export {
  foo,
  // baz is what will be available externally
  bar as baz,
};
// Flat, inline export
export function derp() {
  console.log('derp');
}

Default exports. We can export default items in a module (the reason will be clear when we talk about importing next). We show the syntax for both the inline and the named exports:

// Nested 
export {
  foo as default,
};
// Inline export
export default function() {
  console.log('derp');
}

Importing

We have 3 basic ways to import from a module.

1. Name all items we want to pick from the module.

import {foo, baz as 'bar'} from 'my_module';

2. Do not provide any specific item, in which case we’ll import the default export:

import the_default_export from 'my_module';
// Equivalent to
import {default as 'the_default_export'} from 'my_module'

3. Import all item from the module under a ‘namespace’, basically

import * as everything from 'my_module'
// 'everything' is an object 
// {
//    foo,
//    baz,
//    default,
//    derp
// }

NPM Packages

To be able to import NPM packages, we have to download them first and for that we can use the jspm.io tool. For example, I was interested in the point-in-polygon package. Instead of running the npm command, we can use jspm:

// Download jspm 
npm install --global jspm
// Install the desired npm package
jspm install npm:point-in-polygon

Running jspm will write to the config.js file (it creates one if it doesn’t exist). This will write a map from where the module got installed and the name you can use in code to import it. Since npm packages use the CommonJS syntax and SystemJS understands it, in code we can simply do:

import {default as pointInsidePolygon} from 'point-in-polygon';

Building

The process of running commands like SystemJS can be automated. One idea is writing Makefiles to run command line. Another option is to use JavaScript frameworka, such as Grunt and Gulp. In this post we’ll stick to Grunt.

To configure a build, we need to provide another configuration file, called Gruntfile.js (should live in the same directory as the package.json). You provide an object to grunt.initConfig(), which contains tasks configurations.

module.exports = function(grunt) {

  var taskList = ['systemjs', 'uglify'];

  grunt.initConfig({
    pkg: grunt.file.readJSON('package.json'),
    systemjs: {
        options: {
            'configFile': './config.js'
        },
        dist: {
            'src': '<root JS file>',
            'dest': '<compiled file with all dependencies together>'
        }
    },
  });

  grunt.loadNpmTasks("grunt-systemjs-builder");
  grunt.registerTask('default', ['systemjs']);
};

With grunt.registerTask('default', ['systemjs']) we’re telling grunt to run the systemjs task whenever we run grunt from the command line.

It’s possible to run grunt automatically upon changes to JS files via the watch task. First, we need to install the plugin:

npm install grunt-contrib-watch --save-dev

Then we configure it in Gruntfile.js:

grunt.initConfig({
    ...
    watch: {
      browser_js: {
        files: ['**/*.js', '!node_modules/**/*', '!dist/**/*'],
        tasks: taskList,
      }
    },
});
...
grunt.loadNpmTasks('grunt-contrib-watch');

Here taskList is an array of task names. It can be the same one provided to the default task. Make sure to blacklist some directories like dist, which is the output directory of the systemjs task (otherwise we’ll get an infinite loop). Finally we run:

grunt watch

Now, whenever we perform a change to any JS file it will run the task.

Minification

Since Javascript code is interpreted on the client (browser), the source code must be downloaded from the server. Having a large source code is not efficient from a network perspective, so often these libraries are available as a minified file (often with extension min.js to differentiate from the unminified version).

The source code can be compressed by removing extra spaces, renaming variables, etc, without changing the program. One popular tool to achieve this is UglifyJS.

To use it with Grunt, we can install the grunt-contrib-uglify module:

npm install grunt-contrib-uglify --save-dev

And in our Gruntfile.js:

grunt.initConfig({
    ...
    uglify: {
        compact: {
            files: {
                './dist/<project>.min.js': ['./dist/<project>.js']
            }
        }
    },
    ...
}
grunt.loadNpmTasks('grunt-contrib-uglify');
grunt.registerTask('default', ['systemjs', 'uglify']);

Linting

Lint tools help us avoiding bugs, sticking to code conventions and improving code quality. One popular tool for linting is jshint. Other alternatives include jslint. JSHint has a Grunt plugin:

grunt.initConfig({
    ...
    jshint: {
        files: [
            '**/*.js',
            '!node_modules/**/*',
            '!jspm_packages/**/*',
            '!dist/**/*'
        ],
        options: {
            'esnext': true,
        }
    },
    ...
}
grunt.loadNpmTasks('grunt-contrib-jshint');
grunt.registerTask('lint', ['jshint']);

The basic configuration here makes sure to blacklist “production” directories like node_module and dist. Also, since we’ve been adopting ES6, we can set the esnext flag to tell jshint to account for the new syntax.

We probably don’t want to run the lint every time we update the JS file. We can run it less often, for example before sending code for review. Thus, we can create a separate registry for it using grunt.registerTask('lint', ['jshint']). We can now run jshint via the command line:

grunt lint

Testing

Another practice to avoid bugs is testing, including unit tests. Again, there are several libraries and frameworks that makes the job of unit testing less painful, for example easy ways to mock dependencies so we can test isolated functionality. In this case, I’ve picked Jest, which has a grunt task available in npm, which we can install via:

npm install grunt-jest --save-dev

(NOTE: this will also install the jest-cli binary which depends on a Node.js version >= 4, so you might need to update your Node.js).

We can configure the grunt task with default configs in the following way:

grunt.initConfig({
    ...
    jest: {
    },
    ...
}
grunt.loadNpmTasks('grunt-jest');

With this setup we can run the following command to run jest tests:

grunt jest

Unfortunately, jest uses the CommonJS require syntax. It used to be possible to use babel-jest but after version 5.0 this setup doesn’t work anymore.

Conclusion

The JavaScript environment changes extremely fast and it’s very hard to keep on top of the latest frameworks/practices, etc.

To make things worse, for every task like module system, linting, testing, there are many alternatives and none of them is a clear best choice.

I’m happy that there’s an effort of standardization with ES6. I think the more we stick to one convention the more we reduce re-inventing the wheel, the less syntax differences to learn, etc.

References

[1] Semantic versioning and npm
[2] Writing Modular JavaScript With AMD, CommonJS & ES Harmony
[3] ECMAScript 6 modules: the final syntax

Further Reading

Generation Javascript. Manuel Bernhardt discusses the current state of JavaScript libraries and how the low friction nature of developing in JavaScript has its downsides.

Essential JavaScript Links. Eric Elliott’s list of links to several books, articles and tools concerning to JavaScript. It provides a much more comprehensive list of options for the different topics we covered in this post.

Notes on how browsers work

In this post, I’d like to share some notes from articles about how modern browser works, in particular the rendering process.

All the aforementioned articles are based on open source browser code, mainly Firefox and Chrome. Chrome uses a render engine called Blink, which is a fork to Webkit used by other browsers like Safari. It seems to be the one with more documentation, so we’ll focus on this particular engine. The main sources are [1, 2 and 3]. They cover this subject in much more depth, so I recommend the read for those interested in more details.

The Big Picture

First, let’s take a look in an high-level architecture of a browser:

layers

As we said before, our focus will be on the Rendering Engine. According to [1], it consists of the following steps:

flow

and we’ll use these steps as our sections to come.

1. Parsing HTML to construct the DOM tree

This step consists in converting a markup language in text format into a DOM tree. An interesting observation is that HTML is not a Context Free Grammar because of its forgiving nature, meaning that parsers should accept mal-formed HTML as valid.

The DOM tree is a tree containing DOM (Document Object Model) elements. Each element corresponds to a tag from the HTML markup.

As we parse the HTML text, we might encounter the tags that specify two common resources that enhance basic HTML: CSS and JavaScript. Let’s cover those briefly:

Parsing CSS

CSS is a Context-free Grammar, so Webkit is able to rely on tools like Flex (lexical analysis generator) and Bison (parser generator) to parse the CSS file. The engine uses hashmaps to store the rules, so it can perform quick lookups.

Parsing Javascript

When the parser encounters a script tag, it starts downloading (if it’s an external resource) and parsing the JavaScript code. According to specs, downloading and parsing occurs synchronously, blocking the parsing process of the HTML markup.

The reason is that executing the script might trigger the HTML body to be modified (e.g. via document.write()). If the JavaScript doesn’t modify the HTML markup, Steve Souders suggests moving the script tags to the bottom of the page or adding the defer attribute to the script tag [4]. He has two test pages to highlight the load times for these distinct approaches: bottom vs. top.

In practice, according to Garsiel [1], browsers will do speculative parsing, trying to download script files in parallel to the main HTML parsing. This process does not start though until all stylesheet files are processed.

2. Render tree construction

While constructing the DOM tree, the browser also builds another tree called render tree. Each node in this tree, called render object, represents a rectangular area on the screen.

There’s not necessarily a 1:1 correspondence between DOM nodes and render nodes. For example a select tag has multiple render nodes, whereas hidden DOM elements (with the CSS property display set to none) do not have a corresponding render node.

Since each node represents a rectangle, it needs to know its offset (top, left) and dimensions (height, width). These values depend on several factors, including the CSS properties like display, position, margin and padding and also the order in which they appear in the HTML document.

The process of filling out these parameters is called the layout or reflow. In the next section we’ll describe this process in more details.

3. Layout of the render tree

Rectangle size. For each node, the size of the rectangle is constructed as follows:

* The element’s width is whatever value is specified in the CSS or 100% of the parent’s width
* To compute the height, it first has to analyse the height of its children, and it will have the height necessary to enclose them, or whatever value is specified in the CSS.

A couple of notes here: the height is calculated top-down, whereas the width is calculated bottom-up. When computing the height, the parent only looks at the immediate children, not descendants. For example, if we have

<div style='background-color: green; width: 400px'>
  A
  <div style='background-color: red; width: 500px; height: 100px'>
    B
    <div style='background-color: blue; height: 150px'>
      C
    </div>
  </div>
</div>

The green box (A) will have the height enough to contain the red box (B), even though the blue box (C) takes more space than that. That’s because B has a fixed height and C is overflowing it. If we add the property overflow: hidden to B, we’ll see that box A is able to accommodate B and C.

Some properties may modify this default behavior, for example, if box C is set to position absolute or fixed, it’s not considered in the computation of B’s height.

Rectangle offset. To calculate the offset, processes the nodes of the tree in order. Based on the elements that were already processed, it can determine its position depending on the type of positoning and display the element has. If it’s display:block, like a div with default properties, it’s moved to the next and the left offset is based on the parent. If it’s display is set to inline, it tries to render in the same line after the last element, as long as it fits within the parent container.

Some other properties besides display can also change how the position is calculated, the main ones being position and float. If position is set to absolute and the top is defined, the offset will be relative to the first ancestral of that component with position set to relative. The same works for the property left.

Whenever a CSS changes happens or the DOM structure is modified, it might require a new layout. The engines try to avoid the re-layout by only processing the affected subtree, if possible.

4. Painting the render tree

This last step is also the most complex and computationally expensive. It requires several optimizations and relies on the GPU when possible. In this step, we have two new conceptual trees, the render layer tree and the graphics layer tree. The relationship of the nodes in each tree is basically:

DOM Element > Render Object > Render Layer> Graphics Layer.

Render layers. exist so that the elements of the page are composited in the correct order to properly display overlapping content, semi-transparent elements, etc. A render layer contains one or more render object .

Graphics layers. uses the GPU for painting its content. One can visualize layers by turning on the “Show composited layer borders” in Chrome DevTools (it’s under the Rendering Tab, which is only made visible by clicking on the drawer icon >_). By default, everything is rendered in a single layer, but things like 3D CSS transforms trigger the creation of new layers. Wiltzius [2] provides a sample page where one can visualize an extra layer:

http://www.html5rocks.com/en/tutorials/speed/layers/onelayer.html

A render layer either has its own layer, or inherits one from its parent. A render layer with its own layer is called compositing layer.

Rendering process. occurs in two phases: painting, which consists of filling the contents of a graphics layer and compositing (or drawing) which consists in combining graphics layers into a single image to display in the screen.

Conclusion

I was initially planning to study general JavaScript performance profiling. In researching articles on the internet, I’ve found a number of them are related to making websites more responsive by understanding and optimizing the browser rendering process. I’ve realized there was a lot to be learned about this process and I could benefit from studying this subject.

A lot of the articles come from different sources, but a few authors seem to always been involved in them. Paul Irish and Paul Lewis are two of the folks who I’ve seen in several articles (see Additional Resources), and they have a strong presence online and might be worth following them if you’re interested in the subject.

References

[1] HTML5 – How Browsers Work: Behind the scenes of modern web browsers
[2] HTML5 Rocks – Accelerated Rendering in Chrome
[3] Chromium Design Documents – GPU Accelerated Compositing in Chrome
[4] High Performance Web Sites: Essential Knowledge for Front-End Engineers

Additional Resources

Chrome Developer Tools (for rendering):

* Speed Up JavaScript Execution
* Analyze Runtime Performance
* Diagnose Forced Synchronous Layouts
* Profiling Long Paint Times with DevTools’ Continuous Painting Mode

Google Web Fundamentals:

* Stick to compositor-only properties and manage layer count
* Avoid large, complex layouts and layout thrashing

Browser rendering performance:

* CSS Triggers – To determine whether a CSS property triggers layout
* Accelerated Rendering in Chrome
* The Runtime Performance Checklist
* How (not) to trigger a layout in WebKit

General JavaScript performance (mainly for V8):

* Performance Tips for JavaScript in V8
* IRHydra2 – Displays intermediate representations used by V8