# Browser Performance

In this post we’ll go over the topic of browser performance. We’ll cover tools and techniques to identify, measure and improve performance on the client side of web applications.

Assumptions. Our tests use the Chrome browser, which means the tools likely only make sense for this specific browser (general techniques are agnostic). All the screenshots are taken either on a Linux machine running Gnome or the latest MacOS, though it should be relatively similar to Chrome in other platforms. The Chrome version used is v80.

## Analyzing Performance

### FPS Meter

This widget tracks the number of frames per second. To enable it,

In the … icon menu, click on Show console drawer:

In the Rendering tab, check the box FPS meter:

And it should looks like this:

The yellow bar represents 60 FPS, which is the threshold under which the page might look sluggish. The bar chart on the right is a histogram view of the visible line chart. The bars at the bottom represent low FPS and on the top high FPS. The number on the top are the range (max/min FPS).

Initially the numbers will be low because the is not previous data, since it only measure FPS when the page re-renders. To prime the data you need to manually trigger many visual updates, say resizing the window or specific actions on your application:

### Flame Chart

The flame chart is a very useful visualization to understand performance. It allows identifying expensive functions at a glance (long bars) and dive in into sub-functions quickly.

To use this, download and open (locally) the on_click.html example. Click on the “Performance” tab. Click the record button. Click the “Click Me” button on the page. After some number is displayed, click on Stop on the recording. We should get a chart similar to the one below:

Flame chart view

We can see the area chart at the top which is an overview of the processing happening below. The y-axis in that chart is amount of CPU.

The second chart is the flame chart. It has the bars corresponding to function calls (e.g. Event:click) and below that are the functions that were called. We see a generic Function Call because we used a anonymous arrow function like below:

Another thing to notice is that if we zoom enough, we see what looks like multiple calls of the expensive() function but in our code it is a single continuous function. I’m not sure why Chrome does that, but I’m assuming it has to do with the garbage collection kicking in.

There are other interesting alternative views at the bottom. The doughnut chart gives a breakdown on types of work being done. In our case it’s just a mix of scripting and idle:

The Call Tree view is great for the details. It allows looking at specific call chains and how much each is contributing. In our example, we can see Event: click is an expensive operation but it’s the expensive() function that is the culprit (based on the Self time):

To find specific bottlenecks the Bottom-Up view might be more interesting. If the Call Tree shows a nested tree from the root, the Bottom-up view shows the internal nodes but sorted by Self Time:

Performance Metrics

When we record a trace that includes a page load, it also includes some standard metrics like First contentful paint (FCP). The definition of some them are here [4].

It’s possible to write our own events that show up in Timings. We just need to add markers and  measure duration using the performance API [6]:

Then it shows up in the Timeline row:

The result of performance.measure() contains useful information:

### Network

When we first open the Network tab, it immediately starts recording the network activity, but since we want a full-scan of the page requests, let’s stop it, clear the data, start recording again, reload the page and then stop recording to get a report.

The sample below is from imdb.com:

Network tracing for imdb.com

Similar to the performance tab, this view provides a timeline which we can zoom in. The table below lists all network requests performed by the browser, including static resources like images, JavaScript files and async requests to fetch more data.

We can filter by request type at the top:

Here we could look for oversized images (though unlikely you’ll need to show a gigantic high-resolution image by default – so look into sending a thumbnail with a link to the larger image) or slow async requests.

Async requests are of type XHR (XmlHttpRequest), so we can filter by that:

the bars on the last column give an indication of not only the time a request took, but the dependencies. Sometimes an async request doesn’t start until a specific one finishes. If we hover over the bar of the render request, we see:

It has a nice breakdown on the timings, including DNS lookup and SSL latency (authentication). It also includes queuing time, the amount of time waiting TTFB, and then the content download.

The queuing indicates that the request was ready to go, but needed to wait on some other dependency or due to browser limits. Usually there’s a limit of active XHRs per domain the browser can have (<10) [7].

TTFB means “Time to first byte” – this means that the time it takes to start receiving data. This is indicative of the time the server took processing.

Content download is the time spend getting the resource/data. It’s correlated with the size of the data, network bandwidth/speed.

## Measuring Performance

### Duration from start

The performance API keeps track of specific events in the page loading lifetime, so we can measure durations compared to a specific point in time.

performance.now() [9] provides the timestamp since the page started to load (in rough terms – see this more precise definition).

Side note: browsers purposefully reduce the precision of these APIs to protect against attacks that rely on high-precision timing information [10].

### Custom Durations

As we mentioned in an earlier in measurement.js, we can measure durations. By using performance.mark() and performance.measure() [5].

Benchmarking

We can use the APIs above to create simple benchmarks.

Before we start: always beware of over-optimizing code! You might end up spending a lot of time to improve performance against a specific benchmark, browser or even JavaScript engine version. That’s part of the reason V8 retired the benchmark tool Octane.

That said, if you identify some code in the critical path that executes the same operation on the order of millions of times, it might be worth optimizing it. One example is changing from array methods like forEach() and reduce() to pure for-loops. We can perform several runs of the same function and get the average. I opted for using a random number generator to avoid any optimization related to caching.

The function generated an array with 10 million random floats and calculates the sum. It then runs the code 10 times. The results follow:

• for loop 12ms
• reduce 156ms
• forEach: 208ms

On Firefox it yields

• for loop 40ms
• reduce 48ms
• forEach 64ms

We can see a for loop is ~92% and ~16% faster than reduce on Chrome and Firefox respectively. Some takeaways:

• It’s highly browser-dependent! Don’t over-optimize code for one browser.
• Although the relative gains are substantial, the absolute improvements are only 100ms for an array of 10 million items – how practical is this? The creation and allocation of the array probably takes more time than this.
• This is pure computation, so no DOM access is needed, so consider using Web Workers

## Techniques

Here we discuss some techniques to improve performance. The first two are general purpose optimizations but we include here for completeness.

### Reduce Algorithmic Complexity

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

Before optimizing code, we should consider optimizing algorithm. This means reducing the big-O complexity. In a recent project, we found several wins by identifying hidden O(n^2) code and converted them to O(n).

We often need to make this tradeoff: to improve speed of code (computation) we have to use more memory. This is actually how we improved the O(n^2) to O(n): by keeping a look-up hash table for fast access.

In JavaScript a quick look-up table can be implemented via Object, or ES6 Map and Set. Other techniques involve caching (memoization).

In [8] Addy Osmani and Mathias Bynens suggest that if you’re sending large JSON payload via the network, it’s better to send them JSON-encoded instead of JavaScript objects literals:

Because the JSON grammar is much simpler than JavaScript’s grammar, JSON can be parsed more efficiently than JavaScript

### Reduce CSS selector complexity

In [17], Lewis suggests reducing the complexity of CSS selectors such as

to

The reason being:

Roughly 50% of the time used to calculate the computed style for an element is used to match selectors (…)

### Use Flexbox

In [18] Lewis suggests that flexbox is more efficient than other layouting alternatives such as float positioning. One caveat made in the article is that is node widely supported. As of 2020 however, that doesn’t seem the case anymore.

### Avoid reading DOM properties right after mutating them

In [18] Lewis explains that the Layout phase happens after the JavaScript runs:

This means that changing a DOM property and reading it afterwards, requires the browser to block and perform the layout before reading the new value. If possible, make the read before the write.

### Avoid some advanced CSS effects

This is more of a tradeoff between performance and design. In [19] Lewis mentions that properties such as box-shadow can be expensive.

It’s unclear however which specific CSS properties are expensive.

### Use Layers (but sparingly)

In [19] Lewis suggests creating layers for elements that are updated very often, to avoid these updates from causing updates in the rest of the page. One way to force the creation of a layer is:

These should only be created if there’re clear improvements in performance, since like indexes in databases, they come with memory and management cost.

### Use requestAnimationFrame()

Since JavaScript’s main thread is used for rendering, running expensive function might cause the UI to freeze. In order to prioritize the rendering, we can schedule expensive functions to run we can use requestAnimationFrame(). According to [3]:

When visual changes are happening on screen you want to do your work at the right time for the browser, which is right at the start of the frame. The only way to guarantee that your JavaScript will run at the start of a frame is to use requestAnimationFrame

In [20], Lewis provides a more specific case for the use of requestAnimationFrame(): on scroll input handlers since they block the UI.

There’s one caveat however: Input handlers execute before requestAnimationFrame(), so if it makes style changes and then we read it in requestAnimationFrame, we run into the problem described in Avoid reading DOM properties right after mutating them.

### Web Workers

As we discussed previously, web workers run in a separate thread, which prevents your application from freezing because the main thread is blocked. To understand why this happens, it’s important to know the basics of how the JavaScript execution model works.

The major caveat is that web workers don’t have access to the DOM but they do have I/O access (like XHR).

### Web Assembly

It’s possible to use lower level languages like C++ and Rust and run them on the browser by compiling them to an intermediate representation called Web Assembly (WASM).

This guide explains how to compile Rust code to Web Assembly, package it as an npm package and execute it on the browser via Webpack. I’ve tried myself and the process is relatively straightforward.

Portability. Web Assembly’s official website [14] claims it is portable across different platforms.

Existing libraries. One of the major selling points of Web Assembly is being able to run existing libraries written in languages like C/C++. In [11] PSPDFKit’s team showcase their port of a PDF renderer that previously had to be done on the server and now can be done by shipping the binary as wasm. They perform some benchmark, getting mixed results in terms of performance but had several follow-up with browser vendors and were able to improve their performance (though no subsequent analysis was available in the post).

Web Assembly’s website [13] lists other use cases.

Performance. The main benefit I can see for starting a project from scratch in say, C++ or Rust, would be performance over JavaScript. JS vs WASM’s site [12] provides a set of computations in JavaScript and wasm compiled from C++. The results are not promising: in most cases Javascript is faster.

### GPU via WebGL

WebGL (Web Graphics Library) is a JavaScript library available in most browsers, with the main purpose of rendering 2D and 3D graphics. But because it has access to the GPU (JavaScript doesn’t have direct access), it’s can be used for heavy but highly parallelizable computations such as Machine Learning [16].

If your application requires this sort of processing, WebGL can be an interesting idea.

## Conclusion

This was another of the notes on Web Development. The other notes are listed in the Related Posts section below.

In this post we covered a wide variety of tools and techniques regarding performance of client-side applications. We didn’t delve into much detail, but hopefully some of the links will serve as starting pointers for further research.

For me it was particularly interesting to learn more about Web Assembly and I’m interested in studying it further, especially with Rust.

## References

[1] Get Started With Analyzing Runtime Performance
[2] Rendering Performance
[3] Optimize JavaScript Execution
[4] User-centric performance metrics
[5] User Timing API
[6] StackOverflow: Using performance.mark() with Chrome dev tools performance tab
[7] StackOverflow: How many concurrent AJAX (XmlHttpRequest) requests are allowed in popular browsers?
[8] The cost of parsing JSON
[9] MDN: performance.now()
[10] Github w3c/hr-time: Reducing the precision of the DOMHighResTimeStamp resolution
[11] A Real-World WebAssembly Benchmark
[12] JS vs WASM
[13] WebAssembly: Use Cases
[14] WebAssembly: Portability
[15] Donald Knuth: Structured Programming with go to Statements
[16] TensorFlow.js: Machine Learning for the Web and Beyond
[17] Reduce the Scope and Complexity of Style Calculations
[18] Avoid Large, Complex Layouts and Layout Thrashing
[19] Simplify Paint Complexity and Reduce Paint Areas

## Related Posts

• Notes on JavaScript Interpreters – This post focuses on the detail of Chrome’s V8 interpreter. The relationship with performance is that understanding how JavaScript code is executed (e.g. event loop model, JIT) can give us insight on how to optimize it.
• Web workers – is a browser feature that allows running code outside of the main thread. This can speed up computation as long as no DOM access is needed.
• Notes on how browsers work – provides a basic overview on the stages of executing a web page, from downloading HTML and Javascript, executing the scripts, constructing the DOM tree and laying out the page. Understanding these steps might give insights on how to better optimize the app.
• Notes on Javascript Memory Profiling in Google Chrome – memory is related to performance because memory and CPU are the major bounded resources your app has to deal with. Not taking care of memory (e.g. memory leaks) could also exhaust the memory available for the execution engine to make the CPU-Memory tradeoffs we mentioned above.

# Observable

Observable is a web-first interactive notebook for data analysis, visualization, and exploration [1].

It was created by Mike Bostock, the creator of d3.js, which we discussed previously way back in 2014 [2]. At some point, Mike stepped down from d3.js to focus on a new library called d3.express, which he presented in 2017 during the OpenVis conference [3] and was still in the makes. d3.express eventually got renamed to Observable.

I’ve been experimenting with Observable and in this post I’ll cover some of my learnings while trying to build two visualizations.

One of the benefits of notebooks like Observable is to co-locate code with its output. This makes it much easier to explain code because in addition to markdown comments, we can modify the code and see the results in real-time. This means that a lot of the documentation of APIs available in Observable are notebooks themselves. For these reasons, this post consists of a series of links to notebooks with high-level comments.

Before we start, some assumptions I’m making are familiarity with imperative programming and the basics of JavaScript.

## What is Observable?

Why Observable – this notebook explains the motivation behind Observable and how it works at a high level:

An Observable notebook consists of reactive cells, each defining a piece of state. Rather than track dependencies between mutable values in your head, the runtime automatically extracts dependencies via static analysis and efficiently re-evaluates cells whenever things change, like a spreadsheet

Five-Minute Introduction – provides a hands-on approach of the many topics, some of which we’ll cover in more detail in this post:

• Real-time evaluation of JavaScript snippets
• Cell references
• Markdown
• HTML/DOM display
• Promises
• Generators
• Views
• Importing from other notebooks

Let’s start by delving into the details of JavaScript snippets and cell references.

## Observable vs. JavaScript

Observable’s not JavaScript – in this notebook Bostock explains the differences between JavaScript and Observable. Some notes I found interesting from that notebook:

• A cell is composed by a cell name and an expression:
• [cell name] = [expression]

It can be simple statements like 1 + 2, or multi-line statements like

• The value of  cell_name can be used in other cells, but by default is read-only.
• Each cell can only have one  cell_name  assignment. In other words, it can only “export” one variable. It’s possible to cheat by exporting an object. Note that because curly brackets are used to denote code blocks, we need to wrap an object literal with parenthesis:

• Cells can refer to other cells in any part of the code – Observable builds a dependency DAG to figure out the right order. This also means dependencies cannot be circular. How Observable Runs explains this in more details.
• Constructs like async functions (await) and generators (yield) have special behavior in a cell. We’ll expand in the Generators section below.
• Cells can be mutated if declared with a qualifier (mutable). We’ll expand in the Mutability section below.

## Mutability

Introduction to Mutable State – cells are read-only by default but Observable allows changing the value of a cell by declaring it mutable. When changing the value of the cell elsewhere, the mutable keyword also must be used.

It’s important to note that the cell is immutable, but if the cell is a reference to a value, say an Object, then the value can be mutated. This can lead to unexpected behavior, so  I created a notebook, Mutating references, to highlight this issue.

## Markdown

Markdown summary – is a great cheat sheet of Markdown syntax in Observable. I find the Markdown syntax in Observable verbose which is not ideal given text is so prevalent in notebooks:

mdHello World 
More cumbersome still is typing inline code. Because it uses backticks, It has to be escaped:

mdHello \code\

To be fair, most code will be typed as cells, so this shouldn’t be too common. It supports LaTeX via KaTeX which is awesome. KaTeX on Observable contains a bunch of examples.

## Generators

Generators are a JavaScript construct that allows a function to yield the execution back to the caller and resume from where it stopped when it’s called again. We talked about this in the Async Functions in JavaScript post.

Introduction to Generators explains how generators can be used in Observable. The combination of generators and delays (via promises) is a great mechanism to implement a ticker, which is the base of animation. Here’s a very simple way to define a ticker in Observable (remember that Promises are awaited without extra syntax):

## Views

Introduction to Views explains what views are and how to construct one from scratch. I like this recommendation:

If there is some value that you would like the user to control in your notebook, then usually the right way to represent that value is as a view.

Views are thus convenient ways to expose parameters via buttons and knobs such as text inputs, dropdowns and checkboxes.

I ran into a problem when using checkbox with labels in it, like the cell below:

It does not yield true/false as I wanted. This notebook, Checkbox, discusses the problem and offers a solution. This is an example where great but imperfect abstractions make it hard to understand when something doesn’t work. It’s worth learning about how viewof is implemented behind the scenes.

In one of my experiments I needed to synchronize a view and a ticker. Bostock provides a neat abstraction in his Synchronized Views notebook.

## Importing from other notebooks

Introduction to Imports shows how easy it is to import cells from other notebooks.

It also shows that it’s possible to override the value of some of the cells in that notebook:

This is neat, because the chart cell depends on the data, and by allowing overriding data, it allows parameterizing chart. In other words, we import chart as a function as opposed to a value (the result of chart(data)).

Versioning. When importing a notebook we always import the most recent version, which means it could potentially break if the notebook’s contract changes. The introduction above mentions the possibility of specifying the version when importing but didn’t provide an example on how to do so.

My strategy so far has been to clone a notebook if I’m really concerned about it changing, but that means one has to manually update / re-fork if they want to sync with the upstream changes.

## React Hooks

I’m used to write UI components with React and luckily Jed Fox created a notebook which I forked into this React Notebook which allowed me to use React Hooks to write a view.

## Conclusion

I really like Observable and am looking forward to add implement more data visualizations in it. I experimented with 2 animations so far: a simplistic Solar System model and a spinning Globe.

One natural question that crossed my mind is whether I should have written this whole post as an Observable notebook. In the end I decided to stick with an old-school static post for two reasons:

• Consistency: Observable only supports JavaScript, so if I’m writing a post about Python I’d need to fallback to a post, and I wouldn’t want to have a mix of both.
• Durability: Observable has a lot of potential but it’s still mostly experimental and I’m not sure I can rely on it sticking around for a long time.

## References

• Introduction
• Mutability
• Markdown
• Views
•  Importing
• Integration

### Other

[1] Observable – Why Observable?
[2] Introduction to d3.js
[3] OpenViz Conf – 2017
[4] Async Functions in JavaScript

# Async Functions in JavaScript

Async functions are a JavaScript construct introduced in the ES2017 spec. Put it simply, async functions allow writing asynchronous code using synchronous syntax.

In this post we’ll discuss async functions in JavaScript, covering some other concepts such as iterators, generators that can be used to implement async functions in cases they’re not supported by the runtime.

### A Glimpse

Before we start delving into the details, let’s get a sense on why async functions are useful, especially in the context of Promises.

In case you don’t know about Promises in JavaScript or need a refresher, I recommend checking that out first. If you’d like, we talked about Promises in a previous post which might serve as an introduction.

As we’ve learned, they’re very handy for reducing the so called “callback hell”. Here’s a contrived example prior to Promises:

We saw that with Promises we can simplify this to:

Which is much cleaner. Now, the async syntax allows an even cleaner code by making it look synchronous:

We can now proceed into some details on how async functions work. To start, let’s learn about intermediate concepts, namely iterators and generators.

### JavaScript Iterators

An iterator is basically an object that has a method next(), which returns an object with fields value and done, the latter being a boolean indicating whether there’s a next value to be iterated on. An iterator is more like a design or code pattern: it’s not explicitly supported by the JavaScript runtime in any special way.

Iterable on the other hand is a contract that an object can be used as iterator. To indicate this we add a special field containing Symbol.iterator, which maps to a function that returns this object (similar to an interface in an OOP language) – and this constructed is handled as a special case.

In the example below we create an example iterator and use it with the for-of construct:

### JavaScript Generators

Generators are a syntax sugar for iterators in what it allows us to not keep track of a “global” state (in the example above via this.cnt). It does so by allowing the function to yield the execution back to the caller and resume from where it stopped when it’s called again. Behind the scenes, it creates an object with the same structure as the iterator object we defined above, namely with a next() method. It’s much clearer with an example:

First, we indicate the function is a generator with the * modifier (i.e. function*). Here we don’t have to explicitly define the next() function and we don’t need to keep track of the variable cnt outside of the function – it will be resumed from the state it had when we called yield.

As with iterators, we can make generators iterable by implementing a contract. In this case we create an object with a special field containing Symbol.iterator which maps to the generator function:

### Async Functions <> Promises

We’re now ready to come back to async functions. We can think of async functions as syntax sugar for Promises. Suppose a function f() exists that returns a Promise. If we want to use the result of that Promise and return a new one, we could do, for example:

Instead, we could replace g() with an async function, which “understands” Promises and returns them, making it possible to easily mix with Promise code. The code above would look like:

Note how we swapped a Promise-based implementation with an async one without making any changes to the call stack that expected Promises throughout.

Handling errors. Async functions have a familiar syntax for error handling too. Suppose our function f() rejects with some probability:

If we are to replace g() with an async version, using the try/catch syntax:

### Async Functions as Generators

As of this writing most major browsers support async functions on their latest versions except Internet Explorer. For a while though, if developers wanted to use async functions they needed to rely on transpilation (i.e. translate their async-based code into browser-compatible code). One of the most popular tools for this is Babel, which transpiles code with async functions into one using generators and some helpers.

We can study that code to learn how to implement async-like functions using generators. Consider this simple example chaining two Promises using an async function.

If we translate it using Babel we get some generated code. I removed parts dealing with error handling and inlined some definitions to make it easier to read. Here’s the result:

Let’s see what is happening here. First, we note that our async function got translated into a generator, basically replacing the await with yield. Then it’s transformed somehow via the _asyncToGenerator() function.

In _asyncToGenerator() we’re basically invoking the generator recursively (via gen.next()) and at each level we chain the Promise returned by a yield call with the result of the recursion. Finally we wrap it in a Promise which is what the async function does implicitly.

Intuition. Let’s try to gain an intuition on what’s happening here on a high level. The ability of resuming execution of a function at particular points (via yield in this case) is what enables us to avoid passing callbacks every where. Part of why we need pass the callback is that we need to carry the “code” around as a callback, but by having the run time keep the code around solves this problem. For example, in a Promise world, code 1 and code 2 are wrapped in the arrow functions:

In a world where we can remember where we were when an async execution happened, we can in-line the code:

This translation relies on the existence of generators being fully supported by the runtime. In a world where generators didn’t exist as first class citizens, how could we implement them via helpers and also transpilation? We could probably use some sort of iterators and switches to simulate resuming execution at specific points in code, but this is out of the scope of this post and left as food for thought.

### Conclusion

In this post we learned about some more language features that help with code authoring and readability, namely generators and async functions. These are very useful abstractions that ends up being added to programming languages such as Python, C#, and Hack.

# Eulerian Circuits

Leonhard Euler was a Swiss mathematician in the 18th century. His paper on a problem known as the Seven Bridges of Königsberg is regarded as the first in the history in Graph Theory.

The history goes that in the city of Königsberg, in Prussia, there were seven bridges connecting different mass of lands along the Pregel river (see Figure 1). The challenge was to find a path through the city that crossed the bridges exactly once. Euler showed that no such solution existed.

Interesting unrelated fact: Today Königsberg called Kaliningrad in Russia, and Kaliningrad is actually separated from Russia geographically, lying between Lithuania and Poland.

Figure 1: Map of Königsberg and the seven bridges. Source: Wikipedia

The solution to the Seven Bridges of Königsberg problem eventually led to a branch of Mathematics known as Graph Theory. In this post we’ll be talking about the theoretical framework that can be used to solve problems like the Seven Bridges of Königsberg, which is known as Eulerian Circuits.

We’ll provide a general definition to the problem, discuss a solution and implementation, and finally present some extensions and variations to the problem.

## Definition

Let G(V, E) be a connected undirected graph, where V is the set of vertices and E the set of directed edges, and where (v, w) denotes an edge between vertices v and w. The Eulerian circuit problem consists in finding a circuit that traverses every edge of this graph exactly once or deciding no such circuit exists.

An Eulerian graph is a graph for which an Eulerian circuit exists.

## Solution

We’ll first focus on the problem of deciding whether a connected graph has an Eulerian circuit. We claim that an Eulerian circuit exists if and only if every vertex in the graph has an even number of edges.

We can see this is a necessary condition. Let v be a node with an odd number of edges. Any circuit traversing all edges will have to traverse v. Moreover, it will have to use one edge to “enter” v and one edge to “leave” v. Since this circuit can traverse each edge no more than one time, it will have to use different edges each time, meaning it needs 2 edges every time it crosses v. If there are an odd number of edges, one edge will be left unvisited.

To show this is sufficient, we can provide an algorithm that always finds an Eulerian circuit in a graph satisfying these conditions. Start from any vertex v and keep traversing edges, deleting them from the graph afterwards. We can’t get stuck on any vertex besides v, because whenever we enter an edge there must be an exit edge since every node has an even number of edges. Thus eventually we’ll come back to v, and this path form a circuit.

This circuit doesn’t necessarily cover all the edges in the graph though, nor it means that are other circuits starting from v in the remaining graph. It must be however, that some node w in the circuit we just found has another circuit starting from it. We can repeat the search for every such node and we’ll always find another sub-circuit (this is a recursive procedure, and we might find sub-sub-circuits). Note that after we remove the edges from a circuit, the resulting graph might be disconnected, but each individual component is still Eulerian.

Once we have all the circuits, we can assemble them into a single circuit by starting the circuit from v. When we encounter a node w that has a sub-circuit, we take a “detour” though that sub-circuit which will lead us back to w, and we can continue on the main circuit.

## Implementation

We’ll use the algorithm first described by Hierholzer to efficiently solve the Eulerian circuit problem, based on the proof sketched in the previous session.

The basic idea is that given a graph and a starting vertex v, we traverse edges until we find a circuit. As we’re traversing the edges, we delete them from the graph.

Once we have the circuit, we traverse it once more to look for any vertices that still have edges, which means these vertices will have sub-circuits. For each of these vertices we merge the sub-circuit into the main one. Assume the main circuit is given by a list  of vertices $(v, p_2, ... , p_k-1, w, p_k+1, ..., p_n-1, v)$ and w is a vertex with a sub-circuit. Let $(w, q_1, ..., q_m-1, w)$ be the sub-circuit starting from w. We can construct a new circuit $(v, p_2, ..., p_k-1, w, q_1, ..., q_m-1, w, p_k+1, ..., p_n-1, v)$.

Let’s look at a specific implementation using JavaScript (with Flow). The core of the algorithm implements the ideas discussed above:

The complete code is on Github.

## Analysis

We’ll now demonstrate that the algorithm described above runs in linear time of the size of the edges (i.e. O(|E|)).

Note that find_circuit() is a recursive function, but we claim that the number of times the while() loop executes across all function calls is bounded by the number of edges. The key is in the function:

graph.getNextEdgeForVertex(vertex);

graph is a convenience abstraction to an adjacency list, where for each vertex we keep a pointer to the last edge visited. Because of this,  getNextEdgeForVertex() will visit each edge of the graph at most once and we never “go back”. Since the graph object is shared across all function calls (global), we can see that the number of calls to getNextEdgeForVertex() is bounded by O(|E|), so is the number of times all while() loops execute.

Now we just need to prove that every other operation in the while loop is O(1). The only non-obvious one is:

graph.deleteEdge(edge);

This is a lazy deletion, meaning that we just set a flag in edge saying it’s deleted and it will later be taken into account by callers like graph.getNextEdgeForVertex() and graph.getDegree(). Hence, this is an O(1) operation.

For getNextEdgeForVertex(), we must skip edges that have been deleted, so we might need to iterate over a few edges before we find an undeleted one (or none if the graph is not Eulerian – in which case we terminate the algorithm). Since we’re still always processing at least one edge in every call to getNextEdgeForVertex() the argument about the total calls being bounded by O(|E|) holds.

In order for getDegree() to be an O(1) operation, we need to keep a non-lazy count of the degree of a vertex, but we can do it in O(1) when deleting an edge.

Finally, let’s analyze the second loop. The number of iterations is proportional to the length of the circuit. Since every possible circuit found (including the ones found recursively) are disjoint, the total number of times we loop over the vertices from circuits (across all function calls) is also bounded by the number of edges.

We already saw getDegree() is O(1) even with lazy deletion. The remaining operation is

path.insertAtVertex(vertex, subPath);

if we store the paths as a linked list of vertices, inserting subPath at a given node can be done in O(1) if we keep a reference from each vertex to its last (any actually) occurrence in the path.

## Directed Graphs

We can extend the definition of Eulerian graphs to directed graphs. Let G(V, A) be a strongly connected graph, where V is the set of vertices and A the set of directed edges, and where (v, w) indicate a directed edge from v to w. The Eulerian circuit problem for a directed graph consists in finding a directed circuit that traverses every edge of this graph exactly once or deciding no such circuit exists.

It’s possible to show that such a circuit exists if and only if the strongly connected directed graph has, for each vertex v, the same in-degree and out-degree. The algorithm is essentially the same.

## Counting Eulerian Circuits in directed graphs

It’s possible to count the number of different Eulerian circuits in a directed graph. According to the BEST theorem (named after de Bruijn, van Aardenne-Ehrenfest, Smith and Tutte) [3], the number of Eulerian circuits in a directed graph can be given by [4]:

$ec(G) = t_w(G) \prod_{v \in V}(deg(v) - 1)!$ (1)

Where deg(v) represents the in-degree (or out-degree) or a vertex v and t_w(G) is the number of arborescences rooted in a vertex w (simply put, an arborescence is analogous to a spanning tree for a directed graph – but we can only include edges that are directed away from the root).

It’s possible to show that t_w(G) is the same for any vertex w if G is Eulerian. We can compute t_w(G) via the Matrix-Tree theorem [2], which says t_w(G) is equal to the determinant of the Laplacian of G without vertex w. Let’s try to understand the idea behind this equation.

The mapping from an arborescence to an Eulerian path can be made by the following. Let r be the root of a possible arborescence of G. Now, let r be the reference starting point for an Eulerian path in G (note this is just for reference, since there’s no starting point in a circuit).

We say that an Eulerian path is associated with a given arborescence if for each vertex v, the last edge passing through v, say (v, v’), belongs to the arborescence. This is more clear with an example. Consider the digraph from Figure 2. Here we’ll consider the arborescences rooted in A.

Figure 2: Directed Graph

This graph has 2 possible arborescences depicted on the left in Figures 3 and 4. In Figure 3, we can see that the edge (B, D) has to be visited before (B, C) because (B, C) is in the arborescence.

Figure 3: One of the arborescences of G and a corresponding Eulerian circuit

Now, in Figure 4, because it’s (B, D) that’s in the arborescence, it has to be visited after we visit (B, C).

Figure 4: Another of the arborescence of G and a corresponding Eulerian circuit

Note that there can be more than one Eulerian path to a given arborescence. If B had more out-edges, we’d have multiple choices, since the arborescence only specifies the last edge to be taken, not the intermediate ones. More specifically, imagine B had k out-edges. Then we could traverse the first k-1 in any combination of orders, which leads to a total of (k – 1)! ways of doing so.

The same applies to all other nodes. Due to properties of Eulerian circuits, the choice of the out-edge at a given node can be seen as independent of the choice at other nodes, so the total possible Eulerian circuits corresponding to any arborescence is given by the product of the degrees from equation (1), namely:

$\prod_{v \in V}(deg(v) - 1)!$ (2)

The key property of categorizing Eulerian circuits into arborescence classes is that they’re disjoint, that is, a Eulerian circuit corresponds to exactly one arborescence. This, in conjunction with the fact that the vertices degrees in Equation (2) are from the original graph, and hence independent of a arborescence, lead us to the two independent factors in equation (1).

## Counting Eulerian Circuits in undirected graphs

Counting Eulerian circuits in undirected graphs is a much harder problem. It belongs to a complexity class known as #P-complete. This means that:

1. It belongs to the #P class, which can informally be seen as the counting version of NP problems. For example: deciding whether a given graph has an Hamiltonian circuit (path that traverses all vertices exactly once) is a problem in the NP class. Counting how many Hamiltonian circuits existing in that graph is the corresponding problem in the #P class.
2. It belongs to the #P-hard class, which means that any problem in #P can be reduced to it via a polynomial-time transformation.

Valiant proved the first condition in [5] while Brightwell and Winkler proved the second in [6] by reducing another #P-complete problem (counting Eulerian orientations) to it.

Note that a problem in the #P class is as hard as the equivalent class in NP, because we can reduce a problem in NP to #P. For example, we can decide whether a graph has an Hamiltonian circuit (NP problem) by counting the number of circuits it has (#P problem). The answer will be “yes” if it the #P version returns a number greater than 0 and “no” otherwise.

Because the problem of counting Eulerian circuits in an undirected graph being in #P, we can conclude that there’s no efficient (polynomial time) algorithm to solve it unless P = NP.

## Conclusion

In this post we covered Eulerian circuits in an informal way and provided an implementation for it in JavaScript. I spend quite some time to setup the JavaScript environment to my taste. I strongly prefer using typed JavaScript (with Flow) and using ES6 syntax. I decided to write it in JavaScript with the potential to create a step-by-step interactive tool to demonstrate how the algorithm works.

I was familiar with the concept of Eulerian circuits, but I didn’t remember the algorithms to solve it, even though I was exposed to one of them in the past. It was a good learning experience to write the code from scratch to really understand what I was doing.

This is the first time I see the #P complexity class. It’s always nice to learn about new theories when digging further on a specific topic.

## References

[1] Bioinformatics Algorithms: An Active Learning Approach – Compeau, P. and Pevzner P.
[2] Matrix-Tree Theorem for Directed Graphs – Margoliash, J.
[3] Circuits and trees in oriented linear graphs – Aardenne-Ehrenfest, van T., Bruijn, de N.G.
[4] Wikipedia – BEST Theorem
[5] The complexity of computing the permanent – L. G. Valiant
[6] Counting Eulerian circuits is #P-complete – Brightwell, G. and Winkler, P.

# Writing JavaScript using OCaml

In this post we’ll explore BuckleScript, a framework that enables developers to write JavaScript applications using OCaml syntax.

Note that BuckleScript is not a way to convert general OCaml code into JavaScript, but rather write JavaScript applications using OCaml’s syntax and type system. The end runtime and libraries are still JavaScript.

For example, if you want to work with dates, you wouldn’t look for the corresponding OCaml library, but instead include the JavaScript Date module in your OCaml code.

Let’s first compare BuckleScript with similar frameworks to understand better why it exists and when to use it. Then we’ll follow with several examples to get a better grasp of BuckleScript and how it translates to JavaScript.

It’s better to have some knowledge of OCaml, and we have written extensively about it, but it’s also easy enough to get the gist of the benefits from our simple examples.

### Comparisons

#### BuckleScript vs js_of_ocaml

js_of_ocaml is another framework that connects the OCaml and JavaScript worlds.

According to [4] both projects aim to compile OCaml code to JavaScript. The differences pointed are:

* js_of_ocaml takes low-level bytecode from OCaml compiler, BuckleScript takes the high-level rawlambda representation from OCaml compiler
* js_of_ocaml focuses more on existing OCaml ecosystem(opam) while BuckleScript’s major goal is to target npm

A simplistic way to see the differences is that BuckleScript is for JavaScript developers to start using better language features from OCaml, while js_of_ocaml is for OCaml developers to be able to run their code in the browser.

#### BuckleScript vs ReasonML

ReasonML is often mentioned together with BuckleScript, which makes it a bit confusing to understand their differences at first.

Compared do OCaml, ReasonML has a friendlier syntax (for people coming from JS) and better support for JSX (inline XML tags). The difference in syntax is significant that we are really talking about some dialect of OCaml.

Note that BuckleScript and ReasonML are complementary. BuckleScript can compile either OCaml or ReasonML to JavaScript. ReasonML in mostly about the syntax.

#### BuckleScript vs TypeScript

Typescript is a framework for adding types to a JavaScript codebase, similar to Flow.

At first glance, TypeScript and BuckleScript seem to serve different purpose, but one of the main advantages of using OCaml to write JavaScript applications is to provide type safety.

In [2], Hongbo provides a comparison between the two systems. Some of the pros and cons raised are:

TypeScript

•  Pros:
• Designed for JS, easy inter-operate with JS
• Cons:
• Compiler slow (not scalable for FB/Google scale)
• Verbose, very limited type inference
• Start up time slow (very hard to build traditional build tools)
• Types are only used for tooling – soundness is not the design goal, not very reliable
• No code optimizations

BuckleScript

• Pros:
• Compiles much faster, scales better
• Sound type system, global type inference
• Types are used in code optimization, optimizing compiler
• Outputs to JavaScript but can also generate native backend for X86, ARM
• Cons:
• Learning curve is higher compared with TypeScript

The author also suggests that OCaml is a better language than JavaScript, without providing more details.

### Setup

The easiest way to try out examples is through this BuckleScript playground.

To try it locally, we can follow [5]. It generates an initial bsconfig.json file and the compilation process can be done via npm. The process involve converting a OCaml file (.ml) to a JavaScript file (.bs.js). The latter should be included in your application, but both should be considered part of your codebase and hence committed.

### Examples

Because most people learning BuckleScript are familiar with JavaScript and less familiar with OCaml, I’ll provide the reverse examples: how to do implement a given JavaScript snippet of code in OCaml?

console.log(‘Hello World’)

How to print to stdout using BucketScript? The most basic program is simply printing a Hello World. In OCaml we would do “Print.printf” to print to stdout, but OCaml modules are not readily available for BuckleScript. We can use the Js.log() function for this purpose:

This maps to:

Note that Js log is a library provided by BuckleScript to make the integration with JavaScript more seamless.

Looping over an Array

We can use an imperative-style code in OCaml:

Note that arrays have a different syntax than in JavaScript: [| 10; 20; 30 |]. If we generate this code we get:

The interesting thing here is that we do have access to some basic OCaml libraries, for example Array.

Since we are dealing with a functional language, we might as well use some more idiomatic OCaml code. If we are to translate the code to native JavaScript structures, we have to use functions from the Js module:

Which maps to

Looping over a List

So Array is the basic structure to represent a sequence of items in JavaScript but in OCaml it’s List. What happens if we use List instead?

We simply dropped the | to use a List instead of Array and now we can use the more standard fold_left instead of reduce. This translates to:

This is very interesting! We studied functional data structures OCaml extensively in the past and the key concept is that by default data structures are persistent. If we look closely what is being passed to sum(), we see it’s a linked-list like structure: [10, [20, [30, 0]]], the 0 being used to indicate the end.

Modules

Another common pattern in JavaScript is to group a bunch of functions inside an Object. The Object name serves as namespace or a module name that can be referenced elsewhere. A natural mapping here are the OCaml modules:

which maps to:

Note that the signature is only used for compilation/transpilation checks and is erased from the final JavaScript code. Another curious thing is that the functions are exported as arrays to MyModule. To me it would make more sense to export them as Object.

Currying Functions

A nice feature from functional languages is the concept of currying, which allow us to perform partial application of functions. For example, if we have a function that adds two numbers, we can derive an increment function that partially applies sum by binding the first value to 1:

The resulting JavaScript code is:

Note that we can already perform partial application of function in vanilla JavaScript via bind(), but the syntax is not as neat:

Chaining Functions

Another neat syntax from OCaml is the chaining operator. One can chain functions via the |> operator: the result of the lefthand function is passed as argument to the righthand function.

A common use for this is when we want to apply a series of function in sequence without assigning to a variable.

For example, if we have a function to normalize a given string by converting it to lower case, trimming the leading and trailing whitespaces and also converting intermediate spaces to underscores, we could write, in JavaScript:

An alternative would be to nest the calls so we don’t have to repeat the variable, but that would hurt legibility. In OCaml, we could chain these calls:

Note the [%bs.re] tag. It is a macro that allows us to write regexes using JavaScript syntax. We can avoid repeating the module names if they are all the same:

Using JavaScript libraries

One of the main selling points of BuckleScript is that you can adopt it gradually, module by module. This is possible because we can require JavaScript modules inside our OCaml code. For example, if we were to convert the following code that reads a file asynchronously in JavaScript:

We could do:

Here, the OCaml code is more verbose but we provided a stronger contract by typing the function readFile(). The syntax for importing modules is

Note: if is the same as , the latter can be omitted.

Objects as Maps

In JavaScript Objects are often used either as maps (key-value) or records (entries of distinct fields). In OCaml we can rely on types to enforce the specific use we want via types. In the example below, we declare a map with type string to int. If we try to set a value with a different type we get a compilation error:

Objects as Records

To represent an Object as a record, we can use a OCaml record type syntax:

We added the [@bs.optional] to indicate that a field is optional. We also added the [@@bs.deriving abstract] attribute to indicate it should not be directly instantiated like

Instead, it generates a “constructor” function. In the same way, the properties of a record are not directly available. They also need to go through intermediate auto-generated accessors:

The generated JavaScript code translates to an Object:

The interesting thing is that the generated JavaScript Object is mutable, but within the OCaml code, the record cannot be modified. It’s possible to mark it mutable, but the default immutability makes it easier to reason about code.

### Downsides

The benefits being stated, there are two main potential drawbacks of using BuckleScript.

Mixed languages. Adopting BuckleScript will cause the codebase to have a mix of different languages, which makes it harder for new developers to ramp up. Of course this can be mitigated by converting the entire codebase to use OCaml.

Debugging. We’ll be writing code in on language but it’s another language that will end up being executed. If a problem happens in the underlying JavaScript code, how to figure out which OCaml code is generating the faulty code?

BuckleScript tries to solve this issue by preserving the structure of the code as much as possible so that it’s easier to understand what parts maps to what. This works well if we are using the Js wrappers that resembles the JavaScript code patterns, but it’s unclear how easy the structure is preserved if we use more of OCaml persistent data structures or functional programming patterns like currying.

One possible improvement would be to add some traceability to the generated JavaScript code such that you won’t need to look at the JavaScript code most of the time, in the same way that one doesn’t usually need to inspect Assembly code when their C++ application crashes.

### Conclusion

In this post we did a comparison of BuckleScript with different frameworks and libraries to understand why and when to use it. Following that, we studied a few basic examples which one might encounter in a day-to-day JavaScript programming and how to express that in OCaml.

Through these examples, we saw the OCaml type system in use, as well as some neat syntax and immutable data structures, which can lead to more readable, succinct and reliable code.

### References

[1] BuckleScript Examples
[2] BuckleScript: An OCaml to JavaScript compiler (Hacker News discussion)
[3] BuckleScript playground
[4] BuckleScript/bucklescript – Comparisons
[5] BuckleScript – New Project

# Monitoring System using Raspberry Pi

A few months ago, I described how to use a Raspberry Pi to setup a domestic server. In this post we’ll see how to take snapshots using the Pi and a camera, so combining it with the server, we can create a simple monitoring system accessible from a phone.

### Node.js server

First thing we have to do is to run another server, separate from nginx to handle the requests. nginx is a very basic server that can’t do much else, but it’s capable of forwarding requests/responses.

I chose Node.js because I’m familiar with JavaScript and haven’t used much Node.js, such it was a good learning opportunity.

To install it, first we need to find out which architecture our Pi’s processor is. The Raspbian has the command arch which tells returned armv6l in my case. This tells us which version to pick from the Node.js site. We can install it manually by downloading and unpacking it:

cd ~/Downloads
wget https://nodejs.org/dist/v6.9.2/node-v6.9.2-linux-armv6l.tar.xz
tar xf node-v6.9.2-linux-armv6l.tar.xz


We can put it in a local directory (no need for sudo). We move it to a convenient place:

mv ~/Downloads/node-v6.9.2-linux-armv6l workspace/node


and then make sure that path is being looked for when running executables:

export PATH="$HOME/workspace/node/bin:$PATH"

Now we need to tell nginx to forward requests to another IP address, which our Node.js server will be listening to, by adding this to /etc/nginx/sites-available/default:

Finally we create a very simple Node.js application that listens to localhost (127.0.0.1), port 3000, and servers an HTML file as response:

### Camera

I got the Raspberry PI 5MP Camera Board Module to take the still photos. It’s an official product from Raspberry pi, so it comes with some nice integrations.

In particular, I was interested in the raspistill utility.

It has a few parameters and image transformation options, and we can specify an output file to which an image will be saved in JPEG format.

raspistill -w 600 -h 400 -rot 180 -o images/image.jpg

### Putting things together

The basic idea of this system is that whenever we hit our webpage, it will take a snapshot using the camera and display the image.

Because it takes some time for a photo to be taken, we do it asynchronously. In the meantime it shows the latest snapshot it had on file.

To avoid overloading the Pi we debounce snapshot commands and only take a new snapshot after at least a minute since the previous one. I wrote a bash script to handle that:

Now we just need out Node.js app to call this script asynchronously:

### Conclusion

This simple project was inspired by a friend, who used a Raspberry Pi + camera to monitor his apartment during his vacation. I copied the idea with the intention to learn about the process and put the Pi I bought a couple of years back to use.

I haven’t actively used this system, but it was fun to work on it and I’m now looking forward on working on other “home” projects.

# Notes on JavaScript Interpreters

In a previous post, Notes on how browsers work, we studied the high-level architecture of a browser, specifically the Rendering Engine. We used the following diagram,

In this post we’ll focus on the JavaScript interpreter part. Different browsers use different implementations of the interpreters. Firefox uses SpiderMonkey, Chrome uses V8, Safari uses JavaScriptCore, Microsoft Edge uses Chakra, to name a few. V8 is also used as a standalone interpreter, most notably by Node.js.

These interpreters usually comply to one of the versions of the ECMAScript, which is a standardization effort of the JavaScript language. ECMA-262 is the document with the specification. As it happens with other languages, from their first inception, design flaws are identified, new development needs arise, so the language spec is always evolving. For that reason, there are a few versions of ECMAScript. Most browsers support the 5th edition, also known as ES5.

There’s already the 7th edition (as of 2016), but it takes time for browsers to catch up. Because of that, JavaScript programs that are capable of translating newer specs into ES5 were created, such as Babel. The advantage of using this technique, also known as transpiling, is that developers can use newer versions of the step, such as ES6, without depending on browser support. Disadvantages include the extra complexity by adding a new step in the deployment process and it makes harder to debug since it’s hard to map errors that happen in the transformed code back to the original source.

Section 8.4 of the ECMA-262 describes the execution model of JavaScript:

A Job is an abstract operation that initiates an ECMAScript computation when no other ECMAScript computation is currently in progress. A Job abstract operation may be defined to accept an arbitrary set of job parameters.

Execution of a Job can be initiated only when there is no running execution context and the execution context stack is empty. A PendingJob is a request for the future execution of a Job. A PendingJob is an internal Record whose fields are specified in Table 25. Once execution of a Job is initiated, the Job always executes to completion. No other Job may be initiated until the currently running Job completes. However, the currently running Job or external events may cause the enqueuing of additional PendingJobs that may be initiated sometime after completion of the currently running Job.

MDN’s Concurrency model and Event Loop [2] describes this spec in a more friendly way. As in other programming environments such as C and Java, we have two types of memory available: the heap and the stack. The heap is the general purpose memory and the stack is where we keep track of scopes for example when doing function calls.

In JavaScript we also have the message queue, and for each message in the queue there is an associated function to be executed.

JavaScript Execution Model

### The event loop

The execution model is also called the event loop because of the high-level working of this model:

When the stack is empty, the runtime processes the first message from the queue. While executing the corresponding function, it adds an initial scope to the stack. The function is guaranteed to execute “atomically”, that is, until it returns. The execution might cause new messages to be enqueued, for example if we call

The callback passed as the first argument to setTimeout() will be enqueued, not stacked. On the other hand, if we have a recursive function, such as a factorial:

The recursive calls should all go to the same stack, so they will be executed to completion, that is until the stack is empty. The calls provided to setTimeout() will get enqueued and be executed only after the value of the factorial gets printed.

In fact, using setTimeout() is a common trick to control the execution order of functions. For example, say that function A called B and B calls another function C. If B calls C normally, then C will have to finish before B returns, as the sample code below:

In case we want to finish executing A first, then B can call C using setTimeout(C, 0), because then it will be enqueued, and then both A and B will finish until a new message is processed from the queue, as the code below:

### Web Workers

We discussed Web Workers in a previous post, in which we mentioned that it’s a separate thread that shares no memory with the main thread. In fact, according to [2], it has its own stack, heap and queue. It doesn’t violate the execution model of the main thread because communications must be done via a publisher/subscriber API, which means communicating between two threads is subject to the queueing.

### V8

Beyond the spec, each JavaScript engine is free to implement feature the way they want. V8’s architecture is described in their wiki page [3], so we can study some of its key components.

#### Fast Property Access

In JavaScript, structures like Object can be mutated (added and removed properties) in runtime. Implementing them as hashtables can lead to performance problems when accessing properties of these structures. Compare that to compiled languages like Java in which instances of a class can be allocated with all its members in a single chunk of memory and accessing properties of objects consists in adding an offset to the object’s pointer.

V8 optimizes the Object allocation by creating hidden classes. It makes use of the fact that properties are mutated in the same pattern. For example in

In this case, we always insert the property x and then y. V8 starts with an empty class C0 when the object is first created. When x is assigned, it creates another class C1 with property x and that points to C0. When y is assigned, it creates yet another class C2 with property y that point to C1.

When the Point constructor is finished, it will be an instance of C2. It has to instantiate 3 objects one from C0, then C1, then C2.

Accessing property y is an adding an offset operation, while accessing property x is two offset operations, but still fast than a table lookup. Other instances of Point will share the same class C2. If for some reason we have a point with only x set, then it will be an instance of class C1. This sharing of structures resembles the persistent data structures that we studied previously.

Another interesting property of this method is that it doesn’t need to know in advance whether the code has indeed a pattern in mutating objects. It’s sort of a JIT compilation.

#### Dynamic Machine Code Generation

According to [3]:

V8 compiles JavaScript source code directly into machine code when it is first executed. There are no intermediate byte codes, no interpreter.

#### Efficient Garbage Collection

We wrote about V8’s memory management in a previous post.

### Conclusion

In this post we revisited a bit of the history of JavaScript and ECMAScript. We also studied the event loop model that is part of the spec and finally saw some aspects of one of the JavaScript engines, V8.

### References

• [1] ECMAScript 2016 – Language Specification
• [2] MDN: Concurrency model and Event Loop
• [3] V8 – Design Elements

# Web Workers

In this post we study Web Workers, a technology that allows JavaScript code to run in separate threads. We’ll start by exploring the API with some toy examples and at the end discuss some applications.

### Introduction

By default JavaScript runs in a single (the main thread), which can be a problem for user experience if expensive operations need to be performed in code which would affect responsiveness of the UI.

The thread that runs the web worker code has some environment limitations, which includes no access to the DOM or global objects such as window. [3] contains a list of functions and methods available to a Web Worker.

Besides that, memory is not shared between the threads, having to be explicitly serialized (so it can be cloned) and passed via a method (postMessage()). This can lead to performance issues if the amount of data to be copied is large. In Transferable Objects we’ll discuss an alternative.

Workers might spawn their own workers.

For some reason, the term “web workers” reminds me of these creatures from Spirited Away :)

Let’s work a simple example. Imagine we have two files, main.js and worker.js (in the same directory):

main.js:

// Initializes the worker with the
// JavaScript code in worker.js
var myWorker = new Worker("worker.js");

// Post a message to the worker
myWorker.postMessage("Hello worker!");

// Callback that gets called when a
// message is received from the worker.
myWorker.onmessage = function(/*MessageEvent*/ message) {
console.log(
'Message received from worker script: ',
message.data
);
}


worker.js

onmessage = function(/*MessageEvent*/ message) {
console.log(
'Message received from main script: ',
message.data
);
postMessage("Hello main!");
}


### Transferable Objects

By default data is copied when sending information back and forth between the main thread and the worker, using a process called structural cloning.

The serialization/deserialization can be expensive, for which case there is an alternative: transferrable objects. More specifically, we can work with ArrayBuffers, which can be “transferred” instead of cloned, which is a more performant operation, more precisely, O(1).

We’ll first cover ArrayBuffers and then see how to apply it in a context of Web Workers.

#### ArrayBuffer

According to [5]:

The ArrayBuffer is a data type that is used to represent a generic, fixed-length binary data buffer. You can’t directly manipulate the contents of an ArrayBuffer; instead, you create a typed array view or a DataView which represents the buffer in a specific format, and use that to read and write the contents of the buffer.

ArrayBuffer basically represents an unstructured array of bits, which, to have any meaning/interpretation, needs a view, which can be an array of 32-bit unsigned ints or 16-bit unsigned ints. In the example below we create an array buffer of 100 bytes.


// Number of bytes or number of elements
var buffer = new ArrayBuffer(100);

// A 32-bit unsigned int array of length 10 (i.e. 40 bytes), starting
// from byte 0 at the array buffer
var int32View = new Uint32Array(buffer, 0, 10);
// A 16-bit unsigned int array of length 20 (i.e. 40 bytes), starting
// from byte 0 at the array buffer
var int16View = new Uint16Array(buffer, 0, 20);

// Fill in the 16-bit array with 0, 1, 2...
for (var i = 0; i < int16View.length; i++) {
int16View[i] = i;
}

// The memory is shared because we're reading from the same chunk of
// the byte array.
for (var i = 0; i < int32View.length; i++) {
console.log("Entry " + i + ": " + int32View[i]);
}


This is a very interesting model. In ArrayBuffers one explicitly work with the serialized form of the data and create views on top of them. I’m used to work with the views-first, that is, create a class representing some data and eventually add serialization/deserialization methods. One advantage of working with serialized data is that we don’t need to write the serialization methods, only the deserialization. The major disadvantage is that you need to know upfront how much memory you’ll have to use.

#### Transferrable objects

We can extend the example above to be used between a worker and the main thread.

worker.js:

var buffer = new ArrayBuffer(100);
var int16View = new Uint16Array(buffer, 0, 20);

for (var i = 0; i < int16View.length; i++) {
int16View[i] = i;
}

console.log('array buffer size', buffer.byteLength); // 100
postMessage(buffer, [buffer]);
console.log('array buffer size?', buffer.byteLength); // 0


and in the main.js:

...
myWorker.onmessage = function(e) {
buffer = e.data;
// Number of bytes or number of elements
var int32View = new Uint32Array(buffer, 0, 10);

for (var i = 0; i < int32View.length; i++) {
console.log("Entry " + i + ": " + int32View[i]);
}
}


By logging the output to the console, we can see the main thread received the values written to the array buffer by the worker and after the worker transferred the buffer data, it was emptied out.

Note in the postMessage() API, we provide buffer as a the first parameter and then it also appears in the list indicating it should be transferred, not copied. Having to pass it twice is a bit confusing in my opinion, but this is to allow the example below, in which the objects transferred are nested inside another structure (in this case an object) and we want to transfer both buffer1 and buffer2 but not the top-level object. I’m not sure which use case the API designers had in mind, though.

postMessage(
{'key1': buffer1, 'key2': buffer2},
[buffer1, buffer2]
);


### Error Handling

If any errors are uncaught by the worker, it can be caught from the main thread through the onerror callback:

myWorker.onerror = function(e) {
console.log('an error occurred:', e);
e.preventDefault();
}


Where e is an instance of ErrorEvent. We can simulate an error on the worker.js code:

throw new Error("Some error occurred");


### Terminating

The main thread can terminate the worker

worker.terminate();


or the worker can terminate itself:

close();


### Applications

A lot of the examples using Web Workers involve doing some fake expensive calculation in the worker thread, but I haven’t found any real-world application.

StackOverflow offers some ideas, including one that is dimissed as bad uses of Web Workers (polling) or from projects that are long defunct (Mozilla Skywriter). The main issue is that most of time heavy processing is done on the server.

One idea that came to mind is to use web-workers in React. React defers a lot of DOM work to the end by working with the concept of a virtual DOM. Web-workers don’t have access to the DOM but they do have it for virtual DOMs. Turns out this idea has been explored already [7, 8] but there were some technical difficulties in implementing events.

### Conclusion

In this post we studied Web Workers and some examples utilizing it. I learned a few other related topics like ArrayBuffers, and Compositor Workers. I was a bit disappointed with the lack of compelling applications using Web Workers. I’ll try it out in some of my projects and see if I can any benefits from it.

### References

Some of the code presented in this post is available on Github.

[1] MDN – Using Web Workers
[2] HTML5 Rocks – The Basics of Web Workers
[3] MDN – Functions and classes available to Web Workers
[4] Google Developers – Transferable Objects: Lightning Fast!
[5] MDN – JavaScript typed arrays
[6] StackOverflow – What are the use-cases for Web Workers?
[7] React Custom Renderer using Web Workers
[8] GibHub React Issues: Webworkers #3092
[9] Compositor Workers Proposal

# Developing a project in Javascript

I’ve worked with several small Javascript side-projects. The amount of Javascript libraries and frameworks is overwhelming, especially in recent times.

In the past, I would write a couple of stand-alone Javascript files from scratch. As applications get bigger and more complex, new libraries for improving project development have been created.

I decided to look around for best practices to develop open source JavaScript applications these days. This post is a set of notes describing my findings.

We’ll discuss libraries that solves different needs for software projects including libraries, modularization, automated building, linter and finally testing frameworks.

### Packages/Libraries

Javascript doesn’t have an official package management. There has been an effort to standartize how Javascript libraries are distributed. With Node.js, came its package manager that npm (node package manager), that was initially indented for Node.js packages, but can also be used for general libraries, independent of Node.js itself.

To work with npm, we need to write a configuration file, called package.json. In this file, which is a JSON, we can define metadata when building a library, including title, version and the dependencies of other libraries. A sample configuration looks like this:

{
"name": "my_project",
"version": "0.0.1",
"description": "Description of my project",
"devDependencies": {
"browserify": "~5.11.0",
"uglifyify": "^2.5.0"
}
}


#### Dependencies

In the dependencies, we have to specify the versions. A version (more specifically semantic version or semver) consists of three parts numbers separated by '.'. The last number should be bumped on small changes, like bug fixes, without change on functionality. The middle number, aka minor version, should be bumped whenever new features are added, but that are back-compatible. Finally, the first number, aka major version, should be bumped whenever back-incompatible changes are made [1].

In package.json, you can specify a hard-coded version number or be more relaxed. If we use the '~' in front of the version, for example ~5.11.0, it means we accept the most recent version of form 5.11.x. On the other hand, if we use the '^', for example ^2.5.0, we accept the most recent version of the form 2.x.x.

The dependencies of a package can be either production or development dependencies. In our case, browserify and uglify are only used for building our package and not a dependency our code has, so it doesn’t make sense to ship those to the user of the library.

To parse the configuration in package.json, we can run:

npm install --save-dev


This will download the dependencies listed under devDependencies locally in the directory node_modules (created in the same directory the package.json is). To run the production dependencies, we can do:

npm install --save


### Modules

Modules are useful for splitting code in related units and enables reuse. JavaScript doesn’t have a native module system, so some libraries were built to address the modularization problem. There are three main types of module systems around: AMD (Asynchronous Module Definition), CommonJS and the ES6 loader. Addy Osmani discusses the differences between those in [2].

There are several implementations for modules, including RequireJS (AMD), browserify (uses the node.js module system, which uses CommonJS). SystemJS is able to work with all these different types.

I had been working with browserify, but it seems better to adopt the ES6 standards, so I’ve switched to SystemJS. Another advantage of SystemJS is that is also allows ES6 syntax by transpiling the code using BabelJS.

To use SystemJS we need to define a configuration file (analogous to package.json), named config.js (don’t worry about it for now).

#### Exporting

Named exports. We can have multiple export statements within a single file or provide all exports within a single statement [3]. Example:

/**
* my_module.js
*/
function foo() {
console.log('foo');
}
function bar() {
console.log('bar');
}
// Nested export
export {
foo,
// baz is what will be available externally
bar as baz,
};
// Flat, inline export
export function derp() {
console.log('derp');
}


Default exports. We can export default items in a module (the reason will be clear when we talk about importing next). We show the syntax for both the inline and the named exports:

// Nested
export {
foo as default,
};
// Inline export
export default function() {
console.log('derp');
}


#### Importing

We have 3 basic ways to import from a module.

1. Name all items we want to pick from the module.

import {foo, baz as 'bar'} from 'my_module';


2. Do not provide any specific item, in which case we’ll import the default export:

import the_default_export from 'my_module';
// Equivalent to
import {default as 'the_default_export'} from 'my_module'


3. Import all item from the module under a ‘namespace’, basically

import * as everything from 'my_module'
// 'everything' is an object
// {
//    foo,
//    baz,
//    default,
//    derp
// }


#### NPM Packages

To be able to import NPM packages, we have to download them first and for that we can use the jspm.io tool. For example, I was interested in the point-in-polygon package. Instead of running the npm command, we can use jspm:

// Download jspm
npm install --global jspm
// Install the desired npm package
jspm install npm:point-in-polygon


Running jspm will write to the config.js file (it creates one if it doesn’t exist). This will write a map from where the module got installed and the name you can use in code to import it. Since npm packages use the CommonJS syntax and SystemJS understands it, in code we can simply do:

import {default as pointInsidePolygon} from 'point-in-polygon';


### Building

The process of running commands like SystemJS can be automated. One idea is writing Makefiles to run command line. Another option is to use JavaScript frameworka, such as Grunt and Gulp. In this post we’ll stick to Grunt.

To configure a build, we need to provide another configuration file, called Gruntfile.js (should live in the same directory as the package.json). You provide an object to grunt.initConfig(), which contains tasks configurations.

module.exports = function(grunt) {

grunt.initConfig({
systemjs: {
options: {
'configFile': './config.js'
},
dist: {
'src': '<root JS file>',
'dest': '<compiled file with all dependencies together>'
}
},
});

};

With grunt.registerTask('default', ['systemjs']) we’re telling grunt to run the systemjs task whenever we run grunt from the command line.

It’s possible to run grunt automatically upon changes to JS files via the watch task. First, we need to install the plugin:

npm install grunt-contrib-watch --save-dev


Then we configure it in Gruntfile.js:

grunt.initConfig({
...
watch: {
browser_js: {
files: ['**/*.js', '!node_modules/**/*', '!dist/**/*'],
}
},
});
...


Here taskList is an array of task names. It can be the same one provided to the default task. Make sure to blacklist some directories like dist, which is the output directory of the systemjs task (otherwise we’ll get an infinite loop). Finally we run:

grunt watch


Now, whenever we perform a change to any JS file it will run the task.

### Minification

Since Javascript code is interpreted on the client (browser), the source code must be downloaded from the server. Having a large source code is not efficient from a network perspective, so often these libraries are available as a minified file (often with extension min.js to differentiate from the unminified version).

The source code can be compressed by removing extra spaces, renaming variables, etc, without changing the program. One popular tool to achieve this is UglifyJS.

To use it with Grunt, we can install the grunt-contrib-uglify module:

npm install grunt-contrib-uglify --save-dev


And in our Gruntfile.js:

grunt.initConfig({
...
uglify: {
compact: {
files: {
'./dist/<project>.min.js': ['./dist/<project>.js']
}
}
},
...
}


### Linting

Lint tools help us avoiding bugs, sticking to code conventions and improving code quality. One popular tool for linting is jshint. Other alternatives include jslint. JSHint has a Grunt plugin:

grunt.initConfig({
...
jshint: {
files: [
'**/*.js',
'!node_modules/**/*',
'!jspm_packages/**/*',
'!dist/**/*'
],
options: {
'esnext': true,
}
},
...
}


The basic configuration here makes sure to blacklist “production” directories like node_module and dist. Also, since we’ve been adopting ES6, we can set the esnext flag to tell jshint to account for the new syntax.

We probably don’t want to run the lint every time we update the JS file. We can run it less often, for example before sending code for review. Thus, we can create a separate registry for it using grunt.registerTask('lint', ['jshint']). We can now run jshint via the command line:

grunt lint


### Testing

Another practice to avoid bugs is testing, including unit tests. Again, there are several libraries and frameworks that makes the job of unit testing less painful, for example easy ways to mock dependencies so we can test isolated functionality. In this case, I’ve picked Jest, which has a grunt task available in npm, which we can install via:

npm install grunt-jest --save-dev


(NOTE: this will also install the jest-cli binary which depends on a Node.js version >= 4, so you might need to update your Node.js).

We can configure the grunt task with default configs in the following way:

grunt.initConfig({
...
jest: {
},
...
}


With this setup we can run the following command to run jest tests:

grunt jest


Unfortunately, jest uses the CommonJS require syntax. It used to be possible to use babel-jest but after version 5.0 this setup doesn’t work anymore.

### Conclusion

The JavaScript environment changes extremely fast and it’s very hard to keep on top of the latest frameworks/practices, etc.

To make things worse, for every task like module system, linting, testing, there are many alternatives and none of them is a clear best choice.

I’m happy that there’s an effort of standardization with ES6. I think the more we stick to one convention the more we reduce re-inventing the wheel, the less syntax differences to learn, etc.

### References

[1] Semantic versioning and npm
[2] Writing Modular JavaScript With AMD, CommonJS & ES Harmony
[3] ECMAScript 6 modules: the final syntax

Generation Javascript. Manuel Bernhardt discusses the current state of JavaScript libraries and how the low friction nature of developing in JavaScript has its downsides.

Essential JavaScript Links. Eric Elliott’s list of links to several books, articles and tools concerning to JavaScript. It provides a much more comprehensive list of options for the different topics we covered in this post.

# Notes on how browsers work

In this post, I’d like to share some notes from articles about how modern browser works, in particular the rendering process.

All the aforementioned articles are based on open source browser code, mainly Firefox and Chrome. Chrome uses a render engine called Blink, which is a fork to Webkit used by other browsers like Safari. It seems to be the one with more documentation, so we’ll focus on this particular engine. The main sources are [1, 2 and 3]. They cover this subject in much more depth, so I recommend the read for those interested in more details.

### The Big Picture

First, let’s take a look in an high-level architecture of a browser:

As we said before, our focus will be on the Rendering Engine. According to [1], it consists of the following steps:

and we’ll use these steps as our sections to come.

### 1. Parsing HTML to construct the DOM tree

This step consists in converting a markup language in text format into a DOM tree. An interesting observation is that HTML is not a Context Free Grammar because of its forgiving nature, meaning that parsers should accept mal-formed HTML as valid.

The DOM tree is a tree containing DOM (Document Object Model) elements. Each element corresponds to a tag from the HTML markup.

As we parse the HTML text, we might encounter the tags that specify two common resources that enhance basic HTML: CSS and JavaScript. Let’s cover those briefly:

#### Parsing CSS

CSS is a Context-free Grammar, so Webkit is able to rely on tools like Flex (lexical analysis generator) and Bison (parser generator) to parse the CSS file. The engine uses hashmaps to store the rules, so it can perform quick lookups.

#### Parsing Javascript

When the parser encounters a script tag, it starts downloading (if it’s an external resource) and parsing the JavaScript code. According to specs, downloading and parsing occurs synchronously, blocking the parsing process of the HTML markup.

The reason is that executing the script might trigger the HTML body to be modified (e.g. via document.write()). If the JavaScript doesn’t modify the HTML markup, Steve Souders suggests moving the script tags to the bottom of the page or adding the defer attribute to the script tag [4]. He has two test pages to highlight the load times for these distinct approaches: bottom vs. top.

In practice, according to Garsiel [1], browsers will do speculative parsing, trying to download script files in parallel to the main HTML parsing. This process does not start though until all stylesheet files are processed.

### 2. Render tree construction

While constructing the DOM tree, the browser also builds another tree called render tree. Each node in this tree, called render object, represents a rectangular area on the screen.

There’s not necessarily a 1:1 correspondence between DOM nodes and render nodes. For example a select tag has multiple render nodes, whereas hidden DOM elements (with the CSS property display set to none) do not have a corresponding render node.

Since each node represents a rectangle, it needs to know its offset (top, left) and dimensions (height, width). These values depend on several factors, including the CSS properties like display, position, margin and padding and also the order in which they appear in the HTML document.

The process of filling out these parameters is called the layout or reflow. In the next section we’ll describe this process in more details.

### 3. Layout of the render tree

Rectangle size. For each node, the size of the rectangle is constructed as follows:

* The element’s width is whatever value is specified in the CSS or 100% of the parent’s width
* To compute the height, it first has to analyse the height of its children, and it will have the height necessary to enclose them, or whatever value is specified in the CSS.

A couple of notes here: the height is calculated top-down, whereas the width is calculated bottom-up. When computing the height, the parent only looks at the immediate children, not descendants. For example, if we have

<div style='background-color: green; width: 400px'>
A
<div style='background-color: red; width: 500px; height: 100px'>
B
<div style='background-color: blue; height: 150px'>
C
</div>
</div>
</div>


The green box (A) will have the height enough to contain the red box (B), even though the blue box (C) takes more space than that. That’s because B has a fixed height and C is overflowing it. If we add the property overflow: hidden to B, we’ll see that box A is able to accommodate B and C.

Some properties may modify this default behavior, for example, if box C is set to position absolute or fixed, it’s not considered in the computation of B’s height.

Rectangle offset. To calculate the offset, processes the nodes of the tree in order. Based on the elements that were already processed, it can determine its position depending on the type of positoning and display the element has. If it’s display:block, like a div with default properties, it’s moved to the next and the left offset is based on the parent. If it’s display is set to inline, it tries to render in the same line after the last element, as long as it fits within the parent container.

Some other properties besides display can also change how the position is calculated, the main ones being position and float. If position is set to absolute and the top is defined, the offset will be relative to the first ancestral of that component with position set to relative. The same works for the property left.

Whenever a CSS changes happens or the DOM structure is modified, it might require a new layout. The engines try to avoid the re-layout by only processing the affected subtree, if possible.

### 4. Painting the render tree

This last step is also the most complex and computationally expensive. It requires several optimizations and relies on the GPU when possible. In this step, we have two new conceptual trees, the render layer tree and the graphics layer tree. The relationship of the nodes in each tree is basically:

DOM Element > Render Object > Render Layer> Graphics Layer.

Render layers. exist so that the elements of the page are composited in the correct order to properly display overlapping content, semi-transparent elements, etc. A render layer contains one or more render object .

Graphics layers. uses the GPU for painting its content. One can visualize layers by turning on the “Show composited layer borders” in Chrome DevTools (it’s under the Rendering Tab, which is only made visible by clicking on the drawer icon >_). By default, everything is rendered in a single layer, but things like 3D CSS transforms trigger the creation of new layers. Wiltzius [2] provides a sample page where one can visualize an extra layer:

http://www.html5rocks.com/en/tutorials/speed/layers/onelayer.html

A render layer either has its own layer, or inherits one from its parent. A render layer with its own layer is called compositing layer.

Rendering process. occurs in two phases: painting, which consists of filling the contents of a graphics layer and compositing (or drawing) which consists in combining graphics layers into a single image to display in the screen.

### Conclusion

I was initially planning to study general JavaScript performance profiling. In researching articles on the internet, I’ve found a number of them are related to making websites more responsive by understanding and optimizing the browser rendering process. I’ve realized there was a lot to be learned about this process and I could benefit from studying this subject.

A lot of the articles come from different sources, but a few authors seem to always been involved in them. Paul Irish and Paul Lewis are two of the folks who I’ve seen in several articles (see Additional Resources), and they have a strong presence online and might be worth following them if you’re interested in the subject.

### References

[1] HTML5 – How Browsers Work: Behind the scenes of modern web browsers
[2] HTML5 Rocks – Accelerated Rendering in Chrome
[3] Chromium Design Documents – GPU Accelerated Compositing in Chrome
[4] High Performance Web Sites: Essential Knowledge for Front-End Engineers

Chrome Developer Tools (for rendering):