Notes on JavaScript Interpreters

In a previous post, Notes on how browsers work, we studied the high-level architecture of a browser, specifically the Rendering Engine.

In this post we’ll focus on the JavaScript interpreter part. Different browsers use different interpreter implementations: Firefox uses SpiderMonkey, Chrome uses V8, Safari uses JavaScriptCore and Microsoft Edge uses Chakra, to name a few. V8 is also used as a standalone interpreter, most notably by Node.js.

These interpreters usually comply with one of the versions of ECMAScript, which is a standardization effort of the JavaScript language. ECMA-262 is the document with the specification. As happens with other languages, after their first inception design flaws are identified and new development needs arise, so the language spec is always evolving. For that reason, there are several versions of ECMAScript. Most browsers support the 5th edition, also known as ES5.

There’s already the 7th edition (as of 2016), but it takes time for browsers to catch up. Because of that, JavaScript programs capable of translating newer specs into ES5 were created, such as Babel. The advantage of this technique, also known as transpiling, is that developers can use newer versions of the spec, such as ES6, without depending on browser support. Disadvantages include the extra complexity of a new step in the deployment process, and harder debugging, since it’s hard to map errors that happen in the transformed code back to the original source.

Section 8.4 of the ECMA-262 describes the execution model of JavaScript:

A Job is an abstract operation that initiates an ECMAScript computation when no other ECMAScript computation is currently in progress. A Job abstract operation may be defined to accept an arbitrary set of job parameters.

Execution of a Job can be initiated only when there is no running execution context and the execution context stack is empty. A PendingJob is a request for the future execution of a Job. A PendingJob is an internal Record whose fields are specified in Table 25. Once execution of a Job is initiated, the Job always executes to completion. No other Job may be initiated until the currently running Job completes. However, the currently running Job or external events may cause the enqueuing of additional PendingJobs that may be initiated sometime after completion of the currently running Job.

MDN’s Concurrency model and Event Loop [2] describes this spec in a friendlier way. As in other programming environments such as C and Java, we have two types of memory available: the heap and the stack. The heap is the general purpose memory, and the stack is where we keep track of scopes, for example when doing function calls.

In JavaScript we also have the message queue, and for each message in the queue there is an associated function to be executed.

JavaScript Execution Model

The event loop

The execution model is also called the event loop because of the high-level working of this model:

When the stack is empty, the runtime processes the first message from the queue. While executing the corresponding function, it adds an initial scope to the stack. The function is guaranteed to execute “atomically”, that is, until it returns. The execution might cause new messages to be enqueued, for example if we call setTimeout(), sketched below.
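
A minimal sketch of such a call (the message and delay are illustrative):

setTimeout(function() {
  // Enqueued as a new message; runs only after the current
  // stack unwinds, never before.
  console.log("callback");
}, 1000);
console.log("printed before the callback");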

The callback passed as the first argument to setTimeout() will be enqueued, not stacked. On the other hand, if we have a recursive function, such as a factorial, sketched below:
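
A minimal sketch (the 0ms timeout is illustrative):

function factorial(n) {
  // Each recursive call pushes a new frame onto the same stack
  return n <= 1 ? 1 : n * factorial(n - 1);
}

setTimeout(function() {
  console.log("enqueued callback");
}, 0);

// Printed first: the stack must empty before the message
// queue is processed, even with a 0ms timeout
console.log(factorial(10));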

The recursive calls all go to the same stack, so they will be executed to completion, that is, until the stack is empty. The callbacks provided to setTimeout() will get enqueued and be executed only after the value of the factorial gets printed.

In fact, using setTimeout() is a common trick to control the execution order of functions. For example, say that function A calls B and B calls another function C. If B calls C normally, then C will have to finish before B returns, as in the sample code below.
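
A sketch with hypothetical functions A, B and C:

function C() { console.log("C"); }
function B() {
  C(); // C must return before B continues
  console.log("B");
}
function A() {
  B();
  console.log("A");
}
A(); // prints C, B, A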

In case we want to finish executing A first, B can call C using setTimeout(C, 0): C will then be enqueued, and both A and B will finish before a new message is processed from the queue, as in the code below.
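
The same sketch, with B deferring C via setTimeout():

function C() { console.log("C"); }
function B() {
  setTimeout(C, 0); // C is enqueued, not called right away
  console.log("B");
}
function A() {
  B();
  console.log("A");
}
A(); // prints B, A and only then C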

Web Workers

We discussed Web Workers in a previous post, in which we mentioned that a worker is a separate thread that shares no memory with the main thread. In fact, according to [2], it has its own stack, heap and queue. It doesn’t violate the execution model of the main thread because communication must be done via a publisher/subscriber API, which means messages between the two threads are subject to the queueing.

V8

Beyond the spec, each JavaScript engine is free to implement features the way it wants. V8’s architecture is described in its wiki page [3], so we can study some of its key components.

Fast Property Access

In JavaScript, structures like Object can be mutated (properties added and removed) at runtime. Implementing them as hash tables can lead to performance problems when accessing their properties. Compare that to compiled languages like Java, in which instances of a class can be allocated with all their members in a single chunk of memory, and accessing a property consists of adding an offset to the object’s pointer.

V8 optimizes Object allocation by creating hidden classes. It makes use of the fact that properties are often mutated in the same pattern, for example in a constructor like the one sketched below.
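
A sketch based on the Point example from the V8 docs [3]:

function Point(x, y) {
  this.x = x;
  this.y = y;
}

var p1 = new Point(1, 2);
var p2 = new Point(3, 4); // same insertion pattern: x, then y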

In this case, we always insert the property x and then y. V8 starts with an empty hidden class C0 when the object is first created. When x is assigned, it creates another class C1 with property x that points to C0. When y is assigned, it creates yet another class C2 with property y that points to C1.

When the Point constructor finishes, the object is an instance of C2, having transitioned through three hidden classes: first C0, then C1, then C2.

Accessing property y is a single add-offset operation, while accessing property x takes two offset operations, but both are still faster than a table lookup. Other instances of Point will share the same class C2. If for some reason we have a point with only x set, it will be an instance of class C1. This sharing of structures resembles the persistent data structures we studied previously.

Another interesting property of this method is that it doesn’t need to know in advance whether the code has indeed a pattern in mutating objects. It’s sort of a JIT compilation.

Dynamic Machine Code Generation

According to [3]:

V8 compiles JavaScript source code directly into machine code when it is first executed. There are no intermediate byte codes, no interpreter.

Efficient Garbage Collection

We wrote about V8’s memory management in a previous post.

Conclusion

In this post we revisited a bit of the history of JavaScript and ECMAScript. We also studied the event loop model that is part of the spec and finally saw some aspects of one of the JavaScript engines, V8.

References

  • [1] ECMAScript 2016 – Language Specification
  • [2] MDN: Concurrency model and Event Loop
  • [3] V8 – Design Elements

Web Workers

In this post we study Web Workers, a technology that allows JavaScript code to run in separate threads. We’ll start by exploring the API with some toy examples and at the end discuss some applications.

Introduction

By default JavaScript runs in a single thread (the main thread), which can be a problem for user experience: expensive operations performed in code affect the responsiveness of the UI.

The thread that runs the web worker code has some environment limitations, which includes no access to the DOM or global objects such as window. [3] contains a list of functions and methods available to a Web Worker.

Besides that, memory is not shared between the threads: data has to be explicitly serialized (so it can be cloned) and passed via a method (postMessage()). This can lead to performance issues if the amount of data to be copied is large. In Transferable Objects we’ll discuss an alternative.

Workers might spawn their own workers.
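
For instance, from within a worker script (sub-worker.js is a hypothetical file):

// Inside worker.js: spawn a nested worker
var subWorker = new Worker("sub-worker.js");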

For some reason, the term “web workers” reminds me of these creatures from Spirited Away :)

Let’s work through a simple example. Imagine we have two files, main.js and worker.js (in the same directory):

main.js:

// Initializes the worker with the
// JavaScript code in worker.js
var myWorker = new Worker("worker.js");

// Post a message to the worker
myWorker.postMessage("Hello worker!");

// Callback that gets called when a 
// message is received from the worker.
myWorker.onmessage = function(/*MessageEvent*/ message) {
  console.log(
    'Message received from worker script: ', 
    message.data
  );
}

worker.js

onmessage = function(/*MessageEvent*/ message) {
  console.log(
    'Message received from main script: ', 
    message.data
  );
  postMessage("Hello main!");
}

Transferable Objects

By default data is copied when sending information back and forth between the main thread and the worker, using a process called structured cloning.

The serialization/deserialization can be expensive, in which case there is an alternative: transferable objects. More specifically, we can work with ArrayBuffers, which can be “transferred” instead of cloned, a more performant operation, more precisely O(1).

We’ll first cover ArrayBuffers and then see how to apply it in a context of Web Workers.

ArrayBuffer

According to [5]:

The ArrayBuffer is a data type that is used to represent a generic, fixed-length binary data buffer. You can’t directly manipulate the contents of an ArrayBuffer; instead, you create a typed array view or a DataView which represents the buffer in a specific format, and use that to read and write the contents of the buffer.

An ArrayBuffer basically represents an unstructured array of bits, which, to have any meaning/interpretation, needs a view, for example an array of 32-bit or 16-bit unsigned ints. In the example below we create an array buffer of 100 bytes.


// Number of bytes or number of elements
var buffer = new ArrayBuffer(100);

// A 32-bit unsigned int array of length 10 (i.e. 40 bytes), starting 
// from byte 0 at the array buffer
var int32View = new Uint32Array(buffer, 0, 10);
// A 16-bit unsigned int array of length 20 (i.e. 40 bytes), starting 
// from byte 0 at the array buffer
var int16View = new Uint16Array(buffer, 0, 20);

// Fill in the 16-bit array with 0, 1, 2...
for (var i = 0; i < int16View.length; i++) {
  int16View[i] = i;
}

// The memory is shared because we're reading from the same chunk of 
// the byte array.
for (var i = 0; i < int32View.length; i++) {
  console.log("Entry " + i + ": " + int32View[i]);
}

This is a very interesting model. With ArrayBuffers one explicitly works with the serialized form of the data and creates views on top of it. I’m used to working views-first, that is, creating a class representing some data and eventually adding serialization/deserialization methods. One advantage of working with serialized data is that we don’t need to write the serialization methods, only the deserialization. The major disadvantage is that we need to know upfront how much memory we’ll use.

Transferable objects

We can extend the example above to be used between a worker and the main thread.

worker.js:

var buffer = new ArrayBuffer(100);
var int16View = new Uint16Array(buffer, 0, 20);

for (var i = 0; i < int16View.length; i++) {
  int16View[i] = i;
}

console.log('array buffer size', buffer.byteLength); // 100
postMessage(buffer, [buffer]);
console.log('array buffer size?', buffer.byteLength); // 0

and in the main.js:

...
myWorker.onmessage = function(e) {
  buffer = e.data;
  // Number of bytes or number of elements
  var int32View = new Uint32Array(buffer, 0, 10);

  for (var i = 0; i < int32View.length; i++) {
    console.log("Entry " + i + ": " + int32View[i]);
  }
}

By logging the output to the console, we can see the main thread received the values written to the array buffer by the worker and after the worker transferred the buffer data, it was emptied out.

Note that in the postMessage() API, we provide buffer as the first parameter, and it also appears in the list indicating it should be transferred, not copied. Having to pass it twice is a bit confusing in my opinion, but this allows the example below, in which the objects transferred are nested inside another structure (in this case an object) and we want to transfer both buffer1 and buffer2 but not the top-level object. I’m not sure which use case the API designers had in mind, though.

postMessage(
  {'key1': buffer1, 'key2': buffer2}, 
  [buffer1, buffer2]
);

Error Handling

If any error goes uncaught in the worker, it can be caught from the main thread through the onerror callback:

myWorker.onerror = function(e) {
  console.log('an error occurred:', e);
  e.preventDefault();
}

Here e is an instance of ErrorEvent. We can simulate an error in the worker.js code:

throw new Error("Some error occurred");

Terminating

The main thread can terminate the worker:

worker.terminate();

or the worker can terminate itself:

close();

Applications

A lot of the examples using Web Workers involve doing some fake expensive calculation in the worker thread, but I haven’t found any real-world application.

StackOverflow offers some ideas, including one that is dismissed as a bad use of Web Workers (polling) and others from projects that are long defunct (Mozilla Skywriter). The main issue is that most of the time heavy processing is done on the server.

One idea that came to mind is to use Web Workers in React. React defers a lot of DOM work to the end by working with the concept of a virtual DOM. Web Workers don’t have access to the real DOM, but they can work with a virtual DOM. It turns out this idea has been explored already [7, 8], but there were some technical difficulties in implementing events.

Conclusion

In this post we studied Web Workers and some examples utilizing them. I learned about a few other related topics, like ArrayBuffers and Compositor Workers. I was a bit disappointed with the lack of compelling applications using Web Workers. I’ll try them out in some of my projects and see if I can get any benefits.

References

Some of the code presented in this post is available on Github.

[1] MDN – Using Web Workers
[2] HTML5 Rocks – The Basics of Web Workers
[3] MDN – Functions and classes available to Web Workers
[4] Google Developers – Transferable Objects: Lightning Fast!
[5] MDN – JavaScript typed arrays
[6] StackOverflow – What are the use-cases for Web Workers?
[7] React Custom Renderer using Web Workers
[8] GitHub React Issues: Webworkers #3092
[9] Compositor Workers Proposal

Developing a project in Javascript

I’ve worked on several small JavaScript side-projects. The amount of JavaScript libraries and frameworks is overwhelming, especially in recent times.

In the past, I would write a couple of stand-alone Javascript files from scratch. As applications get bigger and more complex, new libraries for improving project development have been created.

I decided to look around for best practices to develop open source JavaScript applications these days. This post is a set of notes describing my findings.

We’ll discuss libraries that solve different needs of software projects, including package management, modularization, automated builds, linting and finally testing frameworks.

Packages/Libraries

JavaScript doesn’t have an official package manager, but there has been an effort to standardize how JavaScript libraries are distributed. With Node.js came its package manager, npm (node package manager), which was initially intended for Node.js packages but can also be used for general libraries, independent of Node.js itself.

To work with npm, we need to write a configuration file called package.json. In this JSON file we can define metadata for building a library, including title, version and dependencies on other libraries. A sample configuration looks like this:

{
  "name": "my_project",
  "version": "0.0.1",
  "description": "Description of my project",
  "devDependencies": {
    "browserify": "~5.11.0",
    "uglifyify": "^2.5.0"
  }
}

Dependencies

In the dependencies, we have to specify the versions. A version (more specifically a semantic version, or semver) consists of three numbers separated by '.'. The last number, aka the patch version, should be bumped on small changes, like bug fixes, without changes in functionality. The middle number, aka the minor version, should be bumped whenever new features are added in a backwards-compatible way. Finally, the first number, aka the major version, should be bumped whenever backwards-incompatible changes are made [1].

In package.json, you can specify a hard-coded version number or be more relaxed. If we use '~' in front of the version, for example ~5.11.0, it means we accept the most recent version of the form 5.11.x. On the other hand, if we use '^', for example ^2.5.0, we accept the most recent version of the form 2.x.x (but at least 2.5.0).
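
For instance, with hypothetical packages some-lib and other-lib:

{
  "devDependencies": {
    "some-lib": "~5.11.0",
    "other-lib": "^2.5.0"
  }
}

Here some-lib could resolve to 5.11.4 but not 5.12.0, while other-lib could resolve to 2.9.1 but not 3.0.0.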

The dependencies of a package can be either production or development dependencies. In our case, browserify and uglifyify are only used for building our package and are not dependencies our code has, so it doesn’t make sense to ship them to the users of the library.

To install the dependencies declared in package.json, we can run:

npm install

This will download the dependencies (including those listed under devDependencies) locally into the directory node_modules (created in the same directory package.json is in). To install only the production dependencies, skipping devDependencies, we can do:

npm install --production

Modules

Modules are useful for splitting code into related units and enabling reuse. JavaScript doesn’t have a native module system, so some libraries were built to address the modularization problem. There are three main types of module systems around: AMD (Asynchronous Module Definition), CommonJS and the ES6 loader. Addy Osmani discusses the differences between them in [2].

There are several implementations of modules, including RequireJS (AMD) and browserify (which uses the Node.js module system, based on CommonJS). SystemJS is able to work with all these different types.

I had been working with browserify, but it seems better to adopt the ES6 standards, so I’ve switched to SystemJS. Another advantage of SystemJS is that it also allows ES6 syntax by transpiling the code with BabelJS.

To use SystemJS we need to define a configuration file (analogous to package.json), named config.js (don’t worry about it for now).

Exporting

Named exports. We can have multiple export statements within a single file or provide all exports within a single statement [3]. Example:

/** 
 * my_module.js
 */
function foo() {
  console.log('foo');
}
function bar() {
  console.log('bar');
}
// Nested export
export {
  foo,
  // baz is what will be available externally
  bar as baz,
};
// Flat, inline export
export function derp() {
  console.log('derp');
}

Default exports. We can export default items in a module (the reason will be clear when we talk about importing next). We show the syntax for both the inline and the named exports:

// Nested 
export {
  foo as default,
};
// Inline export
export default function() {
  console.log('derp');
}

Importing

We have 3 basic ways to import from a module.

1. Name all items we want to pick from the module.

import {foo, baz as bar} from 'my_module';

2. Do not provide any specific item, in which case we’ll import the default export:

import the_default_export from 'my_module';
// Equivalent to
import {default as the_default_export} from 'my_module';

3. Import all items from the module under a ‘namespace’, basically:

import * as everything from 'my_module'
// 'everything' is an object 
// {
//    foo,
//    baz,
//    default,
//    derp
// }

NPM Packages

To be able to import NPM packages, we have to download them first and for that we can use the jspm.io tool. For example, I was interested in the point-in-polygon package. Instead of running the npm command, we can use jspm:

# Download jspm
npm install --global jspm
# Install the desired npm package
jspm install npm:point-in-polygon

Running jspm will write to the config.js file (creating one if it doesn’t exist). This records a mapping from the name you can use in code to where the module got installed. Since npm packages use the CommonJS syntax and SystemJS understands it, in code we can simply do:

import {default as pointInsidePolygon} from 'point-in-polygon';

Building

The process of running commands like SystemJS can be automated. One option is writing Makefiles to run the command line tools. Another is to use JavaScript task runners, such as Grunt and Gulp. In this post we’ll stick to Grunt.

To configure a build, we need to provide another configuration file, called Gruntfile.js (should live in the same directory as the package.json). You provide an object to grunt.initConfig(), which contains tasks configurations.

module.exports = function(grunt) {

  var taskList = ['systemjs', 'uglify'];

  grunt.initConfig({
    pkg: grunt.file.readJSON('package.json'),
    systemjs: {
        options: {
            'configFile': './config.js'
        },
        dist: {
            'src': '<root JS file>',
            'dest': '<compiled file with all dependencies together>'
        }
    },
  });

  grunt.loadNpmTasks("grunt-systemjs-builder");
  grunt.registerTask('default', ['systemjs']);
};

With grunt.registerTask('default', ['systemjs']) we’re telling grunt to run the systemjs task whenever we run grunt from the command line.

It’s possible to run grunt automatically upon changes to JS files via the watch task. First, we need to install the plugin:

npm install grunt-contrib-watch --save-dev

Then we configure it in Gruntfile.js:

grunt.initConfig({
    ...
    watch: {
      browser_js: {
        files: ['**/*.js', '!node_modules/**/*', '!dist/**/*'],
        tasks: taskList,
      }
    },
});
...
grunt.loadNpmTasks('grunt-contrib-watch');

Here taskList is an array of task names. It can be the same one provided to the default task. Make sure to blacklist some directories like dist, which is the output directory of the systemjs task (otherwise we’ll get an infinite loop). Finally we run:

grunt watch

Now, whenever we perform a change to any JS file it will run the task.

Minification

Since Javascript code is interpreted on the client (browser), the source code must be downloaded from the server. Having a large source code is not efficient from a network perspective, so often these libraries are available as a minified file (often with extension min.js to differentiate from the unminified version).

The source code can be compressed by removing extra spaces, renaming variables, etc, without changing the program. One popular tool to achieve this is UglifyJS.

To use it with Grunt, we can install the grunt-contrib-uglify module:

npm install grunt-contrib-uglify --save-dev

And in our Gruntfile.js:

grunt.initConfig({
    ...
    uglify: {
        compact: {
            files: {
                './dist/<project>.min.js': ['./dist/<project>.js']
            }
        }
    },
    ...
}
grunt.loadNpmTasks('grunt-contrib-uglify');
grunt.registerTask('default', ['systemjs', 'uglify']);

Linting

Lint tools help us avoid bugs, stick to code conventions and improve code quality. One popular tool for linting is JSHint; other alternatives include JSLint. JSHint has a Grunt plugin:

grunt.initConfig({
    ...
    jshint: {
        files: [
            '**/*.js',
            '!node_modules/**/*',
            '!jspm_packages/**/*',
            '!dist/**/*'
        ],
        options: {
            'esnext': true,
        }
    },
    ...
}
grunt.loadNpmTasks('grunt-contrib-jshint');
grunt.registerTask('lint', ['jshint']);

The basic configuration here makes sure to blacklist “production” directories like node_modules and dist. Also, since we’ve been adopting ES6, we set the esnext flag to tell JSHint to account for the new syntax.

We probably don’t want to run the linter every time we update a JS file. We can run it less often, for example before sending code for review. Thus, we create a separate task registration for it with grunt.registerTask('lint', ['jshint']). We can now run JSHint via the command line:

grunt lint

Testing

Another practice to avoid bugs is testing, including unit tests. Again, there are several libraries and frameworks that make the job of unit testing less painful, for example with easy ways to mock dependencies so we can test isolated functionality. In this case, I’ve picked Jest, which has a Grunt task available on npm, which we can install via:

npm install grunt-jest --save-dev

(NOTE: this will also install the jest-cli binary which depends on a Node.js version >= 4, so you might need to update your Node.js).

We can configure the grunt task with default configs in the following way:

grunt.initConfig({
    ...
    jest: {
    },
    ...
}
grunt.loadNpmTasks('grunt-jest');

With this setup we can run the following command to run jest tests:

grunt jest

Unfortunately, Jest uses the CommonJS require syntax. It used to be possible to use babel-jest, but after version 5.0 this setup doesn’t work anymore.

Conclusion

The JavaScript environment changes extremely fast and it’s very hard to keep on top of the latest frameworks/practices, etc.

To make things worse, for every task like module system, linting, testing, there are many alternatives and none of them is a clear best choice.

I’m happy that there’s an effort of standardization with ES6. I think the more we stick to one convention, the less we re-invent the wheel and the fewer syntax differences there are to learn.

References

[1] Semantic versioning and npm
[2] Writing Modular JavaScript With AMD, CommonJS & ES Harmony
[3] ECMAScript 6 modules: the final syntax

Further Reading

Generation Javascript. Manuel Bernhardt discusses the current state of JavaScript libraries and how the low friction nature of developing in JavaScript has its downsides.

Essential JavaScript Links. Eric Elliott’s list of links to several books, articles and tools concerning JavaScript. It provides a much more comprehensive list of options for the different topics we covered in this post.

Notes on how browsers work

In this post, I’d like to share some notes from articles about how modern browsers work, in particular the rendering process.

All the aforementioned articles are based on open source browser code, mainly from Firefox and Chrome. Chrome uses a render engine called Blink, a fork of WebKit, the engine used by other browsers like Safari. It seems to be the one with the most documentation available, so we’ll focus on this particular engine. The main sources are [1, 2 and 3]. They cover this subject in much more depth, so I recommend reading them for more details.

The Big Picture

First, let’s take a look at a high-level architecture of a browser:

(figure: high-level browser architecture)

As we said before, our focus will be on the Rendering Engine. According to [1], it consists of the following steps:

(figure: main flow of the rendering engine)

and we’ll use these steps as the sections to come.

1. Parsing HTML to construct the DOM tree

This step consists of converting a markup language in text format into a DOM tree. An interesting observation is that HTML cannot be described by a context-free grammar because of its forgiving nature: parsers must accept malformed HTML as valid.

The DOM tree is a tree containing DOM (Document Object Model) elements. Each element corresponds to a tag from the HTML markup.

As we parse the HTML text, we might encounter the tags that specify two common resources that enhance basic HTML: CSS and JavaScript. Let’s cover those briefly:

Parsing CSS

CSS, on the other hand, has a context-free grammar, so WebKit is able to rely on tools like Flex (a lexical analyzer generator) and Bison (a parser generator) to parse CSS files. The engine uses hash maps to store the rules, so it can perform quick lookups.

Parsing Javascript

When the parser encounters a script tag, it starts downloading (if it’s an external resource) and parsing the JavaScript code. According to the spec, downloading and parsing occur synchronously, blocking the parsing of the HTML markup.

The reason is that executing the script might modify the HTML body (e.g. via document.write()). If the JavaScript doesn’t modify the HTML markup, Steve Souders suggests moving the script tags to the bottom of the page or adding the defer attribute to the script tag [4], as sketched below. He has two test pages to highlight the load times of these distinct approaches: bottom vs. top.
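
A sketch of the two options (app.js is a hypothetical script):

<!-- Blocks HTML parsing while it downloads and executes -->
<script src="app.js"></script>

<!-- Downloads in parallel; executes only after parsing finishes -->
<script src="app.js" defer></script>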

In practice, according to Garsiel [1], browsers will do speculative parsing, trying to download script files in parallel to the main HTML parsing. This process does not start though until all stylesheet files are processed.

2. Render tree construction

While constructing the DOM tree, the browser also builds another tree called render tree. Each node in this tree, called render object, represents a rectangular area on the screen.

There’s not necessarily a 1:1 correspondence between DOM nodes and render nodes. For example a select tag has multiple render nodes, whereas hidden DOM elements (with the CSS property display set to none) do not have a corresponding render node.

Since each node represents a rectangle, it needs to know its offset (top, left) and dimensions (height, width). These values depend on several factors, including the CSS properties like display, position, margin and padding and also the order in which they appear in the HTML document.

The process of filling out these parameters is called layout or reflow. In the next section we’ll describe this process in more detail.

3. Layout of the render tree

Rectangle size. For each node, the size of the rectangle is constructed as follows:

* The element’s width is whatever value is specified in the CSS or 100% of the parent’s width
* To compute the height, it first has to analyse the height of its children, and it will have the height necessary to enclose them, or whatever value is specified in the CSS.

A couple of notes here: the width is calculated top-down, whereas the height is calculated bottom-up. When computing the height, the parent only looks at its immediate children, not at all descendants. For example, if we have

<div style='background-color: green; width: 400px'>
  A
  <div style='background-color: red; width: 500px; height: 100px'>
    B
    <div style='background-color: blue; height: 150px'>
      C
    </div>
  </div>
</div>

The green box (A) will have the height enough to contain the red box (B), even though the blue box (C) takes more space than that. That’s because B has a fixed height and C is overflowing it. If we add the property overflow: hidden to B, we’ll see that box A is able to accommodate B and C.

Some properties may modify this default behavior, for example, if box C is set to position absolute or fixed, it’s not considered in the computation of B’s height.

Rectangle offset. To calculate the offset, the engine processes the nodes of the tree in order. Based on the elements already processed, it can determine an element’s position, depending on the type of positioning and display the element has. If it’s display: block, like a div with default properties, it’s moved to the next line and its left offset is based on the parent. If display is set to inline, the engine tries to render it on the same line, after the last element, as long as it fits within the parent container.

Some other properties besides display can also change how the position is calculated, the main ones being position and float. If position is set to absolute and top is defined, the offset will be relative to the first ancestor of that component with position set to relative. The same works for the left property, as in the sketch below.
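
A sketch in the style of the earlier example:

<div style='position: relative'>
  <div style='position: absolute; top: 10px; left: 20px'>
    Offset 10px/20px from the ancestor with position: relative
  </div>
</div>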

Whenever a CSS change happens or the DOM structure is modified, a new layout might be required. The engines try to avoid re-layout by only processing the affected subtree, if possible.

4. Painting the render tree

This last step is also the most complex and computationally expensive. It requires several optimizations and relies on the GPU when possible. In this step, we have two new conceptual trees, the render layer tree and the graphics layer tree. The relationship of the nodes in each tree is basically:

DOM Element > Render Object > Render Layer > Graphics Layer.

Render layers exist so that the elements of the page are composited in the correct order, to properly display overlapping content, semi-transparent elements, etc. A render layer contains one or more render objects.

Graphics layers use the GPU for painting their content. One can visualize layers by turning on “Show composited layer borders” in Chrome DevTools (it’s under the Rendering tab, which is only made visible by clicking on the drawer icon >_). By default, everything is rendered in a single layer, but things like 3D CSS transforms trigger the creation of new layers. Wiltzius [2] provides a sample page where one can visualize an extra layer:

http://www.html5rocks.com/en/tutorials/speed/layers/onelayer.html

A render layer either has its own graphics layer or inherits one from its parent. A render layer with its own graphics layer is called a compositing layer.

The rendering process occurs in two phases: painting, which consists of filling in the contents of a graphics layer, and compositing (or drawing), which consists of combining graphics layers into a single image to display on the screen.

Conclusion

I was initially planning to study general JavaScript performance profiling. In researching articles on the internet, I’ve found a number of them are related to making websites more responsive by understanding and optimizing the browser rendering process. I’ve realized there was a lot to be learned about this process and I could benefit from studying this subject.

A lot of the articles come from different sources, but a few authors seem to always be involved in them. Paul Irish and Paul Lewis are two of the folks I’ve seen in several articles (see Additional Resources). They have a strong presence online, and it might be worth following them if you’re interested in the subject.

References

[1] HTML5 – How Browsers Work: Behind the scenes of modern web browsers
[2] HTML5 Rocks – Accelerated Rendering in Chrome
[3] Chromium Design Documents – GPU Accelerated Compositing in Chrome
[4] High Performance Web Sites: Essential Knowledge for Front-End Engineers

Additional Resources

Chrome Developer Tools (for rendering):

* Speed Up JavaScript Execution
* Analyze Runtime Performance
* Diagnose Forced Synchronous Layouts
* Profiling Long Paint Times with DevTools’ Continuous Painting Mode

Google Web Fundamentals:

* Stick to compositor-only properties and manage layer count
* Avoid large, complex layouts and layout thrashing

Browser rendering performance:

* CSS Triggers – To determine whether a CSS property triggers layout
* Accelerated Rendering in Chrome
* The Runtime Performance Checklist
* How (not) to trigger a layout in WebKit

General JavaScript performance (mainly for V8):

* Performance Tips for JavaScript in V8
* IRHydra2 – Displays intermediate representations used by V8

JavaScript Promises

In this post we’ll introduce promises in JavaScript. Promises are a new feature included as part of the ES6 spec draft [1].

First, we’ll give an idea of what promises are good for and then go over simple examples to understand how promises work and also cover some extra methods from the API.

There was an initial standard proposal called Promises/A and a subsequent improvement of that proposal called Promises/A+ [2].

Google Chrome implements the Promises/A+ standard [3]. The code snippets in this post were tested on regular Chrome, version 43.

Introduction

Promises are an abstraction for working with asynchronous code in a more expressive manner [4]. In synchronous code, we think in terms of return’s for normal execution and throw’s for exceptions. In the asynchronous world, the code flow is structured around callbacks, for example onSuccess or onFailure callbacks.

As a toy example, consider the case where we have to call 3 functions, each depending on the previous one, but each of them can fail for whatever reason and we need to handle exceptions. In the synchronous case, it’s straightforward:

try {
    var a = fetchSomeStuff();
    var b = fetchSomeStuffDependingOnA(a);
    var c = fetchSomeStuffDependingOnB(b);
} catch (ex) {
    handleException(ex);
}

If these functions are asynchronous, we’d have to handle these dependencies via the callbacks, creating a nested set of calls (aka callback hell):

fetchSomeStuff(
    /* onSuccess*/ function (a) {
        fetchSomeStuffDependingOnA(
            a,
            /* onSuccess */ function (b) {
                fetchSomeStuffDependingOnB(
                    b,
                    /* onSuccess */ function (c) {
                        /* Code goes on here */
                    },
                    /* onFailure */ function (ex) {
                        handleException(ex);
                    }
                )
            },
            /* onFailure */ function (ex) {
                handleException(ex);
            }
        );
    },
    /* onFailure */ function (ex) {
	handleException(ex);
    }
);

We could definitely work around the depth of calls by using auxiliary functions, but Promises make use cases like this easier:

fetchSomeStuff().then(
    function (a) {
        return fetchSomeStuffDependingOnA(a);
    }
).then(
    function (b) {
        return fetchSomeStuffDependingOnB(b);
    }
).then(
    function (c) {
        /* Code goes on here */
    }
).catch(
    function (ex) {
        handleException(ex);
    }
);

In this case, we’d have to change the functions to return promise objects.

We’ll next cover small examples exploring the behavior of promises to understand how they work.

Examples

Creating a promise

The promise constructor expects a callback (also called the executor). This callback in turn takes two other functions as arguments: resolve() – to be called when a normal execution ends – and reject() – called when an exception occurs. A sample executor could be:

function executor(resolve, reject) {
  // do some work; here we assume it succeeds half of the time,
  // to match the description below
  var success = Math.random() < 0.5;
  if (success) {
    resolve(10 /* some value */);
  } else {
    reject(new Error("some error"));
  }
}

which succeeds half of the time and fails the other half. We then use this function to create a new promise object:

var promise = new Promise(executor);

Calling a promise

After instantiating a promise, we can call its then() method, passing two functions, which we’ll refer to as onFulfill and onReject. The onFulfill() function will be called when the promise invokes resolve(), and onReject() when reject() is called.

promise.then(
  /* onFulfill */
  function(value) {
    console.log('resolved with value: ');
    console.log(value);
  },
  /* onReject */
  function(error) {
    console.log('rejected with error: ');
    console.log(error);
  }
)

Now that we saw the basic syntax for creating and handling promises, let’s delve into more details.

Resolving a promise

When we pass a function (executor) to instantiate a promise, it gets immediately executed. For example, in the code below, it will print “running executor” first and then “constructed”.

var promise = new Promise(function(resolve, reject) {
    console.log("running executor");
});
console.log("constructed");

In the example above we’re using neither resolve nor reject. If we include a call to resolve() and then invoke the then() method:

var promise = new Promise(function(resolve, reject) {
    console.log("resolving");
    resolve(10);
});
console.log("constructed");

promise.then(function (value) {
    console.log("resolved with " + value);
});

We’ll see that the messages are printed in this order:

> resolving
> constructed
> resolved with 10

Note that the resolve() function was called when constructing the promise, but the action was deferred until we passed a callback to the then() method. If we think in terms of events, we set up the listener after the event was fired, but the listener’s action was triggered nevertheless.

When a promise is first created, it’s in a state called pending. When the resolve() function is called, it changes its state to fulfilled, when reject() is called, it changes its state to rejected. When a promise is either fulfilled or rejected, we say it’s settled.

In the example above, by the time we called then(), the promise was already fulfilled, so the action fired right away.

We can simulate calling then() while the promise is pending, that is, before resolve() or reject() is called, by using the setTimeout() function:

var promise = new Promise(function(resolve, reject) {
    setTimeout(
        function() {
	    console.log("resolving");
            resolve(10);
        },
        1000
    );
});
console.log("constructed");

promise.then(function (value) {
    console.log("resolved with " + value);
});

The messages are now printed in this order:

> constructed
// Waits 1 second
> resolving
> resolved with 10

Here the promise was in the pending state and then after a second it got fulfilled, so the callback passed to then() was fired.

With events and listeners, we have to guarantee an event is not fired before the listener is set up; otherwise we have to store the event and process it later. This is one problem promises solve for us, as pointed out by [5].

Reactions can be queued

Note that the then() function can be called multiple times on the same promise; each time, the callbacks passed to it are enqueued, and when the promise is settled they fire in order. For example:

var promise = new Promise(function(resolve, reject) {
    setTimeout(
        function() {
	    console.log("resolving");
            resolve(10);
        },
        1000
    );
});
console.log("constructed");

promise.then(function (value) {
    console.log("resolved with " + value);
});

promise.then(function (value) {
    console.log("resolved again with " + value);
});

We’ll see both callbacks passed to then are invoked:

> constructed
// Waits 1 second
> resolving
> resolved with 10
> resolved again with 10

Promises can’t be unsettled

The resolve() and reject() calls only take effect if the promise is in a pending state. Remember that the first time we call resolve() or reject() the promise changes its state from pending, so all subsequent calls to resolve() or reject() will be ignored:

var promise = new Promise(function(resolve, reject) {
  resolve(10);
  // Promise is fulfilled, so all the subsequent
  // calls will be ignored
  reject(new Error("error"));
  resolve(20);
});

promise.then(function (value) {
    console.log("resolved with " + value);
});

Chaining promises

The then() method of a promise returns another promise. The value of that promise is the value returned by the handling functions passed to then(). If the handling function executes normally, the returned promise is fulfilled with the returned value. If the handling function throws, the returned promise is rejected with the error thrown.

An example of the normal execution:

var promise = new Promise(function(resolve, reject) {
    resolve(10);
});
var nextPromise = promise.then(function (x) {
    return x + 1;
});
nextPromise.then(function (x) {
    console.log(x);
});

It’s as if nextPromise was created with:

var nextPromise = new Promise(function (resolve, reject) {
   var x = 10; // from the first promise
   x += 1;     // from the 'then()' call
   resolve(x);
});

Here’s another example, where the promise handler throws an exception:

var promise = new Promise(function(resolve, reject) {
    resolve(10);
});
var nextPromise = promise.then(function (x) {
    throw new Error("error");
    return x + 1; // never reached
});
nextPromise.then(
    function (x) {},
    function (e) {
        console.log(e);
    }
);

Resolving a promise with another promise

So far we’ve been using numbers as arguments to the resolve() function of a promise, but any JavaScript value can be used. There is one special case though: when the value provided is another promise (there’s another special case when the value is a non-promise object with a then() method – also called a thenable).

Suppose we have a promise A in which we call resolve() with another promise B. When we call A.then(), the value passed to the onFulfill() callback will be the value promise B resolves with.

Consider the following example:

function getPromise() {
    return new Promise(function(resolve, reject) {
         resolve(10);
    });
}

var outerPromise = new Promise(function (resolve, reject) {
    var innerPromise = getPromise();
    resolve(innerPromise);
});

outerPromise.then(function(x) {
    console.log(x);  // prints 10
});

Here, outerPromise calls resolve with another promise that resolves with a number. The onFulfill() callback passed to then() will receive 10.

Chaining promises with promises

We can combine the two last examples to demonstrate how promises can increase the expressiveness of JavaScript callbacks, as we saw in the beginning of the post:

function createPromise(x) {
    return new Promise(function(resolve, reject) {
        resolve(x + 1);
    });
}

var firstPromise = new Promise(function (resolve, reject) {
    resolve(10);
});

firstPromise.then(function(x) {
    var secondPromise = createPromise(x);
    return secondPromise;
}).then(function(x) {
    console.log(x); // Prints 11
});

Remember that when a callback provided to then() returns normally, the value is used to create another promise which automatically calls resolve() with that value. If that value is a promise, the value provided to onFulfill() of the next then() will be the value it resolves with, which in this case is 11.

Other methods from the API

So far we’ve considered only promises that need to be executed sequentially. For those that can be executed in parallel, the Promise class contains the all() method, which takes an array of promises and returns a new promise that waits until all input promises are settled. If they were all fulfilled, it resolves with an array of the resolved values. If any of them is rejected, it calls reject() with the error of the first promise to be rejected. See the sketch below.
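
A minimal sketch (the values and timeout are illustrative):

var p1 = Promise.resolve(1);
var p2 = new Promise(function (resolve, reject) {
    setTimeout(function () {
        resolve(2);
    }, 100);
});

Promise.all([p1, p2]).then(
    function (values) {
        console.log(values); // [1, 2], in the order of the input array
    },
    function (error) {
        console.log("first rejection: " + error);
    }
);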

In our examples, we focused almost exclusively on the “normal” execution flow, in which resolve() and onFulfill() are called. The exception case is very similar, with the reject() and onReject() functions being called. One difference is that reject() might be triggered implicitly, for example if an exception is thrown within a promise or one of the reaction callbacks.

If we want, we can provide only the onFulfill() callback to the then() method, but if we want to provide only onReject(), we’d need to pass an empty function as onFulfill(). To cover this case, promises also have the catch() method, which does essentially this.
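
A sketch, assuming promise is a promise instance like the ones above:

promise.catch(function (error) {
    console.log("rejected with " + error);
});
// Roughly equivalent to:
promise.then(undefined, function (error) {
    console.log("rejected with " + error);
});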

Conclusion

We saw that promises can be used to make code dealing with callbacks more legible and easier to reason about. There are some complexities encapsulated in promise objects and their methods, so it can be a bit daunting to understand how they work behind the scenes. We covered some small examples and proceeded in steps to make them easier to digest.

When writing this post, I initially tried reading the ES6 spec, but it was a bit too abstract to follow. I also found the Promises/A+ spec, which contains pseudo-code more similar to JavaScript and only describes the then() method’s behavior.

References

[1] Draft ECMA-262, 6th Edition – Rev 37.
[2] Promises/A+ Spec
[3] API Client Library for JavaScript: Using Promises
[4] Hidden Variables: You’re Missing the Point of Promises
[5] HTML5 Rocks – JavaScript Promises

Notes on Javascript Memory Profiling in Google Chrome

In this post we’ll explore the topic of Javascript memory management, in particular in Google Chrome. This post is based on a talk by Loreena Lee and John McCutchan at Google I/O 2013.

We’ll start by covering some aspects of Google Chrome architecture and then describe some of the main tools it offers for analyzing and debugging memory.

For the experiments, we used Chrome Canary, version 45.0, on a Mac.

Graph representation

The browser maintains a graph representation of its objects. Each node represents an object and directed arcs represent references. There are three types of nodes:

* Root node (e.g. the window object in the browser)
* Object node
* Scalar nodes (primitive types)

We define shallow size as the amount of memory used by the object itself (not counting the size of references).

The retaining path of a node is a directed path from the root to that node. This is used to determine whether a node is still being used, because otherwise it can be garbage collected. The retained size of a node is the amount of memory that would be freed if it was removed (includes the memory from nodes that would be deleted with this node).

A node u is a dominator of a node v if it belongs to every path from the root to v.

V8 memory management

V8 is the JavaScript engine that powers Google Chrome. In this section, we’ll describe some of its characteristics.

Data types representation

JavaScript has a few primitive data types. The spec describes them but does not dictate how they should be represented in memory.

This may differ from JavaScript engine to engine. Let’s briefly describe how V8 handles it, according to Jay Conrod [2]:

Numbers can be represented as Small Integers (SMIs) if they are integers, using 32 bits. If the number is used as a double or needs to be boxed (for example, to call methods like toString()), it is stored as a heap object.

Strings can be stored in the VM heap or externally, in the renderer’s memory, and a wrapper object is created. This is useful for representing script sources and other content that is received from the Web.

Objects are key-value data structures, where the keys are strings. In practice they can be represented as hash tables (dictionary mode) or more efficient structures when objects share properties with other objects or when they are used as arrays. A more in depth study of these representations can be seen in [2].

DOM elements and images are not basic types of V8; they are abstractions provided by Google Chrome, external to V8.

Garbage collection

Objects get allocated in the memory pool until it gets full. At this point, the garbage collector is triggered. It’s a blocking process, so this process is also called the GC pause, which can last for a few milliseconds.

Any object that is not reachable from the root can be garbage collected. V8 classifies objects into two categories with regard to garbage collection: the young and old generations.

All allocated objects start in the young generation. As garbage collections occur, objects that were not removed are moved to the old generation. Young generation objects are garbage collected much more often than old ones.

It’s an optimization that makes sense: statistically, if an object has “survived” many garbage collections, it’s less likely to be collected, so we can make these checks less often.


Now that we have a basic understanding of how Chrome deals with memory, we can learn about some of the UI tools it has for dealing with memory.

The Task Manager

This tool displays a list of tabs and extensions currently running in your browser. It contains useful information like the amount of Javascript memory each one is using.

https://developer.chrome.com/devtools/docs/javascript-memory-profiling#chrome-task-manager

The timeline tool

The timeline tool allows us to visualize the amount of memory used over time. The dev tools docs contain an example where we can try out the tool:

https://developer.chrome.com/devtools/docs/demos/memory/example1

We’ll obtain a result similar to this:

Memory Timeline

In the “Memory” tab, we can see counts like number of documents, event listeners, DOM nodes and also the amount of memory in the JavaScript heap.

The reason we need a separate count is that DOM nodes use native memory and do not directly affect the JavaScript memory graph [3].

The profile tool – Heap Snapshot

The heap snapshot dumps the current memory into the dev tools for analysis. It offers different views: Summary, Comparison, Containment and Statistics.

The statistics view is just a pie chart showing a breakdown of objects by type. Not much to see there, so let’s focus on the other three.

1. Summary view

This is the default view. It’s a table representing a tree structure, like the one below.

Summary View

We can observe the table above contains the following columns:

* Constructor – the name of the function used to instantiate an object.
* Distance – represents the distance of the corresponding nodes to the root (try sorting by distance. You’ll see the Window objects with distance 1).
* Objects Count – the number of objects created with that particular constructor
* Shallow Size – the amount of memory in bytes an object is using
* Retained Size – accounts for all memory an object refers to

If we follow the example in the docs:

https://developer.chrome.com/devtools/docs/heap-profiling-summary

And take a snapshot, we can locate the Item constructor. We see ~20k objects created and their shallow size (each has 32 bytes).

Snapshot

Detached nodes. Nodes that are not part of the main DOM tree (rooted in the <html /> element) are considered detached. In the example below, we could have a DOM element with ID ‘someID’ and remove it from the main DOM tree.

var refA = document.getElementById('someID');
document.body.removeChild(refA);

At this point it’s considered detached, but it can’t be garbage collected because the variable refA still refers to it (and could potentially reinsert the node back later). If we set refA to null and no other variable refers to it, it can be garbage collected.
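
Continuing the snippet above:

refA = null; // no references left; the detached subtree can now be collected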

Moreover, the someID node might have child nodes that would also not be garbage collected, even though no JavaScript variable refers to them directly (they are referenced indirectly through refA).

In the profile table, detached nodes that are directly referenced by some JavaScript variable have a yellow background, and those indirectly referenced have a red background.

In the table below, we can see some yellow and red colored nodes:

Yellow and Red nodes

Object IDs. When we do a profiling, the dev tools attach an ID to each JavaScript object, displayed in the table to the right of the constructor name in the form @12345, where 12345 is the ID generated for the object. These IDs are only generated when a profile is made, and the same object will have the same ID across multiple profilings.

2. Comparison view.

This view allows comparing two snapshots. The docs also provide an example to try out this tool:

https://developer.chrome.com/devtools/docs/heap-profiling-comparison

If we follow the instructions, we’ll get something like:

Comparison

We can see the Delta column, showing the difference in number of objects compared to the previous snapshot, which could help with a memory leak investigation. We also have the amount of memory added (Alloc. Size), freed (Freed Size) and the net difference (Size Delta).

3. Containment view.

The containment view organizes the objects by hierarchy. At the top level we can see several Window objects, one of them corresponding to the page we’re looking at. For example, if we take a snapshot of the following page:

https://developer.chrome.com/devtools/docs/heap-profiling-containment

And switch the view to Containment, we’ll see something like:

Containment View

The profile tool – Heap allocation

The heap allocation tool (also referred as Object Allocation Tracker in the docs) takes snapshots at regular intervals until we stop recording.

It plots a timeline of bars corresponding to memory allocated during that snapshot. The blue portion displays the amount of memory created during that snapshot that is still alive. The gray portion depicts the memory that has been released since then.

We can do a simple experiment to see this behavior in practice. We create two buttons: one that allocates some memory and appends it to a given DOM node, and another that removes one child of the node. The code for this setup can be found on github/kunigami.

If we keep playing with the buttons while recording, we can see blue bars being created when we click “Append” and graying out when we click “Remove”.

Heap Allocation

Conclusion

Google Chrome offers many tools for memory diagnosis, and it can be overwhelming to decide which one to use. The timeline view is a good high-level tool for detecting memory patterns (especially memory leaks).

Once a suspicious pattern is spotted, the profiling tools can give more details on what is being allocated. These tools display a lot of information, which can be hard to read.

The comparison view is useful in these cases because it shows only the differences.

The documentation is a bit sparse, but contains good examples to try out the tools. These examples have some instructions on how to use the tools, but usually lack screenshots of the results.


React.js introduction


React.js is an open source JavaScript framework created by Facebook. It abstracts a common pattern in dynamic UI applications and provides syntactic sugar such as JSX, which allows embedding XML in JavaScript.

In the MVC (Model-View-Controller) pattern, React can be thought of as the View. It uses the concept of a virtual DOM to avoid unnecessary (and expensive) DOM manipulation, and a one-way reactive data flow that simplifies the logic of the application [1].

In this post we’ll cover the basic concepts and features of React through a set of small examples.

Setup

React can be easily included as an external library and is also available as an npm package, but for the sake of simplicity and ease of experimenting with the examples, we’ll be using JSFiddle in this tutorial.

JSFiddle requires some setup to work, especially due to the JSX syntax. The easiest way to try the examples yourself is to fork the fiddles we provide here.

The basic example: Hello World

Let’s start with the simplest example: a standalone component that renders the following markup:

<div>Hello World</div>

The Javascript code is the following (jsfiddle):

var HelloWorld = React.createClass({
    render: function() {
        return <div>Hello World</div>;
    }
});
 
React.render(<HelloWorld />, document.body);

There are a couple of observations we can make here. First off, the JSX syntax: note how we pass an XML tag to the React.render() function. This is syntactic sugar provided by JSX, and it gets transpiled to a function call, in particular:

React.createElement(HelloWorld);

where HelloWorld is the React class we created above. The same is true for the <div> tag in the render() method. In this case though, it’s a base HTML element, so it has a built-in function in React:

React.DOM.div()

Finally, all React classes must implement the render() method, which returns other React components; these eventually get converted to DOM elements and set as children of document.body. If we inspect the HTML source of the generated page, we can see the generated tags:

Generated tags

Terminology. We refer to the object passed to React.createClass() as the React class and to an instance of this class as the React component. When we talk about objects, we mean native JavaScript objects.

Parametrizing the component using props

A static component is not very flexible, so to be able to customize the component, we can pass parameters. Let’s suppose that instead of rendering Hello World, we want our component to render Hello plus some custom message. We can use props (short for properties) for this (jsfiddle):

var Hello = React.createClass({
    render: function() {
        return <div>Hello {this.props.text}</div>;
    }
});
React.render(<Hello text={"Universe"} />, document.body);

In this version, we are passing a property, text, to the Hello instance, and it’s read within the class using this.props.text. More generally, this.props is an object that contains all properties passed to it.

Even though it’s optional, I think it’s good practice to list the properties a component accepts by setting the propTypes property in the React class definition. It expects an object where the keys are the possible properties and the values are their types. The available types are in React.PropTypes, which includes the basic JavaScript types but also allows custom types (other React classes, for example) and multiple types (string or number, for example) [2].

In our case, we expect text to be a string, so we can simply do (jsfiddle):

var Hello = React.createClass({
    propTypes: {
        text: React.PropTypes.string
    },
    render: function() {
        return <div>Hello {this.props.text}</div>;
    }
});
 
React.render(<Hello text={"Universe"} />, document.body);

Besides documenting the component, it will add type-checking for free. In this case, passing something other than a string will raise a warning.

Making the component stateful

We can think of props as arguments to the constructor of a class. Classes are usually stateful, that is, they make use of internal variables to encapsulate logic and isolate implementation details from the external world.

The same idea can be applied to React components. The internal data can be stored in an object called state.

Let’s create a simple component that displays the number of seconds elapsed since the page was loaded (jsfiddle).

var Counter = React.createClass({

    getInitialState: function() {
        return {count: 0};
    },
    
    componentDidMount: function() {
        setInterval(this.updateCount, 1000);
    },
    
    updateCount: function() {
        var nextCount = this.state.count + 1;
        this.setState({count: nextCount});
    },
    
    render: function() {
        return <div>Seconds: {this.state.count}</div>;
    }
});
 
React.render(<Counter />, document.body);

There are a couple of new concepts to be understood in this example. First, we implemented getInitialState(), which initializes the this.state object when the component is instantiated. In this case we initialize a single state variable, count. The render() method reads from it.

We also implement the componentDidMount() method. We’ll explain what it does in more detail later; for now, it’s enough to know that it’s called only once, after the first render(). Here we use the setInterval() function to execute the updateCount() method every second. In general, when passing a callback that is a “method” (a function belonging to an object), we want to provide the context, normally this, so we would need to call

setInterval(this.updateCount.bind(this), 1000);

Instead, since React auto-binds this for us, we can pass the method without binding it explicitly [3].

The updateCount() function reads from this.state and increments the count. By calling this.setState(), not only is this.state.count set to the new value, but render() is also called again. As we mentioned before, this second call to render() doesn’t cause componentDidMount() to be called.

This pattern of calling render() constantly is pretty common in dynamic GUIs. React models it so that render() is a function of this.props and this.state, and we don’t have to worry about keeping track of when to re-render the screen after changing data.

React Children

React components can return other custom components from their render() method, and components can be nested. One simple example is the following (jsfiddle):

var Item = React.createClass({
    render: function() {
        return <li>{this.props.text}</li>;
    }
});

var List = React.createClass({
    render: function() {
        return <ol>{this.props.children}</ol>;
    }
});

var Container = React.createClass({    
    render: function() {
        return (
            <List>
                <Item text="apple" />
                <Item text="orange" />
                <Item text="banana" />
            </List>
        );         
    }    
});

React.render(<Container />, document.body);

Here, Container returns two other React components, Item and List, and Item is nested within List. The list of components nested within another component is available as this.props.children, as we can see in the List.render() method.

Updating. We saw that whenever we change state (by calling setState()), a re-render is triggered. When a component is instantiated in the render() method of another component, re-rendering the parent also updates it (and, recursively, its nested components). To illustrate this, let’s modify the example above as follows (jsfiddle):

...
var List = React.createClass({
    render: function() {
        if (this.props.ordered) {
            return <ol>{this.props.children}</ol>;
        } else {
            return <ul>{this.props.children}</ul>;
        }
    }
});

var Container = React.createClass({   
    getInitialState: function() {
        return {ordered: true};
    },
    
    render: function() {
        return (
            <div>
                <List ordered={this.state.ordered}>
                    <Item text="apple" />
                    <Item text="orange" />
                    <Item text="banana" />
                </List>
                <button onClick={this.toggleOrdered}>
                   Click Me
                </button>
            </div>
        );         
    },
    
    toggleOrdered: function() {
        this.setState({ordered: !this.state.ordered});
    }
});
...

There are a couple of new things here. First, we added the native button component and passed a callback to the onClick property, the equivalent of the onclick attribute on DOM elements. Whenever the button is clicked, we toggle the state representing the type of list to render (ordered/unordered).

Whenever the Container.render() method is invoked, the render() methods of List and Item are called as well. For List this makes sense, because its render function depends on props.ordered, which is being changed. None of the Items are changing though, but they get re-rendered nevertheless. That’s the default behavior in React. In the “Cached rendering” section, we’ll see how to customize this.

Again, note that a component is only updated if it’s instantiated within a render() method that runs. For example, both List and Item are created when Container.render() is invoked. On the other hand, even though Item is used within List (through this.props.children), it’s not instantiated there, so it doesn’t get updated when only List.render() is called (jsfiddle).
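
A sketch to verify this behavior, adapting the example above (Container stays the same): we give List its own state and log from Item.render(). Per the behavior described, clicking the button re-renders List, but the Item log line only fires when Container renders.

var Item = React.createClass({
    render: function() {
        console.log('Item.render() called');
        return <li>{this.props.text}</li>;
    }
});

var List = React.createClass({
    getInitialState: function() {
        return {clicks: 0};
    },

    handleClick: function() {
        this.setState({clicks: this.state.clicks + 1});
    },

    render: function() {
        // The Item children were instantiated in Container.render(),
        // so re-rendering List alone does not update them
        return (
            <div>
                <button onClick={this.handleClick}>
                    Clicked {this.state.clicks} times
                </button>
                <ol>{this.props.children}</ol>
            </div>
        );
    }
});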

Virtual DOM

DOM manipulation is usually the most expensive operation in highly dynamic pages. React addresses this bottleneck with the concept of a virtual DOM. After the first call to render(), React converts the virtual DOM into an actual DOM structure, after which the component is considered mounted (and componentDidMount() is called).

Subsequent calls to render() change the virtual DOM, and React uses heuristics to find out what changed between two virtual DOMs, updating only the difference. In general only a few parts of the DOM structure change, so in this sense React optimizes the rendering process and lets us simplify the code logic by re-rendering everything all the time.

This process of updating only part of the DOM structure is called reconciliation. You can read more about it in the docs [4].

We need to be careful with conditional rendering, since it can defeat the purpose of the virtual DOM. We used conditional rendering in the React Children section, in the List.render() function. Let’s create a similar example to make the problem clearer (jsfiddle):

var Text1 = React.createClass({
    render: function() {
        return <p>Hello</p>;
    }
});

var Text2 = React.createClass({
    render: function() {
        return <p>World</p>;
    }
});

var Container = React.createClass({
    
    getInitialState: function() {
        return {count: 0};
    },
    
    componentDidMount: function() {
        setInterval(this.updateCount, 1000);
    },

    updateCount: function() {
        var nextCount = this.state.count + 1;
        this.setState({count: nextCount});
    },
    
    render: function() {
        var content = null;
        if (this.state.count % 2 == 0) {
            content = <Text1 />;
        } else {
            content = <Text2 />;
        }
        return content;
    }    
});
 
React.render(<Container />, document.body);

Here we introduced two new dummy React classes, Text1 and Text2, which render “Hello” and “World” respectively. The render() function of Container returns one or the other, alternating every second.

If we run this example, we can verify that Text1 and Text2 get mounted every time Container.render() is called. The reason is that whenever React runs the diff heuristic, the previous tree looks completely different from the new one.

One solution is to always render both components but hide one of them using CSS. A proposed solution for the example above is here; a sketch of the idea follows.
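
One possible rewrite of Container.render() from the example above: both components stay mounted and we only toggle their visibility with the style prop, so consecutive virtual DOMs differ very little.

render: function() {
    var showFirst = this.state.count % 2 == 0;
    return (
        <div>
            {/* Both components stay mounted; only their visibility changes */}
            <div style={{display: showFirst ? 'block' : 'none'}}>
                <Text1 />
            </div>
            <div style={{display: showFirst ? 'none' : 'block'}}>
                <Text2 />
            </div>
        </div>
    );
}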

Cached rendering

As we saw in the React Children section, we may trigger render() of a component even when nothing has changed at all. In 99% of the cases, it’s probably fine, because React will be smart enough to prevent DOM manipulations, which is the expensive part anyway.

In case the render() call is expensive, we can control when it gets called by implementing the shouldComponentUpdate() method. By default, it always returns true, but it receives the next set of props and state, which we can inspect to determine whether anything has changed.
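
For instance, a hypothetical Label component that depends on a single prop could skip updates like this:

var Label = React.createClass({
    shouldComponentUpdate: function(nextProps, nextState) {
        // Only re-render if the one prop we depend on changed
        return this.props.text !== nextProps.text;
    },

    render: function() {
        return <div>{this.props.text}</div>;
    }
});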

Comparing complex objects can be tricky if we don’t know their structure very well. Caching in general also introduces complexity: whenever we change our code, we need to make sure to update the shouldComponentUpdate() logic accordingly, otherwise the render() function might not be called when it should.

Another issue is that props and state can be mutated without the parent component changing them or without calls to setState(). This can cause unexpected problems. Consider the following example (jsfiddle):

var Container = React.createClass({   
    getInitialState: function() {
        return {data: {key: "value"}};
    },
    
    shouldComponentUpdate: function(nextProps, nextState) {
        return this.state.data.key != nextState.data.key;
    },
    
    render: function() {
        return (
            <div>
                {this.state.data.key}
                <button onClick={this.updateState}>Update</button> 
            </div>
        );         
    },
    
    updateState: function() {
        // Modifying state is anti-pattern. We should clone it!
        this.state.data.key = "new value"; 
        this.setState({data: this.state.data});
    }
});

Here we have only one state variable, data, an object with a single key. This is a simple enough object to cache, right? But in the example, if you click “Update”, even though we did mutate the state and called setState(), nothing happens. The reason is that inside shouldComponentUpdate(), nextState.data and this.state.data are references to the same object, so the two keys always compare equal.
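
One possible fix for this particular example is to build a new object instead of mutating the existing one, so that nextState.data.key actually differs from this.state.data.key when shouldComponentUpdate() runs:

updateState: function() {
    // Create a fresh object instead of mutating this.state.data
    this.setState({data: {key: "new value"}});
}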

One idea is to use immutable data structures for state and props. For example, Om is a ClojureScript interface to React that works with immutable data [5].

Mixing with non-React code

React can be incrementally adopted in any existing JavaScript codebase. Inserting a React component under an existing DOM element is straightforward, since that’s what React.render() does.

Inserting an existing DOM subtree under a React component requires more work. As we saw, React works with the concept of a virtual DOM, but we do have access to the actual DOM after the component has mounted, in particular inside the componentDidMount() method. In the example below, we create a toy text DOM node and insert it into the generated div DOM element (jsfiddle).

var Hello = React.createClass({
    componentDidMount: function() {
        var root = this.getDOMNode();
        var text = document.createTextNode(" world");
        root.appendChild(text);
    },
    
    render: function() {
        return <div>hello</div>;
    }
});
 
React.render(<Hello />, document.body);

Here we are artificially creating a fake text node, but we could potentially insert an entire subtree under the div element. React also enables referring to components by ID: we just need to add the ref property with a unique identifier. Later, when the component is mounted, the corresponding component will be available in the this.refs object (and its DOM node through getDOMNode()). More about refs in [6].
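
A sketch reusing the Hello example (the ref name container is arbitrary): once the component is mounted, the span’s DOM node can be retrieved through this.refs.

var Hello = React.createClass({
    componentDidMount: function() {
        // The ref'd component is available once the component is mounted
        var span = this.refs.container.getDOMNode();
        span.appendChild(document.createTextNode(" world"));
    },

    render: function() {
        return <div><span ref="container">hello</span></div>;
    }
});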

One way communication

Some frameworks are created to implement patterns. There’s a natural tradeoff between how much boilerplate a framework saves us and how much it limits us. React is relatively low-level, so it still offers a lot of flexibility. One constraint it does impose is one-way communication between components: during render, we can only pass information from a parent to its children. It’s possible to pass information from children to parents via callbacks, but that usually triggers re-renders.

I struggled with this paradigm in the beginning, but in the long run this constraint makes the code much simpler, especially when we have many components interacting with each other and tracking down what affects what becomes a pain. With one-way communication, we tend to concentrate logic in fewer places. In terms of graph theory, if each component is a node, React forces the communication paths to form a tree, whereas in an unconstrained environment we could have arbitrary graph structures.

To make this clearer, let’s try a very simple example: a component with a selector and a list where, depending on the value selected, the list displays a different set of items (jsfiddle).

var List = React.createClass({
    render: function() {
        var items = {
            color: ['Red', 'Green', 'Blue'],
            fruit: ['Apple', 'Banana', 'Orange']
        };
        // The key prop lets React match items between renders; without it,
        // React warns about dynamically generated children
        var reactItems = items[this.props.type].map(function(item) {
            return <li key={item}>{item}</li>;
        });
        return (
            <ul>{reactItems}</ul>
        );
    }
});
var Selector = React.createClass({
    render: function() {
        return (
            <select onChange={this.handleSelection} value={this.props.selected}>
                <option value="color">Color</option>
                <option value="fruit">Fruit</option>
            </select>
        );
    },
    
    handleSelection: function(e) {
        this.props.onChange(e);
    }
});

var Container = React.createClass({ 
    getInitialState: function() {
        return {
            selected: 'fruit'
        };
    },
    
    render: function() {
        return (
            <div>
                <Selector
                  onChange={this.handleSelection} 
                  selected={this.state.selected} 
                />
                <List type={this.state.selected} />
            </div>
        );
    },
    
    handleSelection: function(e) {
        this.setState({selected: e.target.value});
    }
});

Note how we store the selected value not in Selector, but rather in the parent of Selector (Container), because Selector cannot communicate with List directly. Now, Container is the source of truth and both Selector and List only read from this value.

Component Lifecycle

React components can be thought of as state machines with many stages. React offers hooks into some of these stages, some of which we didn’t cover here, but it’s good to have the whole picture.

The following chart represents the sequence of function calls. The blue nodes at the top represent external actions, and the yellow nodes represent the methods that get called, in sequence, when each action happens:

Component lifecycle chart

More details at http://facebook.github.io/react/docs/component-specs.html
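
One way to see the order of these calls for a given component is to log from each hook. A sketch (the component name Probe is arbitrary):

var Probe = React.createClass({
    componentWillMount: function() { console.log('will mount'); },
    componentDidMount: function() { console.log('did mount'); },
    componentWillReceiveProps: function(nextProps) { console.log('will receive props'); },
    shouldComponentUpdate: function(nextProps, nextState) {
        console.log('should update?');
        return true;
    },
    componentWillUpdate: function() { console.log('will update'); },
    componentDidUpdate: function() { console.log('did update'); },
    componentWillUnmount: function() { console.log('will unmount'); },

    render: function() {
        console.log('render');
        return <div>probe</div>;
    }
});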

Conclusion

In this post we covered a number of React’s features through small examples.

React.js is a pretty neat framework and very fun to work with. I’m planning to study more advanced use cases, especially regarding exception handling and performance, and if I find anything interesting, I’ll write new posts.

References

[1] Github – React.js
[2] React.js Docs – Reusable Components
[3] React.js Blog – Autobind by Default
[4] React.js Docs – Reconciliation
[5] The Future of JavaScript MVC Frameworks
[6] React.js Docs – More about refs