JavaScriptCore (JSC) is the powerful JavaScript engine underlying WebKit. It is fast primarily because of the four-tier compilation pipeline at its heart. In this article I will share what I found when I peeked under the hood to see how this works.

JavaScriptCore does not directly execute the JavaScript text it is provided. A script is parsed into bytecode, and the original source text is then thrown away. It is from the bytecode representation that all the magic happens.

There are four compilation tiers in JavaScriptCore.

  • LLInt — the Low Level Interpreter
  • Baseline JIT — some quick wins to speed up the JavaScript
  • DFG JIT — the Data Flow Graph optimizing compiler
  • FTL JIT — the Fourth Tier LLVM (Low Level Virtual Machine) compiler

The first tier interprets the bytecode via the Low Level Interpreter (LLInt). This is intended to have no start-up cost — it gets JavaScript code running as soon as it’s parsed. The trade-off is that execution is relatively slow.

JavaScriptCore constantly watches for heavily used branches of code. As these are identified they are handed off to the higher tiers to be re-compiled with more aggressive optimizations. It is this piece of JavaScriptCore that I want to look at now.

Prerequisites

If you are following along, make sure you install the JavaScriptCore REPL so you can invoke it on the command line. I used the WebKit nightly build for all the examples in this post so that I had access to ES6. When you see jsc-nightly below, know that it is aliased to the nightly JSC runtime.
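
The alias itself is nothing special. Something along these lines works, although the path below is purely illustrative; point it at wherever the nightly’s jsc binary actually lives on your system.

# Illustrative path only; adjust to match where your nightly's jsc binary lives.
alias jsc-nightly="/Applications/WebKit.app/Contents/Frameworks/JavaScriptCore.framework/Resources/jsc"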

A Simple Script

I started with a small example script.

let square = (x) => {
  return x * x;
}

for (var i = 0; i < 10; i += 1) {
  square(i);
}

This script defines a function called square and calls it ten times in a loop. I saved this as square.js.

I then used JavaScriptCore to run this script and output profiling data by specifying the -p argument and giving it a filename. The profile data contains information on the generated bytecode and the compilation steps the code is run through. This is the command I executed.

jsc-nightly -p square.profile square.js

The JSON output is not formatted, so I used this to clean it up a bit.

cat square.profile | python -mjson.tool > square.formatted.profile
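
Any JSON pretty-printer works for this step; nothing about it is specific to JSC. If python on your machine points at Python 3, or you have jq installed, either of these does the same job:

python3 -m json.tool square.profile > square.formatted.profile
jq . square.profile > square.formatted.profile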

This resulted in neatly formatted and indented JSON that I could easily peruse in a text editor. Here is a simplified view of the JSON output.

{
    "bytecodes": [],
    "compilations": []
}

The bytecodes array contains a line-by-line description of the bytecode generated by the parser.

Notice compilations is empty. This is because the script was simple enough that the compilers were never triggered; it only ran in the LLInt. No compilation was performed.
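
As an aside, you do not strictly have to inflate loop counts to see the compilers fire. JSC exposes its tier-up thresholds as command-line options, so something like the sketch below should force earlier compilation even on a tiny script. The option names come from JSC’s Options.h; treat them as an assumption and verify against your own build, since option names change over time. For the rest of this post I stuck with larger loops.

# Option names taken from JSC's Options.h; verify against your build before relying on them.
jsc-nightly --thresholdForJITAfterWarmUp=10 --thresholdForOptimizeAfterWarmUp=20 -p square.profile square.js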

Next I wanted to try triggering some advanced compilation.

Triggering JIT

By modifying the script slightly I caused JavaScriptCore to trigger the second tier, the Baseline JIT compiler. The WebKit wiki says this tier “kicks in for functions that are invoked at least 6 times, or take a loop at least 100 times (or some combination - like 3 invocations with 50 loop iterations total)”.

On my old laptop I had to loop at least 34 times to trigger the JIT compiler. But this only compiled the square function. I had to loop about 500 times before I could get the for loop to compile as well.

let square = (x) => {
  return x * x;
}

for (var i = 0; i < 500; i += 1) {
  square(i);
}

jsc-nightly -p square.profile square.js
cat square.profile | python -mjson.tool > square.formatted.profile

Inspecting the profile, I saw the compilations array contained two entries, proving the Baseline JIT was triggered.

{
    "bytecodes": [],
    "compilations": [
        {
            "bytecodesID": 1,
            "compilationKind": "Baseline",
            
        },
        {
            "bytecodesID": 0,
            "compilationKind": "Baseline",
            
        }
    ]
}

DFG

The third tier offers much more advanced JavaScript compilation. It relies on the information from the previous tiers to inform its work. Again from the WebKit wiki we can see that the DFG “kicks in for functions that are invoked at least 60 times, or that took a loop at least 1,000 times. Again, these numbers are approximate and are subject to additional heuristics.”

Notice that it says at least 1,000 times. On my machine even 10,000 was not enough to trigger the DFG, so I bumped it up to 1 million loops.

let square = (x) => {
  return x * x;
}

for (var i = 0; i < 1000000; i += 1) {
  square(i);
}

jsc-nightly -p square.profile square.js
cat square.profile | python -mjson.tool > square.formatted.profile

The JSON now includes four entries in compilations. The first two are the Baseline JIT compilations seen before. The last two are DFG JIT compilations.

{
    "bytecodes": [],
    "compilations": [
        ,
        ,
        {
            "bytecodesID": 1,
            "compilationKind": "DFG",
            
        },
        {
            "bytecodesID": 0,
            "compilationKind": "DFG",
                
        }
    ]
}

Perhaps even more interesting is that both DFG compilations contain entries in their osrExitSites arrays.


"osrExitSites": [
    [
        "0x54a41d200297"
    ]
],

OSR stands for On Stack Replacement. This means that JavaScriptCore can swap out the previous Baseline JIT compilation with a new, faster DFG JIT version. However, each compilation tier produces a more rigid version of the code. In this case the DFG version was compiled for an integer argument. If I tried to pass in a string, the DFG-compiled code could not handle it, and JavaScriptCore would perform an OSR Exit. This means it would leave the DFG compilation and fall back to a previous tier that was able to handle the given argument type.
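
Here is a minimal sketch of that idea, reusing the square example. I did not profile this exact variant, so treat the comments as expectations rather than measurements.

let square = (x) => {
  return x * x;
}

// Warm up with integers so the DFG compiles a version specialized for ints.
for (var i = 0; i < 1000000; i += 1) {
  square(i);
}

// A string argument breaks the type assumptions the DFG baked in, so this call
// should trigger an OSR Exit back to a more general tier before it is handled.
square("4");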

This diagram captures the flow of code in JavaScriptCore. Code can make it into the DFG execution phase and fall back to the Baseline execution. This ability to bail out is necessary when providing fast, compiled versions of JavaScript, due to the language’s dynamic nature. As long as the DFG version receives the expected argument types it will execute. Otherwise it must fall back to the slower code that can accept dynamic types.

                                               ┌───────┐      
                                               │Compile│      
                                             ┌▶│  to   │─┐    
                                             │ │  DFG  │ │    
                                             │ └───────┘ │    
                                             │           ▼    
┌──────────┐  ┌───────┐  ┌──────────┐  ┌──────────┐─▶┌───────┐
│  Parse   │  │       │  │ Compile  │  │ Execute  │  │Execute│
│    to    │─▶│ LLInt │─▶│    to    │─▶│ Baseline │  │  DFG  │
│ bytecode │  │       │  │ Baseline │  │          │  │       │
└──────────┘  └───────┘  └──────────┘  └──────────┘◀─└───────┘

FTL JIT

The exact heuristics that cause JavaScriptCore to compile with each tier are nuanced and can be affected by variables like execution count, memory pressure, and time. The FTL requires tens or hundreds of thousands of loop iterations to kick in, and the compilation itself takes a lot of time to perform. So even once the FTL is triggered, the slower version will continue to be used until the FTL compilation completes, and it may never get enough time to complete. The WebKit wiki notes:

the FTL only kicks in for functions that run many times - 100,000 executions is typically required to ensure that the function is FTL-compiled. Because FTL compilation is queued up and done concurrently, for simple programs even 100,000 executions may not be enough to really trigger the FTL: the program may exit while the FTL compilation task is still queued or on-going
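
If you want to take the concurrency variable out of the picture while experimenting, JSC has an option for forcing compilation onto the main thread, so the program cannot exit while a compile is still queued. I am assuming here that your build exposes it as useConcurrentJIT (that is the spelling in current JSC sources); check Options.h in your checkout before relying on it.

# Assumes the option is named useConcurrentJIT in your build; check Options.h.
jsc-nightly --useConcurrentJIT=false -p square.profile square.js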

Using the simple square example from before and increasing the loop count to 1 billion still did not result in FTL JIT compilation. It simply executed too quickly. Instead I tried this slightly more complex bit of code borrowed from the WebKit blog.

var Class = {
  create: function() {
    function constructor() {
      this.initialize.apply(this, arguments);
    }
    return constructor;
  }
}

var Point = Class.create();
Point.prototype = {
  initialize: function(x, y) {
    this.x = x;
    this.y = y;
  }
};

for (var i = 0; i < 200000; i++) {
  var p = new Point(1, 2);
}

I saved this out as point.js and profiled as before.

jsc-nightly -p point.profile point.js
cat point.profile | python -mjson.tool > point.formatted.profile

In the tenth entry within the compilations array I finally found an example of FTL JITed code.

{
    "bytecodes": [],
    "compilations": [
        ,
        {
            "bytecodesID": 2,
            "compilationKind": "FTL",
            
        },
        
    ]
}

Inside the description there was the hint “Generated FTL JIT code for constructor…” so I could identify which piece of the JavaScript code was expensive enough to warrant the FTL.
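
If you do not fancy scrolling through the JSON by hand, a plain grep over the formatted profile, searching for the compilationKind field shown above, jumps straight to these entries:

grep -n '"compilationKind": "FTL"' point.formatted.profile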

It was about this time in my exploration of JavaScriptCore that I discovered a far more efficient way to examine a profile.

Making Sense of a Profile

I only discovered this tool in the midst of writing this article because I was exploring the JavaScriptCore source. The WebKit source contains a script named display-profiler-output, located in the source tree at Tools/Scripts. This simple, stand-alone Ruby script can help make sense of the profile output. After downloading it and installing the prerequisite gems I symlinked it into /usr/local/bin for easy access.

ln -s "$PWD/display-profiler-output" /usr/local/bin/jsc-display-profiler-output

I then ran the script…

jsc-display-profiler-output point.profile

This dropped me into an interactive environment where I could start exploring the profile output, and it revealed some good information right away. Here is the initial printout.

    CodeBlock      #Instr        Source Counts             Machine Counts      #Compil  Inlines  #Exits   Last Opts                                           Source
                              Base/DFG/FTL/FTLOSR       Base/DFG/FTL/FTLOSR            Src/Total         Get/Put/Call
constructor#CKmjq4   83       27674/72845/99449/0       27674/72845/99449/0       3       1/3       0       2/2/1     function constructor() { this.initialize.apply(this, arguments); }
initialize#CwBMGD    30       27673/72845/99449/0           27673/0/0/0           2       2/5       0       0/2/0     function (x, y) { this.x = x; this.y = y; }
 <global>#B9nPNK    260          199506/0/0/0               199506/0/0/0          4       0/0       0       3/4/2     'use strict'; /* ``` jsc-nightly -p point.profile point.js cat point.profile | python
  create#AuT3qS      15             0/0/0/0                   0/0/0/0             0       0/0       0        N/A      function () { function constructor() { this.initialize.apply(this, arguments); } retur

The “Source Counts” column shows each compiler that is available in JavaScriptCore and how many times each part of the code was run in each tier. (It is possible to build JavaScriptCore without some of the compilers, so they won’t always show up in a profile.)

Two things stood out as I studied the “Source Counts” column. First, the numbers don’t add up to the total loop count. The script looped 200,000 times, but the counts for the constructor (27,674 + 72,845 + 99,449) only add up to 199,968. It turns out this is because LLInt executions are not represented here. The remaining 200,000 - 199,968 = 32 calls to the constructor ran in the interpreter before the Baseline JIT took over.

Second, there is an extra “FTLOSR” column beyond the “FTL”. That confused me until I studied the WebKit introduction to the FTL JIT, where it discusses something called “hot-loop transfer”. (It’s still confusing, actually.) Even though the FTL has compiled the code, execution might not be reaching it through the normal entry point. The FTLOSR count is for a special kind of FTL compilation that lets already-running code enter the optimized version in the middle of a hot loop, rather than waiting for the next function call.

Lastly I want to note that the “CodeBlock” column contains names from our code along with some cryptic hashes. These are used to reference specific code blocks when getting information from the profile. For example, to access the profiling data for the constructor you can say either profiling constructor or profiling #CKmjq4. Both will print out the full profiling information for that code block.

This barely scratched the surface of the display-profiler-output script. But, seriously, this article is far too long as it is. Perhaps in another article I will explore its power more fully. If you are hacking on your own, try running help to see all the available options.

Further Reading

The following resources offer a comprehensive, albeit highly technical, overview of the JavaScriptCore framework and its four-tier compilation process.