Emscripten & asm.js:
C++'s role in the modern web

Alon Zakai / @kripken

All major web browsers are written in C++

For the obvious reasons:
fast, familiar, library support

For the same reasons, people want to use C++ to write web content too, that is, websites

That's what this talk is about

The Web

Largest open platform in existence

Modern, standards-compliant websites are built using HTML, CSS, and JavaScript (JS)

No C++ there :(

What about non-standardized approaches (ActiveX, Flash/Alchemy, PNaCl/PPAPI)?

Plugins and proposals for entirely new technologies on the web have failed to reach significant adoption or standardization, for both technical and non-technical reasons

And plugins are on the way out (no plugins on iPhone/iPad, etc.)

This is a good trend - standardization is why websites work on both your laptop and your phone

...where does that leave C++, then?

Well, JavaScript is already standardized, so how about if we... compile C++ into that?

This has happened with many languages, in fact:

  • Java
  • C#
  • Python
  • New languages like TypeScript
  • etc.

Compiling to JavaScript?

JavaScript is a dynamic scripting language

  var x = 42;
  var y = "a string";
  var z = x + y; // z = "42a string"

  eval("z = z.substr(1, 2)"); // z = "2a"

  [1, "two", { three: 3 }].forEach(function(item) {
    if (typeof item === typeof z) console.log([z, item]);
  }); // emits ["2a", "two"]

Kind of a weird compiler target...

But from the developer's point of view, compiling to JavaScript can be very conventional!

First, a reminder of compiling to a native executable:

  // hello.cpp
  #include <iostream>
  int main() {
    std::cout << "hello, world!" << std::endl;
  }

  $ g++ hello.cpp -o a.out
  $ ./a.out
  hello, world!

Compiling to JavaScript using Emscripten:

  $ em++ hello.cpp -o a.html
  $ firefox a.html # or any other browser

[Live demo in the original slides: the compiled program's output, running in an iframe on the page]

emcc and em++ are drop-in replacements for a native C or C++ compiler; the workflow is almost identical

Open source (MIT license) LLVM-based C++ to JavaScript compiler

C++ ⇒ LLVM ⇒ Emscripten ⇒ JavaScript

Emscripten builds on the LLVM family of projects:

  • clang: C++ frontend
  • LLVM: optimizer
  • libc++: C++ standard library
  • libc++abi: low-level C++ support

Currently an out-of-tree fork of LLVM, but we hope to get upstream eventually

Other libraries

Hybrid libc: musl + parts written in JavaScript

Implementations of SDL, OpenGL, etc., using Web APIs
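
To make "parts written in JavaScript" concrete, here is a minimal sketch of a puts-like library function that reads a NUL-terminated C string out of the heap. The names HEAPU8, writeCString and jsPuts are hypothetical, chosen for illustration; this is not Emscripten's actual library code.

```javascript
// Hypothetical sketch: a libc-style function implemented in JavaScript.
var buffer = new ArrayBuffer(32768);
var HEAPU8 = new Uint8Array(buffer);

// Demo helper: store an ASCII string plus a NUL terminator at ptr.
function writeCString(str, ptr) {
  for (var i = 0; i < str.length; i++) {
    HEAPU8[ptr + i] = str.charCodeAt(i) & 0xff;
  }
  HEAPU8[ptr + str.length] = 0;
}

// puts-like function: compiled code passes a pointer, JS reads bytes up to NUL.
function jsPuts(ptr, out) {
  var text = "";
  while (ptr < HEAPU8.length && HEAPU8[ptr] !== 0) {
    text += String.fromCharCode(HEAPU8[ptr]);
    ptr++;
  }
  out(text); // for example console.log in a browser
  return text.length;
}

var lines = [];
writeCString("hello, world!", 64);
var written = jsPuts(64, function (s) { lines.push(s); });
```

The output callback is passed in so the same function can print to the console, a DOM element, or a test buffer.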

You might be curious at this point what the emitted code looks like...

  // C++
  int func(int *p) {
    int r = *p;
    return calc(r, r << 16);
  }

  // JavaScript
  function func(p) {
    var r = HEAP32[p >> 2];
    return calc(r, r << 16);
  }

Almost direct mapping in many cases
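
A runnable sketch of the same mapping, to show why the pointer is shifted: p is a byte address, while HEAP32 is indexed in 4-byte elements, so p >> 2 selects the element. The body of calc is not shown on the slide; the one below is a stand-in, assumed for illustration.

```javascript
var buffer = new ArrayBuffer(32768);
var HEAP32 = new Int32Array(buffer);

// Stand-in for calc(); its real body is not part of the slide.
function calc(a, b) {
  return (a + b) | 0;
}

function func(p) {
  var r = HEAP32[p >> 2]; // byte address -> index of a 4-byte element
  return calc(r, r << 16);
}

var p = 16;           // a 4-byte-aligned "pointer"
HEAP32[p >> 2] = 3;   // the equivalent of *p = 3 in C++
var result = func(p); // 3 + (3 << 16)
```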

Another example:

  float array[5000]; // C++
  int main() {
    for (int i = 0; i < 5000; ++i) {
      array[i] += 1.0f;
    }
  }

  var buffer = new ArrayBuffer(32768); // JavaScript
  var HEAPF32 = new Float32Array(buffer);
  function main() {
    var a = 0, b = 0;
    do {
      a = (8 + (b << 2)) | 0;
      HEAPF32[a >> 2] = +HEAPF32[a >> 2] + 1.0;
      b = (b + 1) | 0;
    } while ((b | 0) < 5000);
  }

This "style" of code is a subset of JS called asm.js, which we'll discuss more later

So that's what the code can look like. But there are some fundamental differences here...


Native: need to recompile for another CPU or OS

Web: a single build runs the same everywhere

Web: but a single build prevents some optimizations

Undefined Behavior

Native: has undefined behavior, and the compiler can use it to optimize

Web: no undefined behavior

  dev machine:  C++ ⇒ JS
  user machine: JS ⇒ Executable (NO undefined behavior)


Native: applications can use the system libs, access the local filesystem, etc.

Web: sandboxed, cannot see the machine it is running on

Web: applications must ship their own system libraries

Web: we "fake" a filesystem to make porting easy
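
One way to picture a "fake" filesystem is as a table of in-memory files behind open/read-style calls. The sketch below is only an illustration under that assumption: fsWriteFile, fsOpen and fsRead are hypothetical names, not Emscripten's API.

```javascript
// Hypothetical in-memory filesystem: paths map to byte arrays.
var files = Object.create(null);
var openFiles = []; // index = file descriptor

function fsWriteFile(path, text) {
  var bytes = new Uint8Array(text.length);
  for (var i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i) & 0xff;
  files[path] = bytes;
}

function fsOpen(path) {
  if (!(path in files)) return -1; // like open() failing
  openFiles.push({ data: files[path], pos: 0 });
  return openFiles.length - 1;
}

// Copy up to count bytes into the heap at ptr; return the number of bytes read.
function fsRead(fd, HEAPU8, ptr, count) {
  var f = openFiles[fd];
  var n = Math.min(count, f.data.length - f.pos);
  HEAPU8.set(f.data.subarray(f.pos, f.pos + n), ptr);
  f.pos += n;
  return n;
}

var HEAPU8 = new Uint8Array(new ArrayBuffer(32768));
fsWriteFile("/data.txt", "abc");
var fd = fsOpen("/data.txt");
var n = fsRead(fd, HEAPU8, 128, 10);
```

Ported code keeps calling what looks like a file API, while the data never leaves the page's own memory.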

JS sandboxing helps in some unexpected ways!

Remember that we implement C++ functions using JS functions:

  // Simple C++ function compiled to JavaScript
  function func(p) {
    var r = HEAP[p];
    return calc(r, r << 16);
  }

The JS call stack is managed, and unobservable/unmodifiable by executing code

Compiled C++ is therefore immune to some types of buffer overflow attacks
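
A small sketch of that point: an unchecked write can still clobber neighbouring data inside the heap, but the return address is not stored in the heap, so it cannot be overwritten this way. The addresses BUF and SECRET and the function unsafeFill are hypothetical.

```javascript
var buffer = new ArrayBuffer(32768);
var HEAP8 = new Int8Array(buffer);

var BUF = 8;     // hypothetical address of an 8-byte buffer
var SECRET = 16; // hypothetical address of the variable right after it

// Like an unchecked strcpy: writes len bytes starting at dest, no bounds check.
function unsafeFill(dest, len, value) {
  for (var i = 0; i < len; i++) {
    HEAP8[dest + i] = value;
  }
  return 1; // the return address lives in the VM, not in HEAP8
}

HEAP8[SECRET] = 42;
var returned = unsafeFill(BUF, 12, 0x41); // overflows the buffer by 4 bytes
var secretAfter = HEAP8[SECRET];          // adjacent heap data was overwritten
```

So data corruption inside the heap remains possible; what the overflow cannot do is redirect the return of unsafeFill.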

Numeric Types

Native: char, short, int, int64, float, double


We build for a 32-bit target, because not every 64-bit integer can be represented exactly as a double (but every 32-bit one can)
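
A quick check of the claim about doubles, followed by one possible way to handle an int64 anyway: as two 32-bit halves. The slides do not say how 64-bit values are represented, so add64 and the lo/hi layout are assumptions made for illustration.

```javascript
// Doubles represent every 32-bit integer exactly, but not every 64-bit one.
var big = Math.pow(2, 53);
var lossy = (big + 1 === big); // 2^53 + 1 is not representable as a double

// Assumed representation for illustration: an int64 as two int32 halves.
function add64(aLo, aHi, bLo, bHi) {
  var lo = (aLo >>> 0) + (bLo >>> 0); // at most 2^33 - 2, exact in a double
  var carry = lo > 0xffffffff ? 1 : 0;
  return { lo: lo | 0, hi: (aHi + bHi + carry) | 0 };
}

// 0x00000000ffffffff + 1 = 0x0000000100000000
var sum = add64(-1, 0, 1, 0);
```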

Perf Model

Native: C-style code maps closely to the CPU; higher-level C++ aspects can use RAII, etc., giving predictability

Web: virtual machine (VM), just-in-time (JIT) compilers with type profiling, garbage collection, etc.

But without good and predictable performance, this is pointless...

Historically, JS began as a slow interpreted language

Competition ⇒ type-specializing JITs

Those are very good at implicitly statically typed code

  function add(x, y) {
    x = x | 0;          // | 0  =>  int32
    y = y | 0;
    return (x + y) | 0; // int32 addition!
  }

That's what asm.js is: a subset of JavaScript where all the operations are clearly statically typed
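
A few more coercions in the same style, as a sketch of how the types become visible to the JIT. These are plain JavaScript functions written with asm.js-like annotations; they are not wrapped in a validated asm.js module, and the function names are illustrative.

```javascript
function addInt(x, y) {
  x = x | 0;
  y = y | 0;
  return (x + y) | 0; // wraps like 32-bit signed addition
}

function addDouble(x, y) {
  x = +x;             // unary plus => double
  y = +y;
  return +(x + y);
}

function mulInt(x, y) {
  x = x | 0;
  y = y | 0;
  return Math.imul(x, y) | 0; // 32-bit multiply that keeps the low bits exact
}

var wrapped = addInt(0x7fffffff, 1);    // INT32_MAX + 1 wraps around
var truncated = addInt(1.9, 2.9);       // arguments are truncated to 1 and 2
var product = mulInt(0x10001, 0x10001); // low 32 bits of 0x100020001
var sumD = addDouble(0.5, 0.25);
```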

Memory access

  var buffer = new ArrayBuffer(32768);
  var HEAP8 = new Int8Array(buffer);
  var HEAP16 = new Int16Array(buffer);
  var HEAP32 = new Int32Array(buffer);

  function mem_access() {
    return HEAP32[HEAP8[100] >> 2];
  }

Loads in C++ become reads from typed arrays in JS, which become loads in machine code
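
Because all the views share one buffer, a value stored at one width can be read back at another, just as with pointer casts in C++. A small sketch (the address 100 is arbitrary):

```javascript
var buffer = new ArrayBuffer(32768);
var HEAP8 = new Int8Array(buffer);
var HEAP16 = new Int16Array(buffer);
var HEAP32 = new Int32Array(buffer);

// Store a 32-bit value at byte address 100 ...
HEAP32[100 >> 2] = 0x01020304;

// ... and read the same memory back at other widths.
var bytes = [HEAP8[100], HEAP8[101], HEAP8[102], HEAP8[103]];
var halves = [HEAP16[100 >> 1], HEAP16[102 >> 1]];

// Byte order follows the host CPU; on little-endian machines bytes is [4, 3, 2, 1].
var littleEndian = bytes[0] === 4;
```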

Emscripten's memory representation/layout is identical to LLVM's, including aliasing, so we can use all of LLVM's optimizations

OK, we've just seen some encouraging things about speed, but earlier we saw some scary things too...


[Chart: performance / time. Source: awfy; lower numbers are better]

Overall, performance is around 50-67% of native speed, and still improving

Missing pieces remain, like SIMD, but work is underway in the standards bodies

Already fast enough for many applications, even performance sensitive ones like games

In fact, the game industry has been an early adopter of compiling C++ to JavaScript, using Emscripten:




Torque 2D






Products are shipping

Links to online demos from Unity:


Adoption and usage in production show that while JS is a weird compiler target, the results can be robust and reliable

One way we work towards that is fuzzing using csmith; we are not currently aware of any Emscripten-specific bugs

While there are differences between browsers, having a single build for all of them improves reliability

Emscripten supports practically all C++ features, because clang does

But exception handling isn't something we just get for free

Emscripten supports C++ exceptions... differently

  // C++
  void func() {
    try {
      // ... code that may throw
    } catch (Type T) {
      // ... handler
    }
  }

  // JS
  function func() {
    invoke(10); // call a function pointer, checking for throw
    var T = get_thrown();
    if (T) {
      if (can_handle(T, 400)) { // 400 -> typeid of Type
        // ... handler
      } else {
        // ... no matching handler here
      }
    }
  }
Here are those runtime functions:

  // JS
  function invoke(ptr) {
    __thrown__ = 0;
    try {
      // ... call the function that ptr refers to
    } catch (e) {
      __thrown__ = e;
    }
  }
  function can_handle(ptr, type) {
    // call into libc++abi internals
  }
  function do_throw(ptr) {
    throw ptr;
  }

We implement C++ exceptions using JS exceptions; the JS VM provides stack unwinding

Perf depends on the speed of JS exceptions
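
The pieces above can be put together into a runnable sketch. Everything not on the slides is an assumption made for illustration: the FUNCTION_TABLE array, the module-level thrown variable, rethrowing in the else branch, and a can_handle that treats the thrown value as its own typeid instead of calling into libc++abi.

```javascript
// Hypothetical names throughout; typeids are plain numbers in this sketch.
var FUNCTION_TABLE = [];
var thrown = 0;

function do_throw(ptr) {
  throw ptr; // ptr identifies the thrown C++ object
}

function invoke(index) {
  thrown = 0;
  try {
    FUNCTION_TABLE[index]();
  } catch (e) {
    thrown = e; // remember it instead of unwinding further
  }
}

function get_thrown() {
  return thrown;
}

// Stand-in for the libc++abi type check: here the "exception object" is
// simply its typeid, so matching is an equality test.
function can_handle(ptr, type) {
  return ptr === type;
}

FUNCTION_TABLE[10] = function () { do_throw(400); };
FUNCTION_TABLE[11] = function () { /* returns normally */ };

function func(index) {
  invoke(index);
  var T = get_thrown();
  if (T) {
    if (can_handle(T, 400)) { // 400 -> typeid of Type
      return "caught";
    } else {
      do_throw(T); // not ours: keep unwinding (assumed behavior)
    }
  }
  return "no exception";
}
```

Calling func(10) takes the handler path and func(11) the normal path; an exception with any other typeid propagates to the caller as an ordinary JS exception.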

We can compile C++ into JavaScript and run it on the web, in a fast and standards-compliant way

JavaScript is a weird - but fun! - compiler target

That's it! Questions?

will tweet link to slides @kripken

http://emscripten.org     http://asmjs.org

Back to Memory

Recall that we represent memory using a single flat array

Pointers are indexes into the array

  var buffer = new ArrayBuffer(32768);
  var HEAP8 = new Int8Array(buffer);
  function compiledCode(ptr) {
    HEAP8[ptr] = 12; // write to an address
    return HEAP8[ptr + 4]; // read from an address
  }

Which is basically how C and C++ see memory: a pointer can point anywhere in all of memory

But this is not how languages like JavaScript, C#, Java, Python etc. see memory

Each object or array in those languages lives in its own "space", which is bounds-checked. Pointers cannot point just anywhere; they are references to distinct objects

This isn't just an academic point!

  // same as before
  var buffer = new ArrayBuffer(32768);
  var HEAP8 = new Int8Array(buffer);
  function compiledCode(ptr) {
    HEAP8[ptr] = 12;
    return HEAP8[ptr + 4];
  }
  // a new function
  function getInput(array) {
    // array is a NORMAL JavaScript array
    // compiledCode cannot refer to it!
    // must *copy* into the HEAP
    var copy = malloc(array.length); // malloc: the compiled allocator
    HEAP8.set(array, copy);
    // 'copy' is now a pointer to a copy of 'array'
    return compiledCode(copy);
  }
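
A self-contained sketch of that copy step. The malloc here is a hypothetical bump allocator standing in for the compiled one, and sumBytes stands in for compiled code; neither is taken from Emscripten.

```javascript
var buffer = new ArrayBuffer(32768);
var HEAP8 = new Int8Array(buffer);

// Hypothetical bump allocator standing in for the compiled malloc.
var nextFree = 8;
function malloc(size) {
  var ptr = nextFree;
  nextFree = (nextFree + size + 7) & ~7; // keep 8-byte alignment
  return ptr;
}

// Stand-in for compiled code: sums `length` bytes starting at ptr.
function sumBytes(ptr, length) {
  var total = 0;
  for (var i = 0; i < length; i++) {
    total = (total + HEAP8[ptr + i]) | 0;
  }
  return total;
}

function sumArray(array) {
  var copy = malloc(array.length); // reserve space in the heap
  HEAP8.set(array, copy);          // copy the elements in
  return sumBytes(copy, array.length);
}

var input = [1, 2, 3, 4];
var total = sumArray(input);
input[0] = 100;           // changing the JS array afterwards ...
var unchanged = HEAP8[8]; // ... does not affect the heap copy
```

The copy costs time and memory on every call across the boundary, which is why the point is more than academic.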