The Binaryen Optimizer Goes Up To 4

Alon Zakai / June 2018

Emscripten runs the LLVM and Binaryen optimizers, e.g., emcc -O3 does:

LLVM IR → LLVM -O3 → wasm → Binaryen -O3 → final wasm

The Binaryen optimizer has been tuned mostly for this LLVM case, and shrinks the final wasm by 15%

But non-LLVM compilers to wasm are important too! Go, AssemblyScript, etc.

Wasm GC will let some compile-to-js languages become compile-to-wasm languages:

the same source code ---→ JavaScript
the same source code ---→ WebAssembly

Possibilities here are mostly non-LLVM: TypeScript, Elm, Reason, etc. — expect to see a lot of activity here!

To accelerate that, Binaryen helps write compilers to wasm:

Simple IR (optional basic blocks, etc.)
Handles wasm binary format details
Easy to use:
- JS bindings (binaryen.js), used by AssemblyScript
- C bindings, used by Asterius (Haskell)
Optimizes & minifies

We want to optimize all compilers' output well, but lots of potential variety:

(call $report
 (loop $loop (result i32) ;; loop nested in call, odd...
  (br_if $loop
   (call $do-work)
  )
  (call $get-result) ;; value flows out through the loop
 )
)

LLVM wouldn't emit that, but another compiler might!

To help here, Binaryen has a "flatten" pass:

(call $report
 (loop $loop (result i32)
  (br_if $loop
   (call $do-work)
  )
  (call $get-result)
 )
)

flatten ---→

(loop $loop
 (set_local $0
  (call $do-work)
 )
 (br_if $loop
  (get_local $0)
 )
 (set_local $1
  (call $get-result)
 )
)
(call $report
 (get_local $1)
)

Flat IR is simpler; for example, it has no nested side effects, as the flattened output above shows.
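To make the flatten transformation above concrete, here is a minimal Python sketch of the idea on a toy IR. It is not Binaryen's implementation: the tuple encoding, the names flatten and flatten_toplevel, and the rule that every operand is first stored in a fresh local are all assumptions made for illustration.

```python
from itertools import count

# Toy IR (an assumption for this sketch): nested tuples shaped like the
# s-expressions on the slides, e.g. ("call", "$report", <operands...>).

def flatten(expr, out, fresh, want_value):
    """Append flat statements for `expr` to `out`.

    Returns a ("get_local", i) node holding the value of `expr` when
    `want_value` is true, otherwise None.
    """
    kind = expr[0]
    if kind == "get_local":
        return expr
    if kind == "call":
        # Operands are flattened first, so side effects keep their order.
        args = [flatten(a, out, fresh, True) for a in expr[2:]]
        node = ("call", expr[1], *args)
    elif kind == "br_if":
        node = ("br_if", expr[1], flatten(expr[2], out, fresh, True))
    elif kind == "loop":
        body, inner, result = expr[2:], [], None
        for i, child in enumerate(body):
            is_last = i == len(body) - 1
            # Only the last child can supply the loop's value.
            result = flatten(child, inner, fresh, want_value and is_last)
        out.append(("loop", expr[1], *inner))
        return result
    else:
        raise ValueError(f"unknown node kind: {kind}")
    if want_value:
        # Store the value in a fresh local and hand back a read of it.
        index = next(fresh)
        out.append(("set_local", index, node))
        return ("get_local", index)
    out.append(node)
    return None

def flatten_toplevel(expr):
    """Flatten one top-level statement into a list of flat statements."""
    out = []
    flatten(expr, out, count(), False)
    return out

# The example from the slides: a loop nested inside a call.
nested = ("call", "$report",
          ("loop", "$loop",
           ("br_if", "$loop", ("call", "$do-work")),
           ("call", "$get-result")))

for statement in flatten_toplevel(nested):
    print(statement)
```

Under these assumptions the sketch produces the same shape as the slide: a loop containing two set_local statements and a br_if, followed by a call to $report that reads local 1.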

Binaryen Optimizer Design

Binaryen IR is almost identical to wasm, for effective minification
No separate "flat" and "full" IRs; flat IR is just a subset
A few passes depend on flat IR (when hard to write a general pass)
Most passes work on all IR — but may get a boost if flat

That boost takes Binaryen up to 4:

-O4 = flatten + flat-only passes + -O3

Fully optimizes a bunch of testcases we noticed in AssemblyScript

Not much large non-LLVM code yet to test on, but on fuzz testcases -O4 shrinks an extra 20%

But also 3x slower compilation — shows importance of option to run without flattening!

Binaryen now goes up to 4

Maybe someday we'll go up to 11

That's it, thank you!