The Binaryen Optimizer Goes Up To 4


Alon Zakai / June 2018

Emscripten runs the LLVM and Binaryen optimizers, e.g., emcc -O3 does:

LLVM IR ---→ LLVM -O3 ---→ wasm ---→ Binaryen -O3 ---→ final wasm

The Binaryen optimizer has been tuned mostly for this LLVM case, and shrinks the final wasm by 15%

But non-LLVM compilers to wasm are important too! Go, AssemblyScript, etc.


Wasm GC will let some compile-to-js languages become compile-to-wasm languages:

the same source code ---→ JavaScript
the same source code ---→ WebAssembly

Possibilities here are mostly non-LLVM: TypeScript, Elm, Reason, etc. — expect to see a lot of activity here!

To accelerate that, Binaryen helps write compilers to wasm:


  • Simple IR (optional basic blocks, etc.)
  • Handles wasm binary format details
  • Easy to use
  • Optimizes & minifies

We want to optimize all compilers' output well, but there is a lot of potential variety:

(call $report
 (loop $loop (result i32) ;; loop nested in call, odd...
  (br_if $loop
   (call $do-work)
  )
  (call $get-result) ;; value flows out through the loop
 )
)

LLVM wouldn't emit that, but another compiler might!

To help here Binaryen has a "flatten" pass:

(call $report
 (loop $loop (result i32)
  (br_if $loop
   (call $do-work)
  )
  (call $get-result)
 )
)
flatten
---→
(loop $loop
 (set_local $0
  (call $do-work)
 )
 (br_if $loop
  (get_local $0)
 )
 (set_local $1
  (call $get-result)
 )
)
(call $report
 (get_local $1)
)
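To make the transformation concrete, here is a minimal Python sketch of the idea behind flattening, applied to the same example. It is not Binaryen's implementation: the tuple-based IR, the names `operand` and `flatten_statement`, and the rule that every nested call produces a value are all assumptions made for illustration.

```python
from itertools import count

def operand(expr, out, fresh):
    """Append flat statements computing `expr` to `out`; return a leaf for its value, or None."""
    kind = expr[0]
    if kind in ("const", "get_local"):
        return expr  # already a leaf, nothing to hoist
    if kind == "call":
        _, target, *args = expr
        # Flatten arguments left to right so evaluation order is preserved.
        flat_args = [operand(arg, out, fresh) for arg in args]
        temp = next(fresh)
        out.append(("set_local", temp, ("call", target, *flat_args)))
        return ("get_local", temp)
    if kind == "br_if":
        _, label, condition = expr
        flat_condition = operand(condition, out, fresh)
        out.append(("br_if", label, flat_condition))
        return None
    if kind == "loop":
        _, label, *body = expr
        inner, result = [], None
        for child in body:
            result = operand(child, inner, fresh)
        # The loop no longer returns a value; its last child's value lives in a local.
        out.append(("loop", label, *inner))
        return result
    raise ValueError(f"unsupported node: {kind}")

def flatten_statement(expr):
    """Flatten one top-level statement of the toy IR into a list of flat statements."""
    out, fresh = [], count()
    if expr[0] == "call":
        # Assumption: a top-level call is used as a statement, so its result is not stored.
        _, target, *args = expr
        flat_args = [operand(arg, out, fresh) for arg in args]
        out.append(("call", target, *flat_args))
    else:
        operand(expr, out, fresh)
    return out

# The nested example from the slide, written as tuples.
nested = ("call", "$report",
          ("loop", "$loop",
           ("br_if", "$loop", ("call", "$do-work")),
           ("call", "$get-result")))

flat = flatten_statement(nested)
```

The sketch ignores types, blocks, and ifs; it only shows how nested values get hoisted into fresh locals so that the call to $report ends up after the loop.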

Flat IR is simpler, for example, no nested side effects: in the flattened output above, every call sits directly under a set_local or at the top level, and every operand is a plain get_local.

Binaryen Optimizer Design


  • Binaryen IR is almost identical to wasm, for effective minification
  • No separate "flat" and "full" IRs; flat IR is just a subset
  • A few passes depend on flat IR (when hard to write a general pass)
  • Most passes work on all IR — but may get a boost if flat
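The "flat IR is just a subset" point can be sketched as a predicate over one IR type, rather than a second IR. The Python below is a simplified approximation under assumptions: the tuple encoding and the names `is_leaf` and `is_flat` are hypothetical, and Binaryen's actual definition of flatness covers more node kinds than this toy does.

```python
LEAF_KINDS = {"const", "get_local"}

def is_leaf(expr):
    """A leaf has no children that could carry side effects."""
    return expr[0] in LEAF_KINDS

def is_flat(stmt):
    """Return True if `stmt` (a tuple in the toy IR) is in flat form."""
    kind = stmt[0]
    if kind == "loop":
        # Control flow holds statements only; each must be flat itself.
        return all(is_flat(child) for child in stmt[2:])
    if kind == "set_local":
        value = stmt[2]
        if is_leaf(value):
            return True
        # A call may appear as the stored value, but only with leaf operands.
        return value[0] == "call" and all(is_leaf(arg) for arg in value[2:])
    if kind == "call":
        return all(is_leaf(arg) for arg in stmt[2:])
    if kind == "br_if":
        return is_leaf(stmt[2])
    return False

# Same node kinds in both cases; only the shape differs.
nested = ("call", "$report",
          ("loop", "$loop",
           ("br_if", "$loop", ("call", "$do-work")),
           ("call", "$get-result")))

flat_loop = ("loop", "$loop",
             ("set_local", 0, ("call", "$do-work")),
             ("br_if", "$loop", ("get_local", 0)),
             ("set_local", 1, ("call", "$get-result")))
flat_call = ("call", "$report", ("get_local", 1))
```

A pass that needs flat IR can assume this shape; a general pass accepts both `nested` and the flat form, since they are built from the same nodes.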

That boost takes Binaryen up to 4:


-O4 = flatten + flat-only passes + -O3
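For trying this out, a small Python helper that builds a wasm-opt command line is sketched below. The `-O3`, `-O4`, and `-o` flags are wasm-opt's own; the helper name `wasm_opt_command` and the file names are hypothetical.

```python
def wasm_opt_command(infile, outfile, level=4):
    """Build an argument list that runs wasm-opt at the given -O level."""
    if level not in (0, 1, 2, 3, 4):
        raise ValueError(f"unsupported optimization level: {level}")
    return ["wasm-opt", f"-O{level}", infile, "-o", outfile]

# Hypothetical file names, for illustration only.
with_flatten = wasm_opt_command("input.wasm", "output.wasm", level=4)
without_flatten = wasm_opt_command("input.wasm", "output.wasm", level=3)
```

Either list can be passed to `subprocess.run(..., check=True)` when wasm-opt is on the PATH; comparing the two output sizes shows what the flattening step buys on a given module.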

Fully optimizes a bunch of testcases we noticed in AssemblyScript

Not much large non-LLVM code yet to test on, but on fuzz testcases -O4 shrinks an extra 20%

But compilation is also 3x slower, which shows the importance of the option to run without flattening!


Binaryen now goes up to 4

Maybe someday we'll go up to 11


That's it, thank you!