Mozilla

WebAssembly

Nick Desaulniers - @LostOracle

Today's topics

  • What is it?
  • Why should I care?
  • Hasn't this been done before?
  • Does it allow us to run <favorite language> in the browser?
  • Why not use LLVM bytecode?
  • Where can I learn more?

...but first, a story about Instruction Set Architecture and processor design.

What is WASM?

A bytecode explicitly designed for portability, stability, compressibility, security, and minimal nondeterminism.

Why should I care?

You should care if you load large amounts of asm.js code or want to use a language other than JS in the browser.

Otherwise, WASM might not be of any interest to you, yet.

What's asm.js?

A type stable subset of JavaScript that can be compiled ahead of time by existing JS engines.
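
To make "type stable" concrete, here is a minimal sketch of the coercion idioms asm.js uses as type hints, written as TypeScript. The function names `add` and `half` are hypothetical, and the sketch leaves out the `"use asm"` prologue and the module wrapper that a real asm.js module needs, so an engine would treat it as ordinary JavaScript.

```typescript
// Sketch only: a real asm.js module wraps these functions in
// `function Module(stdlib, foreign, heap) { "use asm"; ... }`.

// `x | 0` marks a value as a 32-bit signed integer.
function add(a: number, b: number): number {
  a = a | 0;          // parameter hint: a is an int
  b = b | 0;          // parameter hint: b is an int
  return (a + b) | 0; // return hint: result wraps to int32
}

// Unary plus (`+x`) marks a value as a double.
function half(x: number): number {
  x = +x;             // parameter hint: x is a double
  return +(x / 2.0);  // return hint: result is a double
}

console.log(add(2, 3));
console.log(half(3));
```

Because every value is coerced on entry and on return, `add` only ever works with 32-bit integers and `half` only ever works with doubles, which is the property that lets an engine compile such code ahead of time.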

SpiderMonkey Architecture
Luke Wagner

The key strength of the web is safe execution of remote code.

Hasn't this been done before?

alternative browser runtimes

All prior attempts taught us things like:

Sandboxing is a good idea.

We want something that is easily integrated across all browsers.

How is WASM different?

Start Up Performance

30% smaller than gzipped asm.js

Can be parsed 23x faster than large asm.js modules

View Source

Like LLVM IR, you'll be able to convert binary to text.

No need for unary plus and bitwise or operators (asm.js type hints).

Text can be shown in dev tools, source maps can show original source.

No text representation today, still working on necessary opcodes.

Working with DOM APIs

Not replacing them.

Can we run other languages in the browser?

List of languages that compile to JS

Translation vs runtime

Two approaches:

  • Translate one language, statement per statement, to JS.
    • Prone to correctness bugs.
  • Compile runtime from C/C++ to JS with emscripten.
    • Binary bloat, GC coordination, ABI.
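
As an illustration of the correctness bugs that statement-per-statement translation invites, consider a hypothetical source language whose `int` multiplication wraps at 32 bits. The sketch below (function names `naiveMul` and `mul32` are made up for this example) compares a naive translation to JS with one that preserves the wrapping semantics.

```typescript
// Naive translation: multiply as doubles, then truncate to int32.
// For large operands the double product has already lost its low bits.
function naiveMul(a: number, b: number): number {
  return (a * b) | 0;
}

// Careful translation: split each operand into 16-bit halves so that no
// intermediate product exceeds what a double can represent exactly.
// (Engines that provide Math.imul implement the same operation natively.)
function mul32(a: number, b: number): number {
  const ah = (a >>> 16) & 0xffff;
  const al = a & 0xffff;
  const bh = (b >>> 16) & 0xffff;
  const bl = b & 0xffff;
  // The high*high term only affects bits above 32, so it is dropped.
  return (al * bl + (((ah * bl + al * bh) << 16) >>> 0)) | 0;
}

const big = 0x7fffffff;
console.log(naiveMul(big, big)); // low bits lost to double rounding
console.log(mul32(big, big));    // low 32 bits of the exact product
```

Both functions agree on small inputs, so a bug like this can survive casual testing and only show up with large operands.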

binary bloat?

GC coordination

We want to dynamically create DOM elements from a managed language and have its GC be in charge of lifetime.

Do you want ants?

Who cleans up around here?

"Supporting multiple VMs is a big infrastructural challenge ... You'll need to make sure that the different GCs can coordinate ... Otherwise you will either not have a story of object reclamation, or you'll have memory leaks ... Doing this right is hard ... See [my paper and another which found] >5% throughput regression." ~ Filip Pizlo

What's an ABI?

A contract ensuring that two binaries, possibly compiled by two different compilers at two different points in time, can interoperate.

ABI

Need to agree upon:

  • size of types (ILP32), padding of structs, alignment
  • calling convention (sending & retrieving values from function calls)
  • binary representation (Mach-O, ELF, PE32+)
  • function name mangling
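
The first item on that list can be sketched in code. The example below computes a struct layout under an assumed ILP32 model (int, long, and pointer are all 4 bytes) using the common rule that each field is aligned to its own size. The type table, `layoutStruct`, and the field names are hypothetical; a real platform's ABI document is the authority on the actual numbers.

```typescript
// Assumed ILP32 sizes and alignments in bytes (illustrative, not normative).
type Prim = "char" | "short" | "int" | "long" | "pointer";

const ILP32: Record<Prim, { size: number; align: number }> = {
  char: { size: 1, align: 1 },
  short: { size: 2, align: 2 },
  int: { size: 4, align: 4 },
  long: { size: 4, align: 4 },
  pointer: { size: 4, align: 4 },
};

interface Field { name: string; type: Prim }
interface Layout { offsets: Record<string, number>; size: number; align: number }

// Round n up to the next multiple of align.
function roundUp(n: number, align: number): number {
  return Math.ceil(n / align) * align;
}

// Place fields in declaration order, inserting padding before each field
// so it starts on its alignment boundary, then pad the struct's tail.
function layoutStruct(fields: Field[]): Layout {
  const offsets: Record<string, number> = {};
  let offset = 0;
  let structAlign = 1;
  for (const f of fields) {
    const { size, align } = ILP32[f.type];
    offset = roundUp(offset, align);
    offsets[f.name] = offset;
    offset += size;
    structAlign = Math.max(structAlign, align);
  }
  return { offsets, size: roundUp(offset, structAlign), align: structAlign };
}

const layout = layoutStruct([
  { name: "tag", type: "char" },
  { name: "count", type: "int" },
  { name: "flags", type: "short" },
]);
console.log(layout);
```

Under these assumptions the three fields hold only 7 bytes of data but the struct occupies 12. If two compilers disagreed on any of these rules, one would read `count` at the wrong offset, which is why the layout has to be part of the contract.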

Why not reuse LLVM bytecode?

llvm toolchain overview
rust plus llvm
llvm plus emscripten
llvm plus wasm

That's a lot of LLVM IR, let's just use that!

WASM                    LLVM IR
portable                has architecture-specific opcodes
stable                  breaking changes between versions
small encoding          not a design goal
fast decoding           not a design goal
fast compiling          not a design goal
minimal nondeterminism  nondeterminism and undefined behavior

"... LLVM IR is a poor system for building a Platform ... LLVM isn't actually a virtual machine. It's widely acknowledged that the name 'LLVM' is a historical artifact ... LLVM IR is a compiler IR." ~ Dan Gohman

WASM is more of an ISA than a compiler IR.

Where can I learn more?

github.com/WebAssembly/design

Thanks!

Nick Desaulniers - @LostOracle