CS 242: Assignment 5: WebAssembly

Due: Wednesday, October 30, 2019 at 11:59pm
Submission cutoff: Saturday, November 2, 2019 at 11:59pm

In this assignment, you will get exposure to the theory and practice of WebAssembly. You’ll write a WebAssembly program, design new WASM features, and reason about its security properties.

1: x86 to WASM (35%)

Below is a mystery function implemented in x86 assembly. This is a sanitized, semi-human-interpretable version of the gcc -O0 -S output for a corresponding C implementation of the mystery function. Your task is to translate this function into an equivalent WebAssembly program. The learning goal is to think about:

What is the difference between a stack model (WASM) and register model (x86) of computation?
What is the difference between structured control flow (WASM) and GOTO control flow (x86)?

  .globl  _mystery
_mystery:
  movl $0, -4(%rsp)
WHILE:
  addl $1, -4(%rsp)
  cmpl  $1, %edi
  je  ENDWHILE
IF:
  movl  %edi, %eax
  cltd
  movl  $2, %ecx
  idivl %ecx
  cmpl  $0, %edx
  jne ELSE
THEN:
  movl  %edi, %eax
  cltd
  movl  $2, %ecx
  idivl %ecx
  movl  %eax, %edi
  jmp ENDIF
ELSE:
  imull $3, %edi, %edi
  addl  $1, %edi
ENDIF:
  jmp WHILE
ENDWHILE:
  movl -4(%rsp), %eax
  retq

A few notes to refresh you on reading x86 assembly.

In the C calling convention, %edi is the first argument to a function.
Moves go from left to right, so e.g. movl %edi, %eax changes the value of %eax to be the value of %edi.
The idiv instruction divides the register edx:eax by the operand. The quotient is stored in eax and the remainder is stored in edx.
cldt is an alias for the cdq instruction, which sign extends eax into edx:eax (in preparation for a divide).
je means “jump if equal” and jne means “jump if not equal”.
The return value of a function lives in the %eax register.

We have provided you the assembly source file and a small C test harness in assign5/program/asm if you want to test out this function, or inspect it in GDB.

You should implement your WASM version in assign5/program/wasm/src/mystery.wat. We have provided you a Javascript test harness for running and testing your WASM implementation in the browser. To get started:

Install Node.js, either through a package manager or a direct download.
In the assign5/program/wasm directory, run npm install once to set up your dependencies.
In the same directory, run npm start to start a local web server. It should open a web page in your browser that contains the mystery function tests.

If you can pass all the tests in the browser, your code is probably fully correct.

Your main resources for writing WebAssembly are:

WebAssembly Specification: this document contains the most up-to-date semantics for all instructions in the language
Understanding WebAssembly text format: a tutorial on writing WebAssembly in the S-expression form
WebAssembly test suite: a large number of WebAssembly source files used to test WASM imlementations

Editor setup

You can find packages for WebAssembly syntax highlighting below:

Atom - https://atom.io/packages/language-webassembly
Sublime - https://github.com/bathos/wast-sublime-syntax
Vim - https://github.com/rhysd/vim-wasm
Emacs - https://github.com/devonsparks/wat-mode

Debugging WebAssembly

If you want to interactively debug your WebAssembly, the browser comes with a few native tools for debugging a running WASM program.

Chrome (recommended)

Open Developer Tools (View → Developer → Developer Tools).
Navigate to the Sources tab.
Refresh.
Find the code under wasm in the side bar. Each function is separated into its own source unit with a name like wasm-42020a46-0. There should only be one function to click on.
Like in GDB, you can set breakpoints that flag the debugger to suspend execution when hitting a certain instruction. To set a breakpoint, click the line number of the instruction.
Refresh the page to run the code and stop at that breakpoint.

You should see a debugging view like this with source on the left. The source view shows the progrma source with variable names stripped (all de Bruijn indices). On the right, you can see the current value of the global memory, the local variables, and the stack.
Using the buttons in the top right, you can continue execution, step into the next function call, and so on.

Firefox

Firefox doesn’t seem to show either the stack values or memory, but you can still use the debugger to step through and watch local variables change.

Open Developer Tools.
Navigate to the Debugger tab.
Refresh.
Find the WASM code under wasm:// in the side bar.
Set a breakpoint.
Refresh to run the code and stop at that breakpoint.

2: WASM extensions (45%)

In this series of problems, you will define the static and dynamic semantics for a variety of language features common to imperative languages. The goal is to extrapolate the semantic style we discussed in lecture to new concepts.

For each problem, you should define some number of typing rules and step rules in a way that captures the English specification, but does not violate progress and preservation. You do not need to provide an actual progress and preservation proof, but you should think in those terms when deciding whether your rules are sufficient.

Put each answer in problem 2 of the written portion in assign5/written/solution.tex.

2.1: `if` (15%)

First, we’re going to add an if statement to our language. The structure of the if is:

$\begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \wif{e_\msf{then}^*}{e_\msf{else}^*} \\ \end{alignat*}$

For example, in the program:

(i32.const 0)
((if
  (i32.const 2))
 (else
  (i32.const 3)))

The value at the top of the stack is the implicit condition for the if (here 0, representing false). This program would evaluate to $\{\wcs{:}~3\}$ . In general, your if should follow this specification:

An if must have an i32 at the top of the stack, which it uses to determine whether to enter the first or second branch.

Unlike a label, if statements do not have their own local stack, meaning each branch should affect the stack in the same way. Here a few examples of programs to demonstrate the point:

;; This program is VALID, since the effects of the if are externally visible
(i32.const 4)
(i32.const 1)
((if (i32.const 2)) (else (i32.const 3)))
(i32.add) ;; evalutes to 6

;; This program is INVALID, because the branches have different return types
(i32.const 1)
((if (i32.const 0)) (else))

2.2: `for` (15%)

Next, we will add the venerable for loop. Like in every language, our for loop consists of an initialization step, a termination condition, and a loop post-processing step.

$\begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \wfor{e_\msf{init}^*}{e_\msf{cond}^*}{e_\msf{post}^*}{e_\msf{body}^*} \end{alignat*}$

For example, this loop computes the sum of numbers from 1 to n.

(func $sum (param $n i32) (local $i i32) (local $count i32) (result i32)
  (for
    (init
      (i32.const 0) (set_local $count)
      (i32.const 1) (set_local $i))
    (cond
      (get_local $i)
      (get_local $n)
      (i32.gt))
    (post
      (get_local $i)
      (i32.const 1)
      (i32.add)
      (set_local $i))

    (get_local $count)
    (get_local $i)
    (i32.add)
    (set_local $count))

  (get_local $count))

The semantics of a for-loop are:

The loop instruction takes no inputs and produces no outputs, meaning it cannot affect the stack surrounding the loop.
The initialization $e_\msf{init}^*$ takes no inputs and produces no outputs. It should run once before the loop begins.
The condition $e_\msf{cond}^*$ $ takes no inputs and produces a single boolean (i32). At the start of the loop, if the condition is false, the loop should exit.
The body $e_\msf{body}^*$ runs after the condition. It takes no inputs and produces no outputs.
The post-processing $e_\msf{post}^*$ runs after the body, and also takes no inputs and produces no outputs.
It is up to you how to treat branches inside the body of a for-loop (so long as it satisfies P&P). You could, for example, say that:
- A branch acts like a continue, going to the next iteration of the loop.
- A branch acts like a break, exiting the loop.
- A for-loop has no branch semantics, and br branches to outside the loop.
- A for-loop disallows branches in the body.

2.3: `try`/`except` (15%)

Next, we will look at a particularly exciting kind of control flow: exceptions. Briefly, let’s review how labels and branches work in WASM. At compile time, block and loop instructions define a series of nested scopes, and a br instruction identifies a specific block to jump to. For example, in the program:

(block $a
  (block $a
    (br $a)))

;; or equivalently...
(block
  (block
    (br 0)))

The br breaks out of the first block, not the second. You should see this as lexically-scoped stack-jumping control flow. Meaning, this is a specific kind of control-flow that flows from the bottom of a stack to somewhere in the middle. And the location of where to jump is determined entirely at compile-time.

With that context in mind, we can see an exception as dynamically-scoped stack-jumping control flow. Meaning, instead of br we write raise, and where we jump to is determined by the runtime stack instead of a compile-time stack. Whoah! So for example, if we add these to our grammar:

$\begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \wtry{\star{e_\msf{try}}}{\star{e_\msf{raise}}} \qqamp \text{Try block} \\ \mid \qamp \wraise \qqamp \text{Exception raise} \end{alignat*}$

Then we can use an exception like this:

(module
  (func $main (local $result i32) (result i32)
    ((try
       (i32.const 1)
       (i32.const 0)
       (call $try_div)
       (set_local $result)
     (catch
       (set_local $result))))
    (get_local $result))

  (func $try_div (param $m i32) (param $n i32) (result i32)
    (get_local $n)
    (i32.const 0)
    (i32.eq)
    ((if
      (i32.const -1)
      (raise))
     (else
      (get_local $m)
      (get_local $n)
      (i32.div_s)))))

This snippet has a try_div function that raises an exception carrying the error number -1 if n == 0, returning m / n otherwise. Then in main, we can try the division, and catch the error if it fails. This is dynamically-scoped control flow because where the raise returns to is dependent on which try/catch instructions are on the runtime stack. The raise is clearly not in scope of the try/catch in main.

More formally, the behavior of try/catch should be:

A try block cannot affect the stack surrounding the try. $e_\msf{try}^*$ should take no arguments and return no values. $e_\msf{raise}^*$ should take one argument (the i32 error code with the raise) and return no values.
You are free to introduce additional administrative instructions if necessary.
If $e_\msf{try}^*$ finishes with no exceptions, then the entire try instruction should be removed.
A raise should take the top value on the stack as an error code, and exit all scopes (labels and frames) until reaching a try. Then the $e_\msf{raise}^*$ should be executed with the error code on top of the stack.
Branches and returns should propagate across try boundaries.
If a raise is never caught by a try, then execution should halt (the program has reached a value). We didn’t cover trap semantics closely in lecture, so we’ll provide this rule as a freebie:

$\ir{D-Raise-Val} {} {\val{\{\wci{:}~\wraise\}}}$

3: Control flow integrity (20%)

A common problem with languages like x86 assembly is security vulnerabilities. Because x86 permits arbitrary jumps, languages (like C) which use x86 and permit arbitrary memory access are prone to stack smashing and return-oriented programming. These kinds of control-flow vulnerabilities relate to a program’s control flow integrity: can its control-flow be changed from the expected control paths at the time of compilation? WebAssembly was specifically designed so some classes of memory and control flow attacks would be avoided (see the WASM Security document). In this problem, you will explore what we can learn about WASM’s safety from its formal semantics.

WASM’s main design decision for security was to separate the data stack from the control stack, e.g. a function’s return pointer does not exist in the same address space as local variables or the heap. Because WASM satisfies progress and preservation, it is impossible for an attacker to arbitrarily manipulate the value stack of a well-typed program, since that could cause undefined behavior. However, when C is compiled to WASM, buffer overflows are still possible, as our type safety theorem ensures nothing about the state of memory.

Formally, we can encode this idea into a threat model, or a situation describing the powers of an attacker. Let $m$ be a module, $\ctx = \{\wcf{:}~m.\wcf\}$ , and $C$ a configuration such that $\typeJC{C}{\tau^*}$ . Assume that $m$ has at least two functions, and function 0 is somehow private/secure/etc., meaning it should not be called. Assume that the instruction $(\wcall{0})$ does not appear anywhere in $C.\wci$ (including nested instructions). In our threat model, the attacker can arbitrarily change to memory. The attacker can generate a new configuration $C' = \{C~\with~\wcm{:}~n_\wcm'^*\}$ for any $n_\wcm'^*$ .

Under this threat model, we want to answer two questions:

Can $C'$ have undefined behavior? Formally, if $C'$ does not have undefined behavior, then we can prove $\typeJC{C'}{\tau^*}$ .
Can $C'$ call the private function? At any point during the execution of $C'$ , is it possible that function 0 is called?

3.1: Standard model (10%)

First, answer these two questions for the WASM subset presented in lecture. Your response does not need to be a formal proof, only a yes/no with an informal justification for your answer (max 1 paragraph per question). If you believe the answer is yes (has undefined behavior, or can call a private function), then you should provide a concrete WASM program along with a memory manipulation that would cause the undefined behavior/private function call.

3.2: Indirect function calls (10%)

Next, consider extending our WebAssembly subset with a new instruction call_indirect. An indirect function call is an instruction that decides which function to execute based on a value on the stack. Specifically, we add:

$\begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \windirect{\rho} \qqamp \text{Indirect function call} \\ \end{alignat*}$

And it has the following semantics:

$\ir{T-Indirect} {} {\typeJC{\windirect{(\tfun{\tau_1^*}{\tau_2})}}{\tfun{\tau_1^*, \msf{i32}}{\tau_2}}} \nl \ir{D-Indirect-Ok} {C.\wcmod.\wcf[i].\msf{params} = \tau_1^* \s C.\wcmod.\wcf[i].\msf{return} = \tau_2} {\wsteps {C~\with~\wci{:}~\windirect{(\tfun{\tau_1^*}{\tau_2})};~\wcs{:}~i} {C~\with~\wci{:}~\wcall{i}}} \nl \ir{D-Indirect-Fail-1} {C.\wcmod.\wcf[i].\msf{params} \neq \tau_1^*} {\wsteps {C~\with~\wci{:}~\windirect{(\tfun{\tau_1^*}{\tau_2})};~\wcs{:}~i} {C~\with~\wci{:}~\wtrapping}} \nl \ir{D-Indirect-Fail-2} {C.\wcmod.\wcf[i].\msf{return} \neq \tau_2} {\wsteps {C~\with~\wci{:}~\windirect{(\tfun{\tau_1^*}{\tau_2})};~\wcs{:}~i} {C~\with~\wci{:}~\wtrapping}}$

An indirect call instruction carries the expected type signature of the function to call. At compile time (T-Indirect), we assume that the indirect call has the same type as the provided function type, along with the i32 to indicate which function is being called. Then at run-time, we check that the requested function $i$ has the corresponding type, returning a trap if not. For example, here’s a valid WASM program in concrete syntax:

(module
  (func $a (param $x i32) (result i32)
    (get_local $x)
    (i32.const 1)
    (i32.add))

  (func $b (result i32)
    (i32.const 2)
    (i32.const 0)
    (call_indirect (param i32) (result i32))

The call_indirect is parameterized by a type $\tfun{\msf{i32}}{\msf{i32}}$ . To use the instruction, we give the 2 as the argument to the indirect function (parameter $x in function $a), and give 0 to indicate which function is being called (referring to $a).

Given our WASM language now contains indirect function calls, you should answer the two questions above: can an arbitrary memory change cause undefined behavior? And can the private function be invoked? Again, provide a short justification and sample attack if possible.

A note on the latter question: you should assume that before the memory is attacked, i.e. during the evaluation of the “normal” program $C$ , that $\wcall{0}$ would never appear at any point in the instruction stream, even due to an call_indirect.

Sections