Assignment 4: Interpreter
Due: Wednesday, October 23, 2019 at 11:59pm
Submission cutoff: Saturday, October 26, 2019 at 11:59pm
If there’s one thing that a functional programming language is really good for, it’s writing more functional programming languages. It turns out that inductive data structures (syntax) and logic rules (semantics) have a fairly natural translation into a functional style. For example, think back to our arithmetic language from lecture 1.2:
We can translate the grammar into a series of recursive sum types:
type number = int
type binop = Plus  Minus  Times  Divide
type expr = Number of number  Binop of expr * binop * expr
(* "1 + 2 * 3" as an expression *)
let e : expr = Binop(Number 1, Plus, Binop(Number 2, Times, Number 3))
Values of the type expr
correspond to the abstract syntax tree of a program. The operational semantics then become a function that takes an expression, and output whether it steps or is a value.
type outcome = Step of expr  Val
let rec trystep (e : expr) : outcome =
match e with
 Number _ > Val
 Binop (el, binop, er) >
(match trystep el with
 Step el' > Step (Binop (el', binop, er))
 Val >
(match trystep er with
 Step er' > Step (Binop (el, binop, er'))
 Val >
(* This nonexhaustive match is provably safe by inversion on DNum *)
let (Number nl, Number nr) = (el, er) in
let op = match binop with
 Plus > ( + )
 Minus > (  )
 Times > ( * )
 Divide > ( / )
in
Step (Number (op nl nr))))
assert (trystep (Number 3) = Val);
assert (trystep (Binop (Number 1, Plus, Binop (Number 2, Times, Number 3)))
= Step (Binop (Number 1, Plus, Number 6)))
This example hopefully clarifies the distinction between and : the “plus” and “minus” symbols in the abstract syntax tree are distinct from the standard plus and minus operations in a theory of arithemtic, e.g. as embedded in the underlying OCaml semantics. Hence we need an explicit translation that maps
Plus > (+)
.
The sum of the syntax tree and the operational semantics defines an interpreter for the arithmetic language, or a program that can compute arithmetic programs. In building an interpreter, the learning goal is to deepen/check your understanding of PL theory by mechanizing it into OCaml. That is, we will turn operations like substitution and typechecking which we previously wrote by hand/Latex into executable algorithms.
In this assignment, you will implement an interpreter for the typed lambda calculus including every feature discussed in class so far, such as algebraic data types, recursive types, polymorphic types, and existential types.
Interpreter overview
In assign4/program
, we have provided you a skeleton of an interpreter. At a high level, the interpeter follows this pseudocode:
let filepath = "test.lam" in (* for example *)
let input : string = read_input filepath in
let e : Ast.Expr.t = Parser.parse input in
let tau : Ast.Type.t = Typecheck.typecheck e in
let e' : Ast.Expr.t = Interpreter.eval e in
Printf.printf "%s : %s > %s"
(Eval.to_string e) (Type.to_string tau) (Eval.to_string e')
Following the steps, in order:
 A lexer and parser, defined in
lexer.mll
andgrammar.mly
, transform an input string into a syntax tree, i.e. an expression of typeExpr.t
. The syntax tree for expressions and types is defined inast.mli
.  A typechecker, defined in
typecheck.ml
, computes the type of the expressione
.  An interpreter, defined in
interpret.ml
, evaluates the expressione
to a value.
We have already provided you a complete lexer and parser. At a highlevel, your goal is to implement the remaining parts of the interpreter: substitution, alphaequivalence, typechecking, and evaluation. Each component is explained in detail below, but I want to emphasize a methodology: do not build each interpreter component for the entire language in isolation.
Instead, you should build the interpreter in a testdriven, languagefeaturemajor fashion. For example, let’s say you want to start with the arithmetic subset of the typed lambda calculus.
 First, you should write a few test cases of programs that only use arithmetic, like
1
,1 + 2
,(1  3) * 4
, and so on.  Implement substitution, alphaequivalence, and typechecking for numbers and binary operators. Make sure that each of your examples typechecks if it should (or vice versa if it should not).
 Implement stepping for numbers and binary operators. Make sure that your examples step to the correct value and don’t enter a stuck state.
Once you’re satisfied that arithmetic works properly, then rinse and repeat for the remaining features. We recommend proceeding in this order:
 Arithmetic: Numbers, binary operators (already completed for you, just review the starter)
 Conditionals: Booleans, relational operators, AND/OR, if expressions
 Functions: lambdas, application, variables
 Product types: unit, pair, project
 Sum types: inject, case
 Fixpoints: fix
 Polymorphism: type functions and type application
 Recursive types: folding and unfolding
 Existential types: imports and exports
If you do not follow this methodology and then come to office hours asking generic questions like “why doesn’t my code work?”, we will likely tell you to simplify your development process.
Testing
In order to write test cases, you will need to know the concrete syntax of the typed lambda calculus language. For example, if you have a file test.lam
:
let n : num = 1 + 2 in
let t : num * num = (n, n + 1) in
t.L + t.R
You can build your interpreter by typing make
then running ./main.native test.lam v
to get:
Expr: ((fun (n : num) > ((fun (t : num * num) > t.L + t.R) (n, n + 1))) 1 + 2)
Type: num
Success: 7
We have also provided you a reference interpreter reference.sh
that you can run the same way as main.native
. If you are ever unsure about the semantics of a program, it should exactly match the implementation of the reference. If you think there’s a bug in the reference, please let us know!
Concrete syntax examples
Here are a few brief examples of the lambda calculus syntax. (Coincidentally, this entire program is a good holistic test case!)
(* Arithmetic, booleans, lets *)
let n : num = 1 + 2 in
let n : num = if (n > 1)  (n = 1) then n  1 else 0 in
(* Functions and pairs *)
let f : num > num = fun (n : num) > n + 1 in
let t : num * bool = (1, true) in
let n : num = (f (t.L)) in
(* Sums *)
let s : num + bool = inj false = R as num + bool in
let n : num = case s { L(n) > n + 1  R(b) > if b then 1 else 0 } in
(* Fixpoint *)
letrec fact : num > num = fun (n : num) >
if n = 0 then 1 else n * (fact (n  1))
in
let n : num = (fact 5) in
(* Polymorphic identity *)
let id : forall a . a > a = tyfun a > fun (x : a) > x in
let n : num = ((id [num]) 1) in
n
See the assign4/program/examples
directory for more complicated programs, particularly involving recursive and existential types. I’ve transcribed several of the examples from lecture into their corresponding text form.
A few things to note on writing lambda calculus programs:
 Our parser is not battletested, so it’s possible that the associativity or precedence may be counterintuitive. Always check the program parse for correctness. If something is off, you should be able to add parentheses until it works.
 Because you are not implementing type inference and our language does not support type aliases, you will have a lot of type annotations everywhere. This is sadly unavoidable.
 The
let
andletrec
constructs are not in the AST because they are desugared into function application and fixpoint as defined in lecture 3.1.
Test harness
Once you’ve written an example program, if you place it in the tests
directory, then you can run the script ./run_tests.py
to run your interpreter on each test and compare the result to a provided reference interpreter. Make sure your tests use the extension .lam
. By default, the test harness will output to your console, but if you wish to have it output to a file, you can use the outfile
option. There are other options too: for example, if you wish to run on only a subset of the tests, you can make a new directory containing only the tests you wish to run and change the test_dir
option. You can see a full listing of options through ./run_tests.py help
.
Note that the test harness requires to_debruijn
to be implemented correctly under ast_util.ml
to function correctly once you start supporting variables. See the “Alphaequivalence” section for details.
Debugging
To debug to your interpreter, the binary has a v
and vv
flag for verbose and extra verbose, respectively. The first flag will print out the parse and type of the input program, and the second flag will also print out a trace of the expression after each step.
If you want to add your own debugging information, you can use the verbose : unit > bool
and extra_verbose : unit > bool
functions to check if the flags are set.
Abstract syntax
The language’s abstract syntax is defined in ast.mli
. Briefly, a .mli
file (as opposed to .ml
) defines the interface for a module. So a reference in another file to Ast.something
can only reference something
if it exists in the .mli
interface.
For example, the Ast.Expr
module defines a type:
type binop = Add  Sub  Mul  Div
type t =
 Num of int
 Binop of {binop : binop; left : t; right : t}
...
For clarity, we use records instead of tuples for syntactic forms with multiple fields. For example, we can create and use expressions like:
let e : Expr.t = Binop {binop = Add; left = Num 1; right = Num 2} in
(* Mirror binary operators *)
match e with
 Num n > Num n
 Binop {binop; left; right} >
Binop {binop; left = right; right = left}
The algebraic data type mirrors the grammar notation we have used throughout the class. Below is our language’s complete grammar:
Typechecking
The first primary feature of the interpreter is the typechecker, which implements our typed lambda calculus’ static semantics. In typecheck.ml
, you will implement the function:
val typecheck_expr : Type.t String.Map.t > Expr.t > (Type.t, string) Result.t
typecheck_expr ctx e
takes a typing context and expression as input, returning a Ok(tau)
if the expression is is welltyped, and returning Err(error_string)
if it is not. The context ctx
corresponds to the term in the type rules, mapping variables (strings) to types.
Across the assignment, you will not be graded on the usefulness of your error messages, but we recommend good error messages as a debugging aid. We have provided a
Type.to_string
andExpr.to_string
that convert types and expressions to strings.
Example
We have already provided you with the typechecking rules for the number and binary operator cases. Reproduced below:
match e with
 Expr.Num _ > Ok Type.Num
 Expr.Binop {left; right; _} >
typecheck_expr ctx left >>= fun tau_left >
typecheck_expr ctx right >>= fun tau_right >
(match (tau_left, tau_right) with
 (Type.Num, Type.Num) > Ok Type.Num
 _ > Error ("informative error message"))
These rules correspond to the static semantic rules:
Carefully study the correspondence between the rules. Inductive conditions, for example, correspond to a recursive call to the typechecker. Note the use of the infix monadic bind operator, written as >>=
. Without it, the implementation looks like:
 Expr.Binop {left; right; _} >
(match typecheck_expr ctx left with
 Err s > Err s
 Ok tau_left >
(match typecheck_expr ctx right with
 Err s > Err s
 Ok tau_right >
(match (tau_left, tau_right) with
 (Type.Num, Type.Num) > Ok Type.Num
 _ > Error ("informative error message"))))
Notice the core structure of the logic. First, we typecheck the left side, and if the typecheck fails, then the overall check fails. Same thing for the right side, then finally check that the types are both numbers. This is a common pattern in code with sequences of operations, each of which can possibly fail.
The bind operator res >>= f
says: “if the result res
on left is Ok(tau)
, then return (f tau)
, else return the error without calling f
”. This allows us to linearize the code and avoid extra matches. We encourage you to adopt this pattern to avoid unnecessary boilerplate in your code.
Notes
A few more notes on implementing the typechecker:
 You should not check for type equality using the
=
operator. You should useAst_util.Type.aequiv
as detailed in the section on alphaequivalence below.  We have written a few simple tests at the bottom in
inline_tests
. You can uncommentlet () = inline_tests ()
to run them whenever you run the interpreter binary. You will find a similar pattern in each module. Feel free to write tests here, or use the externalrun_tests.py
harness.  The reference
typecheck_expr
implementation is about 200 lines long (with good error messages!).
Semantics
Below we reproduce the complete static semantics:
Evaluation
If an expression is welltyped, then it will be evaluated to a value using the eval
function in interpreter.ml
. eval
works by repeatedly computing trystep
on the current expression:
type outcome =
 Step of Expr.t
 Val
val trystep : Expr.t > outcome
Example
We have provided you all the Val
rules and the Binop
step rule for trystep
. For example:
match e with
 Expr.Binop {binop; left; right} >
(left, fun left' > Expr.Binop {left = left'; binop; right;}) > fun () >
(right, fun right' > Expr.Binop {right = right'; binop; left}) > fun () >
let (Expr.Num n1, Expr.Num n2) = (left, right) in
let f = match binop with
 Expr.Add > (+)
 Expr.Sub > ()
 Expr.Mul > ( * )
 Expr.Div > (/)
in
Step (Expr.Num (f n1 n2))
This expression says: first, step left
to a value. Then step right
to a value. Once they are both values, we can extract the inner numbers n1
and n2
, then combine then with the appropriate binary operation. Note that we can perform the nonexhaustive match on Expr.Num
because e
is welltyped, so we know that and by inversion, if then it must be a number.
The starter code uses a funky custom operator >
which is defined at the bottom of trystep
:
let rec trystep (e : Expr.t) : outcome =
...
and (>) ((e, hole) : Expr.t * (Expr.t > Expr.t)) (next : unit > outcome)
: outcome =
match (trystep e) with Step e' > Step (hole e')  Val > next ()
This operator captures the idea of reducing subexpressions to a value before proceeding. Specifically, if an expression e
steps to e'
, then it inserts e'
into a hole hole e'
, e.g. above:
fun left' > Expr.Binop {left = left'; binop; right}
Then if e
is a value, then operator calls next
. This contrasts with the longer and clunkier version:
 Expr.Binop {binop; left; right} >
(match trystep left with
 Step left' > Step (Expr.Binop {left = left'; binop; right})
 Val >
(match trystep right with
 Step right' > Step (Expr.Binop {right = right'; binop; left})
 Val >
let (Expr.Num n1, Expr.Num n2) = (left, right) in
let f = match binop with
 Expr.Add > (+)
 Expr.Sub > ()
 Expr.Mul > ( * )
 Expr.Div > (/)
in
Step (Expr.Num (f n1 n2) )
You do not have to use the >
, it’s solely for your convenience.
Notes
A few notes on trystep
:

It’s totally fine if you have nonexhaustive matches in your interpreter. Again, you can assume your expression is welltyped, meaning you can assume what kind of expression you’ll have at different points once it’s a value.

The reference solution of
trystep
is 90 lines of code. (Much less than the typechecker!)
Semantics
Below is the complete dynamic semantics for our language. It’s a lot of rules… but many of them are trivial values, or simple reduction orders. Hence the actual implementation is much shorter than for the typechecker.
Substitution
Once you are implementing typechecking and evaluation for functions/variables, you will need to have a substitution operation . In ast_util.ml
, you must implement substitution for both types and expressions, of the form:
val substitute : string > t > t > t
Here, t
is either Expr.t
or Type.t
, and substitute x e' e
maps to the form (in the case of expressions).
Recall that the main idea of substitution is replacing all free variables x
with the expression e'
in e
. The challenge in implementing substitution is that it must respect two properties:
 Variable shadowing: in lexical scoping, a variable is bound to its innermost identifier. For example, because the inner is bound to the lambda.
 Alphaequivalence: variables should not be substituted in a way that would accidentally bind an otherwise free variable. For example, where the inner argument is renamed to avoid a conflict.
Capturing both of these properties is actually quite nontrivial. One possibility is to use an abstract binding tree data structure that encodes lexically scoped variable semantics into the tree itself. Another is to use the semantics of the base language (OCaml) to encode the properties, i.e. higherorder abstract syntax.
We recommend this simple, slightly inelegant strategy:
 When substituting into a expression (or type), maintain a mapping from variables to expressions (or types). When performing , the mapping begins with .
 As you recurse through the expression, you may encounter a variable binding e.g. . Generate a “fresh” variable (one not used in the program) and add to the mapping. (We have provided you a such a
fresh
function.)  When you reach a variable , if is in the mapping, then replace with the corresponding value.
This strategy will eagerly rename every variable after each substitution, but it will ensure the two properties above. To get you started, we have provided a helper function substitute_map
that takes such a map as input.
Alphaequivalence
When typechecking, you will inevitably need to compute the equivalence of types. For example, checking function application will require comparing the argument’s type against the function’s argument type. However, consider the case where the types contain type variables, e.g. and . We would like to say that these are the same types, because values of these types can be used interchangeably—the choice of type variable name here has no semantic impact.
This idea is called alphaequivalence, or being equal up to a renaming of bound variables. For example:
You need to implement alphaequivalence for expressions and types in ast_util.ml
. The simplest way to implement alphaequivalence is to convert a given expression or type into a nameless form, i.e. using de Bruijn indices. The basic idea is to replace each variable name with a number representing the “depth” of its binding. For example, in the program:
let x : num = 1 in
let y : num + bool = inj 5 = L in num + bool in
case y {
L(z) > z + x
 R(z) > 9
}
Would translate into the nameless form:
let _ : num = 1 in
let _ : num + bool = inj 5 = L in num + bool in
case <0> {
L(_) > <0> + <2>
 R(_) > 9
}
When a variable is used, you count the number of bindings for any variable between the usage and the corresponding binding. For example, the distance from case y
to let y
is 0, while the distance from x
to let x
is 2 because of the z
and y
bindings.
We have already implement alpha equivalence for you in the aequiv
function. Your task is to implement the translation for expressions and types into de Bruijn indices in the to_debruijn
function:
val to_debruijn : t > t
Some notes:
 To replace a variable with its de Bruijn index, you can convert the index to a string, e.g.
Var(Int.to_string depth)
.  If an expression has free variables, you can leave them unchanged (i.e. do not replace with a number).
 All variable names should be replaced with the underscore
"_"
. InExpression.to_debruijn
, you should replace all types withAst.Type.Var "_"
.  The local test harness relies on your
to_debruijn
function to output expressions that it can compare with the reference solution in an alphaequivalent way. That means ifto_debruijn
function is incorrect, the test harness will not work as expected.
Submission and grading
For this assignment, our grading policy is that when you upload your assignment to Gradescope, you will receive your actual final grade for the submission. There are no private tests, but the Gradescope tests are still hidden, meaning you will only see coarse feedback (e.g. 8/10 correct for sum types). You should write tests both to check your code and also to understand the pragmatics of the typed lambda calculus in practice.
The full Gradescope test suite is under construction, but it will be completed by Friday evening.