CS 242: Final: Theorem Proving in Lean

In this course, I have presented you with very much a programmer’s view of programming languages. That is, understanding the value of theoretical computational constructs through the lens of practical application to the kinds of problems you’ll encounter in every-day (systems) programming tasks.

However, the origins of PL theory came much more so from the domain of math and logic. Alonzo Church, inventor of the lambda calculus, was a logician (computer science did not exist in his time). Some of the earliest applications of computing hardware were directed towards solving logic problems, like the Logic Theorist from 1950. Since then, the fields of automated theorem proving and interactive theorem proving has only grown in scope and substance. Today, we have dozens of tools and programming languages dedicated to the construction and verification of mathematical proofs, such as Coq, Isabelle, and Lean.

For this final, you will learn to use the Lean proof assistant in order to automate proofs of progress and totality for the arithmetic language we discussed in the second lecture. The learning goals are:

Learn about the fundamental relationship between functional programs and constructive logic proofs (i.e. the Curry-Howard correspondence).
Learn how proofs can be completely formalized from English into code.
Learn to use a programming language with an advanced type system by drawing on concepts from this course.

The output for this final will be a series of proofs. If your proof typechecks, it is guaranteed to be correct, and you will receive full credit for the proof. There are no hidden tests or written components. We will not provide any partial credit for proofs that do not typecheck.

Lean

For this final, we have chosen to use the Lean theorem prover, primarily because it has a great tutorial, and I found it pretty usable in practice.

Setup

Follow the instructions to download Lean. You should be able to run lean --version from the command line.
Setup editor integration for a supported editor. You MUST use a supported editor. You WILL NOT be able to complete this assignment otherwise.

Learning Lean

The big challenge of this final will be simultaneously trying to construct the proofs while learning the language. Self-directed learning is about maximizing productivity while in a state of semi-understanding. You will be overwhelmed and bewildered while you wade through an ocean of new syntax, foreign type rules, and inscrutable errors with no StackOverflow to help you. This is, in many ways, a suitable capstone experience for this course.

There are no explicit restrictions on how you approach the proofs and in what order. However, we encourage you to do a few exercises to incrementally build familiarity with the language before tackling the boss fight. We recommend at least problems 1, 4, and 7 from last year’s final.

In order to complete the proofs on the final, you will need to read at most the following sections from the Lean tutorial:

Section 2.1-2.4, 2.8
Section 3.1-3.4
Section 4.1, 4.2, 4.4
Section 5
Section 7.1-7.6

Arithmetic language

In the second lecture, we discussed a simple arithmetic language consisting of numbers and binary operators. Your task is to recreate our English proofs from that lecture as Lean code.

Language definition

In final/lean/program/ we have provided the starter for code this assignment. arith.lean contains the abstract syntax and operational semantics for the language, reproduced below.

inductive Binop | Add | Sub | Mul | Div

inductive Expr
| Num : ℕ → Expr
| Binop : Binop → Expr → Expr → Expr

inductive val : Expr → Prop
| DNum {n : ℕ} : val (Expr.Num n)

inductive steps : Expr → Expr → Prop
| DLeft (op : Binop) (el el' er : Expr) :
  steps el el' →
  steps
    (Expr.Binop op el er)
    (Expr.Binop op el' er)
| DRight (op : Binop) (el er er' : Expr) :
  steps er er' → val el →
  steps
    (Expr.Binop op el er)
    (Expr.Binop op el er')
| DOp (op: Binop) (nl nr : ℕ) :
  steps
    (Expr.Binop op (Expr.Num nl) (Expr.Num nr))
    (Expr.Num (apply_binop op nl nr))

First, we use the inductive keyword to define the syntax (grammar) as an inductive structure. Here are some examples of creating expressions:

example : Expr := Expr.Num 1
example : Expr := Expr.Binop Binop.Add (Expr.Num 1) (Expr.Num 2)

Then val and steps define the value and small-step judgments, respectively. We define each judgment as an inductive structure so we can perform inductive proofs on the judgments. The type of val is Expr → Prop, meaning val is a function that takes an expression and returns a proposition (i.e. a term that represents the proof that an expression is a value). Then, we translate each rule into the corresponding branch of the inductive structure. Consider these rules:

$\ir{D-Num} {} {\val{n}} \s \ir{D-Left} {\steps{e_L}{e_L'}} {\steps{e_L \oplus e_R}{e_L' \oplus e_R}}$

For each variable implicitly quantified in the rule, e.g. $n$ in D-Num and $e_L, e_L', e_R$ in D-Left, we add parameters to the rule (e.g. (op : Binop) (el el' er : Expr)). Then the implication is written as an arrow, e.g. the D-Left rule translates to:

steps el el' →
steps
  (Expr.Binop op el er)
  (Expr.Binop op el' er)

Because D-Num is unconditional, it does not have an implication.

Inversion (10%)

To warm up with the arithmetic language, your first task is to write a proof of a simple inversion lemma. In this language, for an expression $e$ , inversion says that if $\val{e}$ , then $e = n$ for some number $n$ . In Lean, we write this as:

lemma val_inversion : ∀ e : Expr, val e → (∃ n : ℕ, e = Expr.Num n)

The reference solution is 4 lines of code. I encourage you to write this using both terms and tactics to get a feel for both approaches.

Your final proof must not contains any sorry expressions or it will be immediately rejected.

Progress (40%)

Next, you are going to prove progress for arithmetic in progress.lean. Progress says that for all expressions $e$ , either $\val{e}$ or there exists an $e'$ such that $\steps{e}{e'}$ . Again we can formally write this in Lean as:

theorem progress : ∀ e : Expr, val e ∨ (∃ e', e ↦ e')

You will need to write this and all future proofs in tactic mode. (Not as a strict requirement, but I doubt you’ll be able to finish otherwise.)

The structure of this proof should be comparable to doing it by hand. So you should start by introducing the terms (intros), inducting on the variable of interest (induction e), then proceeding by cases (case Expr.Num { ... }). The reference solution is about 30 lines of code, although my first successful attempt was about 45 lines of code.

A few other tips:

As you’re developing your proof, continually make use of the proof context. Always be clear on what the current goal is and what facts are in scope.
Use show to change the goal, or have to introduce a new fact without changing the goal.
Use exact to provide a term that has the type of the goal, and assumption to find a term in the proof context for the goal.
Don’t be afraid to use the powerful simp tactic. For example, simp [*, val.DNat] attempts to apply every fact in context plus the val.DNat theorem to prove the current goal. This can save you a lot of work if it works!

Evaluation (10%)

In evals.lean, we have provided the following definition for the “ $\overset{*}{\mapsto}$ ” operator:

inductive evals : Expr → Expr → Prop
| Val (e : Expr) : evals e e
| Step (e e_step e_val: Expr) :
  (e ↦ e_step) → evals e_step e_val → evals e e_val

Formally, this defines the star-step relation (or “evals” operator) as the reflexive transitive closure of the small-step relation. That is, $\evals{e}{e}$ (reflexivity) and $\steps{e}{e'} \wedge \evals{e'}{e''} \implies \evals{e}{e''}$ (transitivity).

As detailed in Section 5.2, Lean has built-in tactics for dealing with reflexive and transitive relations. We can register evals with these tactics by proving the two properties:

@[refl]
lemma evals_refl (e : Expr) : e ↦* e := evals.Val e

@[trans]
lemma evals_trans (e1 e2 e3 : Expr) (h1 : e1 ↦* e2) (h2 : e2 ↦* e3)
  : e1 ↦* e3 :=
  begin
    induction h1,
    case evals.Val { assumption },
    case evals.Step {
      apply evals.Step,
      show Expr, from h1_e_step,
      assumption,
      exact h1_ih h2
    }
  end

To understand the transitivity proof, you should open the goal window and walk through the goal at each step of the proof. Once you understand the proof, your next task is to implement the transitive_left and transitive_right lemmas that relate evaluations of subexpressions to evaluates of whole expressions.

Specifically, transitive_left says that if $\evals{e_L}{e_L'}$ then $\evals{e_L \oplus e_R}{e_L' \oplus e_R}$ . In Lean, this is written as:

lemma transitive_left (b : Binop) (el el' er : Expr) :
  (el ↦* el') → (Expr.Binop b el er ↦* Expr.Binop b el' er)

Similarly, the transitive_right lemma mirrors the D-Right rule:

lemma transitive_right (b : Binop) (el er er' : Expr) :
  val el → (er ↦* er') → (Expr.Binop b el er ↦* Expr.Binop b el er')

Fill in the proofs for these two lemmas. The reference solution is about 10 lines of code per theorem. Once you get transitive_left done, then transitive_right should look almost exactly the same. You will find it useful to use the reflexivity and transitivity tactics.

Totality (40%)

Finally, with these lemmas in hand, we can prove the actual theorem we discussed in lecture: totality. Totality says that all expressions reduce to a value, i.e. for all expressions $e$ , there exists an $e'$ such that $\evals{e}{e'}$ and $\val{e'}$ . In Lean:

theorem totality : ∀ e : Expr, ∃ e' : Expr, val e' ∧ e ↦* e'

The reference solution is about 35 lines of code, although my initial attempt was closer to 60 lines.

Extra credit:

Each extra credit will earn you up to 15% bonus points. You cannot do both extensions to get 30% (unless you are a group of 3, then you must do at least one, and can do both to get 15% total extra credit).

Option 1: Proof golf

Any mathematician can appreciate the beauty of a small proof, just as a programmer may love a small program. Your task is to make your proofs of progress and totality as small as possible.

Your proof size will be measured by number of tokens, as computed by the Lean lexer. For example, apply (exists.elim h), would be six tokens, for ["apply", "(", "exists.elim", "h", ")", ","]. Spaces are not considered in tokenization.

To compute the number of tokens in a proof, you will need to install our fork of the Lean compiler. You will need CMake installed.

git clone https://github.com/stanford-cs242/lean
cd lean
mkdir -p build
cd build
cmake ../src
make

Then, to evaluate the size of a proof, copy the text between the begin and end of the relevant theorem to a new text file. Then inside the lean/build directory, run:

./shell/lean_token_count your_file.lean

And it will output the number of tokens. The script is a little fragile, so if it fails to tokenize please file a bug report on Campuswire.

Our proof of progress is 113 tokens, and totality is 158 tokens. You cannot define any additional lemmas, theorems, definitions, and so on outside the proof for this extra credit.

You can get extra credit by the following rubric:

Progress: <80 5%, <100 4%, <120 3%, <140 2%, <160 1%
Totality: <100 10%, <130 8%, <160 6%, <180 4%, <200 2%

Option 2: Variables

So far, I’ve done most of the hard work in mapping our math notation into Lean code to define steps, val, and so on. For this extra credit, you are going to take over that job by extending the arithmetic language with variables. Specifically, we will add to the syntax let bindings and variables:

$\begin{alignat*}{2} \msf{Expression}~e ::= \ldots \\ \qamp \msf{let}~e_\msf{var}~\msf{in}~e_\msf{body} \qqamp \text{Let binding} \\ \qamp \bar{i} \qqamp \text{Variable usage (de Bruijn indexed)} \end{alignat*}$

For example, the following is a valid arithmetic expression:

$1 + (\msf{let}~4~\msf{in}~\msf{let}~2~\msf{in}~\bar{1}-\bar{0})$

We use the bars here to distinguish numeric constants from de Bruijn indexed variables. This evalutes to $(1 + 4 - 2) = -1$ . Formally, we will use substitution semantics:

$\ir{D-Let} {} {\steps{\msf{let}~e_\msf{var}~\msf{in}~e_\msf{body}}{\subst{\bar{0}}{e_\msf{var}}{e_\msf{body}}}}$

Once we add variables, our old totality theorem no longer applies. The expression $1 + \bar{0}$ has a free variable, so it will get stuck. We need a type system to check for this issue.

$\ir{T-Num} {} {\ctx \vdash \validj{n}} \s \ir{T-Binop} {\ctx \vdash \validj{e_L} \s \ctx \vdash \validj{e_R}} {\ctx \vdash \validj{e_L \oplus e_R}} \nl \ir{T-Let} {\ctx \vdash \validj{e_\msf{var}} \s \ctx + 1 \vdash \validj{e_\msf{body}}} {\ctx \vdash \validj{\msf{let}~e_\msf{var}~\msf{in}~e_\msf{body}}} \s \ir{T-Var} {\bar{i} < \ctx} {\ctx \vdash \validj{\bar{i}}}$

Here, the context $\ctx$ represents the number of bindings active. With this judgment in hand, we can now write theorems about arithmetic with variables. For example, preservation:

Preservation: For all expressions $e$ , if $\steps{e}{e'}$ and $\nul \vdash \validj{e}$ then $\nul \vdash \validj{e'}$ .

For this extra credit, you should do the following, in order in the arith_var.lean file:

Add the new syntax to Expr.
Add the substitution rules for let and var to the subst judgment.
Add the operational semantics for let to the steps judgment.
Add the typing rules for let and var to the valid judgment.
Prove the preservation theorem.

You can prove preservation by any means necessary. Feel free to define any lemmas outside the main proof body. As a hint, we have given you the definition of a subst_preserves_valid theorem which says that if an expression is valid, then substituting into that expression preserves its validity under the correct conditions.

If your steps 1-4 are appropriate (faithful to the description above) and the preservation proof is correct, you will receive 15% extra credit. We will inspect your solution and remove some or all of the points if we feel you implemented the language extension incorrectly.

Submission

To submit your final, run make_submission.sh and upload the zip file to Gradescope. The autograder is running Lean 3.4.2, so make sure to test your submission against the autograder if your computer is running a different version. If your autograded results are inconsistent with your local testing, please post a note on Campuswire.

Sections