Introduction

This tutorial is meant to introduce you to using Lean 4 for verifying short programs. It assumes a basic knowledge of functional programming and algebraic data types. It will also be useful, although not necessary, to have seen some basic Lean 4 proofs before reading this; consider going through chapters 1 and 2 of Mathematics in Lean [3]. Some simplifications are made to appeal to intuition; further reading is recommended at the end.

It is strongly advised that you attempt all exercises before moving to the next section.

Acknowledgements

This document was written in the context of planning a software verification course at Dartmouth College. Thank you to Professor Sergey Bratus for his guidance and support.

A more thorough list of sources is available at the end, but a big inspiration for this tutorial and many of the examples was Functional Programming in Lean [1]. This tutorial is meant as a quicker introduction with less focus on actual development; that book is a great resource for a more thorough path to learning Lean as a programming language.

Writing functions in Lean 4

We will start by writing some simple functions in Lean 4 that we can later analyze. For our first example, we will write a function that, given a natural number $n$, generates a list counting down from $n$ to $1$. Notice how it has a structure similar to what you would write in other functional languages:

def count_from (n : Nat) : List Nat :=
  match n with
  | 0 => []
  | k + 1 => n :: count_from k

Note: We can try some examples by writing #eval count_from 10 (or any other number), and placing our cursor at the end of the line. In most Lean IDEs, the pane on the right, which shows the state, will show the result.
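
For example, with the definition above, the following line reports the full countdown (the expected output is shown as a comment):

#eval count_from 10   -- [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]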

Let’s go through it line by line:

def count_from (n : Nat) : List Nat :=

We define a function with a single parameter, $n$, of type Nat. This type is defined in the Lean prelude as follows:

inductive Nat where
  | zero : Nat
  | succ (n : Nat) : Nat

As the first keyword indicates, this is an inductive type. This means it is a sum type that is also recursive: we define it by two constructors, which represent different “cases” (hence, a sum type). They follow the Peano definition of the natural numbers (starting from zero): a “base case” of zero, while every other natural is the successor of a natural number (hence, the recursion). The inductive keyword also gives us a clue as to how we are going to prove facts about inductive types.

The function returns a List Nat. If we look at the definition of List, we see it is also defined inductively:

inductive List (α : Type) where
  | nil : List α
  | cons (head : α) (tail : List α) : List α

This time, it takes a type parameter $\alpha$, and provides two constructors, nil for the empty list, and cons, which takes a head of our type $\alpha$, and another List α as a tail. Thus, in our case, we pass Nat as the type parameter to represent a list of natural numbers.

Moving to the next line,

match n with

We want to match our number $n$ against different cases, as is common in recursive functions. In particular, we know that the different cases correspond exactly to the constructors in the type definition (and Lean will check that we cover them). As the Nat type is very commonly used, we have some syntactic sugar to make our code nicer: we use the literal 0 instead of Nat.zero, and k + 1 instead of Nat.succ k. Additionally, the List type has [] for List.nil, head :: tail for List.cons head tail, and [x, y, z] for x :: (y :: (z :: [])). We also use xs ++ ys for the result of appending ys at the end of xs.
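
A quick way to convince yourself of these equivalences is with a few rfl checks; each of the following standalone examples states that the two sides are definitionally equal:

-- The notations above are only sugar for the underlying constructors:
example : [1, 2, 3] = 1 :: (2 :: (3 :: [])) := rfl
example : Nat.succ 3 = 3 + 1 := rfl
example : [1] ++ [2, 3] = [1, 2, 3] := rfl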

The first case is 0, which returns the empty list:

| 0 => []

(or, without the syntactic sugar, | Nat.zero => List.nil)

The second case is the recursive case, where $n = k + 1$ (or equivalently, Nat.succ k) for some natural $k$:

| k + 1 => n :: count_from k

Here, we return n prepended to count_from k. As $n = k + 1$, this means we use n as the head and [n - 1, n - 2, ..., 1] as the tail, which finishes our function definition.
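
To see the recursion in action, here is how a small call unfolds step by step (shown as comments), together with an #eval to check the final result:

-- count_from 3
--   = 3 :: count_from 2
--   = 3 :: (2 :: count_from 1)
--   = 3 :: (2 :: (1 :: count_from 0))
--   = 3 :: (2 :: (1 :: []))
--   = [3, 2, 1]
#eval count_from 3   -- [3, 2, 1]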

Note: Lean also requires that all recursive functions terminate. It is able to prove termination automatically in many cases (like this one), but since the general problem is undecidable, some definitions require the user to provide a termination measure. See the further readings [1] for more information.

Aside: Implicit parameters for types

Suppose we want to write a list reverse function. We might initially write it as follows [1]:

def reverse (l : List Nat) : List Nat :=
  match l with
  | [] => []
  | x :: xs => reverse xs ++ [x]

This is fine, but it only works for lists of Nats. If we want to expand this to other types of lists, we might first try to pass a type parameter explicitly:

def reverse (α : Type) (l : List α) : List α :=
  match l with
  | [] => []
  | x :: xs => reverse α xs ++ [x]

This is better, but it requires us to pass the type any time we want to run the function (e.g. #eval reverse Nat [1, 2, 3]). A nicer way to do it is to use an implicit parameter:

def reverse {α : Type} (l : List α) : List α :=
  match l with
  | [] => []
  | x :: xs => reverse xs ++ [x]

By using curly braces instead of parentheses, we tell Lean that $\alpha$ is an implicit parameter, meaning Lean will infer it automatically when the function is used (by checking the type of the list l). We can now call it like this: #eval reverse [1, 2, 3]
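
If Lean ever cannot infer the type (say, for an empty list, where there is no element to inspect), the implicit argument can still be supplied by name:

#eval reverse [1, 2, 3]       -- [3, 2, 1]
#eval reverse (α := Nat) []   -- []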

Exercises

  1. Write a (recursive) function sum_of_first_n, which takes a Nat $n$, and sums the naturals from $1$ to $n$, inclusive. Don’t use the closed formula!
  2. Write a my_length function, which takes a List and returns its length
  3. Write a my_append function, which takes two Lists and appends them
  4. (harder) Binary Trees [1]:
    1. Write a BinTree inductive type, with a type parameter $\alpha$, that represents a binary tree with elements of type $\alpha$.
    2. Additionally, write BinTree.mirror and BinTree.count functions, which mirror the tree (swap the left and right branches) and count the number of leaves, respectively (we will not go into detail, but this dot notation is similar to methods in OOP).

Verifying properties about functions

The next step, after writing our functions, will be to show that these are (in some sense of the word) correct. Ideally, we have some formal specification that we can show the function is equivalent to, but we will start by showing some properties about the functions.

For our first example, we will show that calling count_from n returns a list of length n. For this, we will state and prove a theorem that says exactly that:

open List (length)

theorem count_from_num (n : Nat) : length (count_from n) = n := by sorry

Before the theorem is stated, open List (length) “imports” the length function from the List namespace (this is so we can write length instead of List.length).

Reading it part by part, the first two words, theorem count_from_num, say that we will create a new theorem called count_from_num. This name is useful when using the theorem as an intermediate result in another proof. Next, we have (n : Nat). Roughly, we can read it as “given a natural number $n$, …”. Following that, we have : length (count_from n) = n. This is the interesting part, where we say what we actually want to state as a theorem: the length of count_from n is n. Finally, the proof follows as := by sorry. The sorry keyword is used to say that we don’t yet have a proof. It’s useful for intermediate lemmas we might want to use but don’t want to stop to prove yet. Most IDEs will show a warning at this line. The warning is deserved, as sorry can be used to “prove” any statement, even an obviously false one (try it out!).
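
For instance, Lean accepts the following with nothing more than a warning, even though the statement is clearly false (the name one_equals_two is arbitrary):

-- sorry accepts any goal, true or false, which is exactly why the warning matters
theorem one_equals_two : 1 = 2 := by sorry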

Note: The syntax for stating a theorem might look familiar, as the : is also used to express types. This is no coincidence, and is an example of a much more profound correspondence between propositions and types. See further readings [2] for more information!
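
As a tiny illustration of that correspondence, a proof in Lean is just a term whose type is the proposition it proves (the name four below is only for illustration):

-- A proof is a term whose type is the proposition it proves:
example : 2 + 2 = 4 := rfl
-- Compare with an ordinary definition, where 4 is a term whose type is Nat:
def four : Nat := 4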

Let us now do an actual proof. For this, it is highly recommended you try it in a live Lean 4 environment and step through it line by line. As you place your cursor on each line, the proof state (which in most environments should be in the right pane of your screen) will show the hypotheses we have and the proof goal(s) we want to show.

As we did before, let us first look at the whole proof, and then look at each part:

open List (length)

theorem count_from_num (n : Nat) : length (count_from n) = n := by
  induction n with
  | zero =>
      unfold count_from
      unfold length
      rfl
  | succ p ih =>
      simp [count_from, ih]

The first line remains the same, except for getting rid of the sorry keyword. The second line:

induction n with

States that we will use mathematical induction on our natural number n. Notice how the structure of the proof by cases matches the structure of the definition of natural numbers, and the structure of our recursive function. The first case, once more, is the zero case (the base case of the induction):

| zero =>

If we place our cursor after the =>, we see that our proof state looks like this:

case zero
⊢ (count_from 0).length = 0

We do not have any hypotheses, and a single proof goal: (count_from 0).length = 0. This means we want to show that if we provide 0 to count_from and apply length to the resulting list, we get 0 ((count_from 0).length is the same as length (count_from 0)).

Now, we specify a series of tactics. These are tools that provide automations to modify our proof state until the proof is finished. There are lots of tactics; we will only look at a few, but do look at the further readings for more. Looking at the first tactic:

unfold count_from

The unfold tactic replaces the use of a function in the proof goal by its definition. In this case, we unfold count_from, which replaces the count_from 0 in the proof goal. If we place our cursor after it, we see that the proof goal is now:

case zero
⊢ [].length = 0

This is nicer, as we now have to show something that is independent of our count_from function. In fact, the next line:

unfold length

Does the same for the length function, and takes us to a trivial proof state:

case zero
⊢ 0 = 0

Great! We can now finish by calling the rfl tactic, which stands for reflexivity, and solves goals of the form x = x. In some cases, the ac_rfl tactic might be useful, which proves equality up to associativity and commutativity.
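
As a quick standalone illustration (not part of our proof), rfl closes the first goal below by computation, and ac_rfl closes the second by also reordering the sum; the latter relies on the associativity and commutativity instances that recent Lean versions provide for Nat addition:

example : 2 + 2 = 4 := by rfl
example (a b c : Nat) : a + b + c = c + (b + a) := by ac_rfl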

Next, we need to show the inductive step:

| succ p ih =>

This points to the succ p case, where $n = p + 1$ (binding p to be the predecessor of $n$). We also bind ih, which refers to the inductive hypothesis. If we look at the proof state now, we have:

case succ
p : ℕ
ih : (count_from p).length = p
⊢ (count_from (p + 1)).length = p + 1

We have two hypotheses, $p \in \mathbb{N}$ (meaning p is of type Nat), and ih, which states that (count_from p).length = p, and is our inductive hypothesis. From these, we want to show the next case: (count_from (p + 1)).length = p + 1. While we could again use unfold to solve step by step, Lean also provides some more powerful tactics with greater automation. In this case, we will use the simp tactic, which simplifies our goal iteratively using different lemmas:

simp [count_from, ih]

The square brackets tell Lean to use count_from and ih to simplify our goal. In fact, this is enough for Lean to finish our proof entirely, as it can use other properties of lists, lengths, and natural numbers to automatically find a proof.
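
For comparison, here is a sketch of a fully manual proof using unfold and rw instead of simp (the name count_from_num_manual and the goal comments are only for illustration; step through it to watch the goal change at each line):

theorem count_from_num_manual (n : Nat) : length (count_from n) = n := by
  induction n with
  | zero =>
      unfold count_from
      unfold length
      rfl
  | succ p ih =>
      unfold count_from   -- goal: ((p + 1) :: count_from p).length = p + 1
      unfold length       -- goal: (count_from p).length + 1 = p + 1
      rw [ih]             -- closes the goal, as both sides become p + 1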

We could also use simp for the base case in order to shorten it:

theorem count_from_num_short (n : Nat) : length (count_from n) = n := by
  induction n with
  | zero => simp [count_from, length]
  | succ p ih =>
    simp [count_from, length, ih]

Or even shorter, as both cases have the same structure,

theorem count_from_num_short (n : Nat) : length (count_from n) = n := by
  induction n <;> simp [count_from, length, *]

The <;> combinator applies the tactic on the right to every goal produced by the tactic on the left. As we no longer bind the ih in the inductive case, we use *, which in Lean refers to all the hypotheses in our proof state.

Note: We can also apply some tactics to hypotheses. For example, simp at ih simplifies our inductive hypothesis. We can also write something like simp [*] at *, which uses all hypotheses to simplify all hypotheses and the proof goal. Other tactics use the same syntax. One that is particularly useful is rw, which uses the given lemmas or hypotheses in the square brackets to rewrite the proof goal, or, if used with at ..., to rewrite one or more hypotheses.
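
As a couple of small standalone examples of this syntax (the hypothesis names are arbitrary):

-- simp at h simplifies the hypothesis h (here, n + 0 becomes n):
example (n : Nat) (h : n + 0 = 5) : n = 5 := by
  simp at h
  exact h

-- rw [h] rewrites the goal using the equation h, replacing a with b:
example (a b : Nat) (h : a = b) : a + 1 = b + 1 := by
  rw [h]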

Exercises

Note: Some of the exercises here prove facts about functions in the exercises from the previous section.

  1. Show that reversing a list twice gives back the original list. The theorem statement might look something like this:

theorem reverse_twice_is_id {α : Type} (xs : List α) :
  reverse (reverse xs) = xs := by sorry

  2. (harder) Show that reversing a list preserves its length. The theorem statement might look something like this:

theorem reverse_preserves_length {α : Type} (l : List α) :
  length (reverse l) = length l := by sorry

Hint: You might want to show an intermediate lemma for reverse (xs ++ [x]) = x :: reverse xs

  3. Show that appending a list of length $m$ to a list of length $n$ results in a list of length $n + m$. The theorem statement might look something like this (replace append with my_append):

theorem append_length {α : Type} (xs : List α) (ys : List α) :
  length (append xs ys) = length xs + length ys := by sorry

  4. Show that the recursive sum_of_first_n function matches the closed formula ($\frac{n(n+1)}{2}$). This requires using the calc mode taught in Mathematics in Lean, so skip this if you haven’t read the first two chapters (or better yet, go read them!). The theorem statement might look something like this:

theorem closed_eq_sum_of_first_n (n : Nat) :
  2 * sum_of_first_n n = n * (n + 1) := by sorry

  5. (harder) Using the BinTree, show that mirroring preserves count [1]. The theorem statement might look something like this:

theorem BinTree.mirror_count {α : Type} (t : BinTree α) :
  t.mirror.count = t.count := by sorry

Showing function equivalence

Another important thing we might want to show is that two functions are equivalent. Often, the easiest way to write a function, which also tends to be the easiest to prove properties about in functional languages, is not necessarily the most efficient. Hence, there are cases where we might want to write two versions of a function, a simple one and an optimized one, and show they are equivalent. This allows us to reason about the optimized function by proving properties of the simpler one.

For example, recall our reverse function from before:

def reverse {α : Type} (l : List α) : List α :=
  match l with
  | [] => []
  | x :: xs => reverse xs ++ [x]

While this function is simple and easy to reason about, if we look at the recursive branch, notice that it has to perform the append operation (++) after the recursive call. This means our stack will grow linearly with the length of the list, as the Lean compiler won’t be able to do tail call optimization. In order to fix this, we can move our recursive call to the last step (tail) of our execution [1]:

def tail_reverseHelper {α : Type} (soFar : List α) : List α → List α
  | [] => soFar
  | x :: xs => tail_reverseHelper (x :: soFar) xs

def tail_reverse {α : Type} (xs : List α) : List α :=
  tail_reverseHelper [] xs

Looking at this line:

| x :: xs => tail_reverseHelper (x :: soFar) xs

We can see that the recursive call is the last thing the function does, and thus, Lean is able to use tail call optimization. However, this function is (although in this case, not by much) more complex than the original. Moreover, we already showed some properties about the original function that we do not want to prove again. Thus, we will prove they are equivalent. We can state that as a theorem as follows [1]:

theorem non_tail_reverse_eq_tail_reverse :
    @reverse = @tail_reverse := by sorry

As the functions are polymorphic, we need to use @ (which makes implicit arguments explicit) to stop Lean from trying to infer what type to use for $\alpha$. To prove this, we will use an intermediate lemma:

lemma non_tail_reverse_eq_helper_accum {α : Type} (xs : List α) :
  (soFar : List α) -> tail_reverseHelper soFar xs = (reverse xs) ++ soFar := by sorry

The lemma keyword is essentially the same as theorem, but it helps indicate that this is an intermediate result we will use in our main proof. The reason we prove this lemma instead of proving the theorem directly is that the natural inductive proof otherwise gets stuck in the inductive step: the inductive hypothesis would only talk about the empty accumulator, while the recursive call uses a bigger one, so we generalize over soFar. To see this in more detail, check the Proving Equivalence chapter of Functional Programming in Lean (which contains this example as one of the exercises!).

The (soFar : List α) -> ... notation can be read as “for all lists soFar of type List α, it is true that…”. The rest of the lemma states that calling the helper function with accumulator soFar and list xs (which is what the helper function does in the recursive step) is the same as reversing xs and appending soFar at the end.

Let us now look at the proof:

lemma non_tail_reverse_eq_helper_accum {α : Type} (xs : List α) :
  (soFar : List α) -> tail_reverseHelper soFar xs = (reverse xs) ++ soFar := by
  induction xs with
  | nil => simp [tail_reverseHelper, reverse]
  | cons y ys ih =>
      intro soFar
      simp [reverse]
      exact ih (y :: soFar)

Going line by line, we start by stating that we’re doing induction on xs:

induction xs with

The base case can be solved by simply calling simp with the right functions:

| nil => simp [tail_reverseHelper, reverse]

For the inductive step,

| cons y ys ih =>

Our proof state looks like this:

case cons
α : Type
y : α
ys : List α
ih : ∀ (soFar : List α), tail_reverseHelper soFar ys = reverse ys ++ soFar
⊢ ∀ (soFar : List α), tail_reverseHelper soFar (y :: ys) = reverse (y :: ys) ++ soFar

Notice that our proof goal is a “for all” statement. As is natural with pen-and-paper proofs, we start by fixing an arbitrary instance (which we also call soFar, though we could have given it some other name) with the intro tactic (we introduce a list soFar):

intro soFar

This adds a new hypothesis and changes the goal of our proof state, so we no longer have a for all:

soFar : List α
⊢ tail_reverseHelper soFar (y :: ys) = reverse (y :: ys) ++ soFar

Next, we call the simp tactic with the reverse function to simplify our proof state:

simp [reverse]

Which turns our proof goal into this:

⊢ tail_reverseHelper soFar (y :: ys) = reverse ys ++ y :: soFar

Finally, we can use the exact tactic to finish off the proof. This tells Lean that our proof goal matches a certain hypothesis. In this case, we have:

exact ih (y :: soFar)

This states that our proof goal corresponds to ih applied to (y :: soFar). Looking at ih, this corresponds to the following proposition:

tail_reverseHelper (y :: soFar) ys = reverse ys ++ y :: soFar

The right-hand side matches our proof goal exactly, while the left-hand side is what we get after applying tail_reverseHelper one step (which Lean is able to recognize automatically, as the two are definitionally equal).
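
We can check that one-step unfolding directly. The following standalone example is accepted with rfl, meaning the two sides are definitionally equal:

example {α : Type} (soFar : List α) (y : α) (ys : List α) :
    tail_reverseHelper soFar (y :: ys) = tail_reverseHelper (y :: soFar) ys := rfl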

Finally, we can use this to show the equivalence theorem:

theorem non_tail_reverse_eq_tail_reverse :
    @reverse = @tail_reverse := by
  funext α xs
  simp [tail_reverse, non_tail_reverse_eq_helper_accum]

The first tactic,

funext α xs

Is the functional extensionality tactic, which uses the fact that we can show two functions are equal by showing they return the same result on all possible inputs. We name the introduced type $\alpha$ and the introduced input xs.
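
As a standalone illustration (separate from our equivalence proof), funext lets us show that two syntactically different functions are equal by comparing them pointwise:

-- After funext n, the goal becomes n + 0 = n, which simp closes:
example : (fun n : Nat => n + 0) = (fun n : Nat => n) := by
  funext n
  simp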

Back in our theorem, this gives the following proof state:

case h.h
α : Type
xs : List α
⊢ reverse xs = tail_reverse xs

A step-by-step proof would probably first unfold tail_reverse, and then use the lemma to show that the recursive step matches in both functions. However, the simp tactic is smart enough to do that on its own, so we simply use:

simp [tail_reverse, non_tail_reverse_eq_helper_accum]

To finish off the proof. Now, we can use this to show that the properties we proved for reverse also apply to tail_reverse:

theorem tail_reverse_preserves_length {α : Type} (l : List α)  
  : length (tail_reverse l) = length l := by rw [<- non_tail_reverse_eq_tail_reverse, reverse_preserves_length]

theorem tail_reverse_twice_is_id {α : Type} (xs : List α) :
  tail_reverse (tail_reverse xs) = xs := by rw [<- non_tail_reverse_eq_tail_reverse, reverse_twice_is_id]

They both follow the same structure, so let us only look at the first one. The rw tactic is doing several steps at once. For clarity, we will separate it into two steps:

theorem tail_reverse_preserves_length {α : Type} (l : List α)  
  : length (tail_reverse l) = length l := by 
  rw [<- non_tail_reverse_eq_tail_reverse]
  rw [reverse_preserves_length]

Looking at the first rw:

rw [<- non_tail_reverse_eq_tail_reverse]

This says we will rewrite the goal using the non_tail_reverse_eq_tail_reverse theorem. The <- arrow says we will use it backwards: the theorem states @reverse = @tail_reverse, but we want to replace tail_reverse with reverse, so we use the equation from right to left. This turns our proof goal into:

⊢ (reverse l).length = l.length

But this is what the reverse_preserves_length theorem from the previous exercise said, so we can use it to finish the proof by doing:

rw [reverse_preserves_length]

Both steps can be collapsed into a single rw by passing both theorems, as above.

Exercises

  1. Tail-recursive sum_of_first_n
    1. Write a tail-recursive version of sum_of_first_n, with signature def tail_sum_of_first_n (n : Nat) : Nat
    2. Show that the tail-recursive version is equivalent to the normal version
    3. Use this to show that the tail-recursive version also matches the closed formula
  2. Do the same for count_from:
    1. Write a tail-recursive version of count_from
    2. Show that it is equivalent to the original version
    3. Use this to show that the tail-recursive version also produces a list of length $n$

Further Reading

All of these resources are freely available online.

[1] Functional Programming in Lean.

  • This is a great introduction to Lean as a programming language. If you’re interested in learning how to develop software in Lean, I highly recommend going through it. Chapters 1-3, both interludes, and chapter 8 were the main inspiration for this text and are a great resource for going deeper into the topic.

[2] The Hitchhiker’s Guide to Logical Verification

  • This book goes into much more detail about how Lean actually works and how it can be used for formal verification. It can be a harder text to read, but it is definitely a good next step for more technical readers.

[3] Mathematics in Lean

  • This book is meant to teach theorem proving for mathematics in Lean. It is not aimed at software verification, but it is a great resource for people interested in the mathematical side of Lean.

[4] Software Foundations

  • This series of books is a well-known resource for applying dependent types to software. While it is centered on Rocq (and not Lean), the fundamental ideas are similar and worth a read.