# Arithmetic Expressions

$\newcommand\mdoubleplus{\;{\mathbin{+\mkern-10mu+}}\;}$

Let's formally specify a compiler for arithmetic expressions. First, we need some arithmetic expressions, which we can build with a Backus-Naur-Form: $$\begin{array}{lclclcl} \mathtt{expr}\ni e_1,e_2 & = & n & | & e_1 + e_2 & | & e_1 - e_2 \end{array}$$

, where $n\in\mathbb{N}$.

In order to argue about correctness in a compiler, we need some way to evaluate these expressions on a high-level. This is the purpose of the following function:

$\begin{array}{ccccc} \mathcal{E}&:&\mathtt{expr}&\to&\mathbb{N} \\[0.3cm] \mathcal{E}&n&=&n \\ \mathcal{E}&e_1+e_2&=&(\mathcal{E}\;\;\;e_1) + (\mathcal{E}\;\;\;e_2)\\ \mathcal{E}&e_1-e_2&=&(\mathcal{E}\;\;\;e_1) - (\mathcal{E}\;\;\;e_2) \end{array}$

Now that we have our high-level language, we also want to have a low-level machine language that we can compile to. In our model, this will be a very simple stackmachine with commands:

$\begin{array}{ccccccc} \mathtt{cmd} &=& n & | & add & | & sub \end{array}$

Likewise, we define an evaluation function that simulates the stackmachine.

$\begin{array}{ccccccc} \mathcal{R} &:& \mathtt{com}\;\;list & \to & \mathbb{N}\;\;list & \to & \mathbb{N}\;\;list \\[0.3cm] \mathcal{R} &n::A& B & = & &[n]& \\ \mathcal{R} &add::A& n_1::n_2::B & = & \mathcal{R}&A&(n_1+n_2)::B \\ \mathcal{R} &sub::A& n_1::n_2::B & = & \mathcal{R}&A&(n_1-n_2)::B \\ \mathcal{R} & \_ & \_ & = & & [\;] & \end{array}$

Now, we can define a compiler from $\mathtt{expr}\to\mathtt{com}\;list$ as follows:

$\begin{array}[ccccc] & \gamma &:& \mathtt{expr} & \to & \mathtt{cmd}\;\;list \\[0.3cm] & \gamma & n &=& [n] \\ & \gamma & (e_1+e_2) &=& (\gamma\;\;e_2) \mdoubleplus (\gamma\;\;e_1) \mdoubleplus [add] \\ & \gamma & (e_1-e_2) &=& (\gamma\;\;e_2) \mdoubleplus (\gamma\;\;e_1) \mdoubleplus [sub] \\ \end{array}$

Proving compiler correctness now means to prove the following lemma:

$\mathcal{R}\;\;(\gamma\;e)\;\;[\;] = [\mathcal{E}\;\;e]$

This can easily happen by induction, however, we need to strengthen the induction hypothesis. So, what we actually show is this:

$\mathcal{R}\;\;((\gamma\;e)\mdoubleplus A)\;\;B = [\mathcal{E}\;\;e] \mdoubleplus B$

Here is the Coq-formalization:

Previous