Introduction to Formal Compiler Security

11th Sep 2023

Formal Compiler Security

This blog-post tries to give a formal introduction to compiler security. If you are already familiar with the term robust preservation, this blog-post is likely not for you. The post is accompanied by a Coq mechanization, i.e., all proofs presented in this blog-post are machine-checked. Find the link for it at the bottom of the post.

Preliminaries

Why formal methods?

Starting out in computer science, I've been very much into practical stuff. Writing "cool" algorithms, optimizing the hell out of the classic raytracer that everyone has to write at some point, and, well, designing a compiler. My first ever compiler was for a subset of the C programming language and it was inherently incorrect: It failed to faithfully translate the recursive version of Fibonacci: The compiler just wrote to some static memory location and forgot about the contents of the variables by overwriting the old location. In other words, my compiler did not make use of the stack at all, neither did it simulate it somehow, ending up with compiled programs that don't behave the way I expected them to behave.

This expectation for program behavior should be written down somewhere, e.g., the C programming language dictates in its standard ( Citation: WG14/N1256 2007 WG14/N1256. 2007. Programming languages — C. ) the way a C program should run. In there, one can see a bunch of Backus-Naur-Forms defining the concrete syntax. Moreover, prose describes the semantics (what should happen under certain consequences, e.g., calling a function pointer is the same as dereferencing and then calling it). These rules give a more or less precise understanding how the programming language works and what programmers expect. Even though there have been studies that suggest differently ( Citation: Xu et.al. 2023 Xu, Jianhao and Lu, Kangjie and Du, Zhengjie and Ding, Zhu and Li, Linke and Wu, Qiushi and Payer, Mathias and Mao, Bing. 2023. Silent Bugs Matter: A Study of Compiler-Introduced Security Bugs. ) , the point I'm trying to make here is that every C programmer has some kind of mental model of how their program behaves. It turns out that specifying the semantics of programming languages is itself a tricky problem. As an example, for C, in certain contexts, the evaluation order is left implementation-defined, so there can be differences in the result of the expression f() + g() for the following program when compiled with different C compilers:

int x = 0;
int f() { x = x - 1; return x; }
int g() { x = x + 1; return x; }

If f() is evaluated first, the value of f() + g() is -1. If g() is evaluated first, the value of f() + g() is 1.

A formal, mathematical, specification gives a precise understanding on how something works. While it do can be ambiguous by design (by introducing non-determinism as a "feature" to the language), it cannot be ambiguous "by language". It is also the most convincing argument one can do: Starting with stuff that is irrefutably obvious, inducing other facts from these by means of rules given by the logic we do reasoning in, there is no chance to say "your argument doesn't make sense" without questioning the logic to begin with. It is the very core reason why we do math/logic.

Utilizing formal methods to prove that programs adhere to a specification is also much more powerful than unit- or property-based testing. Consider the following program:

void add(int x, int y) {
  if(x == 17290001 && y == 29171373) {
    return x - y;
  }
  return x + y;
}

For almost all inputs x and y, it is correct. A 💩ton of tests are still very unlikely to find the bug where the function does not fulfill the intended specification "add number x and y".

What are properties?

So, we want to verify programs. One way of looking at programs is to do so abstractly: Programs may perform some events that may have an effect to the "outside world", like reading from or writing to memory. These events are also known as atomic propositions, commonly referred to as $\operatorname{AP}$. This post will only use the terminology "event". The formal description of an event can be done with a Backus-Naur-Form without any metavariable/non-terminal symbol: $$ \begin{aligned} \mi{(\text{Pre-Event})}\ a_p &:= \text{Start} \mid \text{End}\ v \\ \mi{(\text{Event})}\ a &:= \varepsilon \mid a_p \end{aligned} $$ This is an example of a formal definition for concrete events: the empty event ($\varepsilon$), the event signaling that a program starts ($\text{Start}$), and the event signaling program termination ($\text{End}\ v$). The empty event is usually included to refer to unimportant/unobservable/internal actions (you may see this being referred to as $\lambda$ instead of $\varepsilon$ in the literature, some even use $\tau$). Note that $\text{End}$ carries a parameter $v$, which I have not defined precisely, but which is supposed to be "just some value", like true, false, or 1729.

A program may emit multiple events (written $\overline{a}$) and, depending on what properties interest us, the concrete events wanted from program execution may need to change. In the verification domain, programs yield (potentially) infinitely sized lists of events, which are known as traces.

Even in this very simple setting where the only "interesting" events are $\text{Start}$ and $\text{End}\ v$, it is possible to define interesting properties for programs, which will be done soon. First up, it is necessary to make sense of the term property from the formal perspective: A trace property ($\pi$) is simply a set of traces (e.g., Trace -> Prop in Coq). For example, the property "termination" may be formulated as "any trace $\overline{a}$ which at some point contains an $\text{End}\ v$ event". More formally: $$ \pi_{\text{term}} = \{ \overline{a} \mid \exists v, \overline{a} \vDash \diamond \text{End}\ v \} $$ Hereby, the $\vDash$ means "satisfies" and the $\diamond$ is a unary operator reading "eventually". $\pi_{\text{term}}$ is an example for a liveness property, which intuitively concerns that "something good happens". Even if the good thing did not happen yet, it could still happen at some point!

Having seen a liveness property, let us see an example for another class of properties, the safety properties. For this, let's try to formulate "non-termination": "any trace $\overline{a}$ which never contains an $\text{End}\ v$ event". Formally: $$ \pi_{\text{no-term}} = \{ \overline{a} \mid \forall v, \overline{a} \vDash \square \neg\text{End}\ v \} $$ The $\square$ reads "always" and the $\neg$ is simply "not".

Safety and liveness are the two big properties, but there also exist properties which are neither safety nor liveness (because they are both!). But anyhow, the crucial bit to keep in mind for the rest of the blog post is that properties are sets of traces.

(Robust) Satisfaction

Suppose $\gamma$ is a compiler that translates from language $\src{S}$ to language $\trg{T}$.

Def. Property Satisfaction
A program $w$ satisfies a property $\pi$ ($w\vdash\pi$) if its execution yields a trace $\overline{a}$ such that $\overline{a}\in\pi$.

It is easy for a compiler to make sure target programs never go wrong: Just always emit a program that corresponds to the constant $\trg{42}$. But, this is very much not satisfactory. The intuition of programmers is that compiled $\trg{T}$-level programs behave similar to the original $\src{S}$ program! This "behaves similar" is precisely the intuition for preservation, given some compiler $\gamma$:

Def. Preservation of Property Satisfaction
For any property $\pi$ and $\src{S}$-level program $\src{w}$, if $\src{w}\vdash\pi$, then $\gamma(\src{w})\vdash\pi$.

Notice how $\src{w}\vdash\pi$ has a hidden assumption: program execution of $\src{w}$ can actually yield a trace. The issue is that this is not how a programmer writes a program. To illustrate why, consider this C function:

void foo() {
  printf("hello");
}

A very basic C function that makes crucial use of printf, a function that is not defined in the above excerpt! In fact, one would have to compile this snippet and then link the result with an implementation of the C standard library, which provides a definition for printf. While this is only one of many possible scenarios, in reality, people use libraries daily, no questions asked. To accurately model preservation of properties with this in mind, the blog-post refers to programs written by the programmer as components and to any library providing definitions for, e.g., undefined functions, as contexts. Property satisfaction is only defined on whole programs, so it does not work for components, because they are partial programs where not every symbol has an accompanying definition. A natural extension to make is to generalize this to arbitrary contexts. To this end, revisit property satisfaction and take arbitrary contexts into account:

Def. Robust Property Satisfaction
A component $p$ robustly satisfies a property $\pi$ ($p\vdash_R\pi$) if there is a whole program $w$ that is the result of linking an arbitrary context $C$ with the component $p$ such that the execution of $w$ yields a trace $\overline{a}$ that fulfills $\overline{a}\in\pi$.

Consider this C function and the compiled ARM64 output:

extern int arr[5];

int foo(int idx) {
  return arr[idx];
}

foo(int):
  adrp x1, arr
  add x1, x1, :lo12:arr
  ldr w0, [x1, w0, sxtw 2]
  ret

Looking at the code, the compiler translates this as expected: perform an array access to some pre-defined constant arr. However, it is easy to see that there is a potential memory safety issue: A context that somewhere invokes foo(1729) will lead to an out-of-bounds memory access, which is a spatial memory safety violation. This means $\texttt{foo}\vdash_R\text{smssafe}$ does not hold, since $\texttt{foo}$ does not satisfy the property $\text{smssafe}$ (no out-of-bounds accesses) for all possible contexts.

A potential fix is to make it robust by inserting a bounds-check:

extern int arr[5];

int foo(int idx) {
  if(0 <= idx && idx < 5) {
    return arr[idx];
  }
  abort();
}

Now, $\texttt{foo}\vdash_R\text{smssafe}$ holds.

Secure Compilation

The lifting of preservation of a source behavior in the target simply uses robust satisfaction :

Def. Robust Trace Property Preservation
For any property $\pi$ and $\src{S}$-level program $\src{p}$, if $\src{p}\vdash_R\pi$, then $\gamma(\src{p})\vdash_R\pi$.

Notably, in the assumption $\src{p}\vdash_R\pi$ , the context is at $\src{S}$-level, while for the goal $\gamma(\src{p})\vdash_R\pi$ the context is at $\trg{T}$-level. Now, if you think of $\trg{T}$ as something like ARM64, program contexts easily appear much more powerful than for $\src{S}$, which could be something like Rust. In Rust, one is not able to just guess some memory address (without unsafe), while this works without issue in ARM64, assuming one guesses correctly.

Lets see how one could try to prove robust preservation . On the left there are the assumptions that we are allowed to work with, on the right there is the goal we need to prove. The diagram is supposed to be read from top to bottom.

Have Comment Want $\pi$ : $\text{Properties}$ $\src{p}$ : $\src{S}\text{-level component}$ $H_1$ : $\src{p}\vdash_R\pi$ $\comp{\src{S}}{\trg{T}}{\src{p}}$ $H_1'$ : $\begin{array}{l}\forall\trace\in\text{Trace},\\\forall\src{C}\in\src{S}\text{-level program},\\\text{if }\src{link\ C\ p}\to\trace,\\\text{then }\trace\in\pi\end{array}$ (unfold $\vdash_R$) $\begin{array}{l}\forall\trace\in\text{Trace},\\\forall\trg{C}\in\trg{T}\text{-level program},\\\text{if }\trg{link\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\to\trace,\\\text{then }\trace\in\pi\end{array}$ $\trace$ : Trace (assume from goal) $\trg{C}$ : $\trg{T}\text{-level program}$ $H_{\trg{T}\text{-exec}}$ : $\trg{link\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\to\trace$ $\trace\in\pi$

Looking at the proof context, i.e., the assumptions that we have, we have to show that this trace $\overline{a}$ is part of whatever property $\pi$, where the only thing we know about $\overline{a}$ is that it is the result of the execution $\trg{link\ C\ }\llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}\to\overline{a}$. Since $\pi$ is arbitrary, there is little hope to show that $\overline{a}$ is an element of it by decomposing this $\trg{T}$-level execution. But, note that there is also assumption $H_1'$ . We certainly can specialize it with $\overline{a}$, but then we need to provide an $\src{S}$-level context as well as an execution $\src{link\ C\ p}\to\overline{a}$, both of which we do not have. This is the crucial part in these kinds of robust preservation proofs: somehow coming up with an $\src{S}$-level context $\src{C}$ whose runtime behavior when linked with $\src{p}$ is similar to that of $\trg{link\ C\ }\llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}$.

In the rest of this post, we'll see how to do this on the basis of backtranslating the trace $\overline{a}$.

$\src{S}$ Definitions

$\src{S}$ Syntax

$$ \begin{aligned} \mi{(\text{Value})}\ \src{v} &:= \src{true} \mid \src{false} \mid \src{n} \ \ \ \ \ \ \ \ \ \ \text{, where } \src{n} \text{ is a number}\\ \mi{(\text{Expression})}\ \src{e} &:= \src{val\ v} \mid \src{var\ x} \mid \src{e_1 \oplus e_2} \mid \src{if\ e_1\ then\ e_2\ then\ e_3} \mid \src{abort} \\ \mi{(\text{Type})}\ \src{\tau} &:= \src{\mathbb{N}} \mid \src{\mathbb{B}} \mid \src{\tau\to\tau} \end{aligned} $$

(** Possible binary operations. *)
Variant binopsymb : Type :=
| Badd : binopsymb
| Bsub : binopsymb
| Bmul : binopsymb
| Bdiv : binopsymb
| Bless : binopsymb
.
(** Value *)
Inductive value : Type :=
| Vnat : nat -> value
| Vbool : bool -> value
.
(** Variables are just strings. *)
Definition vart := string.

(** Expression *)
Inductive expr : Type :=
| Xval (v : value) : expr
| Xvar (x : vart) : expr
| Xbinop (symb : binopsymb) (lhs rhs : expr) : expr
| Xif (c e0 e1 : expr) : expr
| Xabort : expr
.
(** Type *)
Inductive ty : Type :=
| Tnat : ty
| Tbool : ty
| Tfun : ty -> ty -> ty
.

Hereby, $\src{x}$ is some kind of string and $\src{\oplus}$ is one of the following binary operations: $\src{<}$, $\src{+}$, $\src{\times}$, $\src{/}$, and $\src{-}$. We do not encode precedence or anything into this formal description, because it is more meant to be a generator for abstract syntax trees instead of a context free grammar. That being said, the precedences of the binary operators are "as usual".

Another rather syntactic comment I have to make is the fact that some metavariables appear "free" in the above definition, e.g., $\src{e_1}$ in the binary operation rule. I sketch the "basename" of the metavariable $\src{e}$ and add subscripts as a means to signal that both branches in an expression like $\src{e_1\oplus e_2}$ can be different. If I'd have written it as $\src{e\oplus e}$, there could be confusion that you have to insert the same syntactic block into each branch, i.e., you can only have expressions like $\src{val\ 1+val\ 1}$, but not $\src{val\ 1+val\ 2}$. Admittedly, the rule for expressions should quantify over all used metavariables, i.e., the full, formal definition of the Backus-Naur-Form rule would look like $\src{Exprs}\ni\src{e_1,e_2,e_3}::=\dots$, where "$\dots$" as above. $\src{Exprs}$ hereby describes a set that contains all words adhering to the syntax on the righthandside of the ::= operator. Anyhow, I think this notation is still somewhat understandable while lessening the burden of parsing what I've written.

While we have expressions now, we still lack contexts and whole programs. The following definitions make this precise:

$$ \begin{aligned} \mi{(\text{Components})}\ \src{p} &:= (\src{x};\src{\tau_1};\src{\tau_2};\src{e})\ \ \ \ \ \ \ \ \ \mi{(\text{Contexts})}\ \src{C} := (\src{y};\src{e_{pre}};\src{e_{post}})\\ &\mi{(\text{Whole Programs})}\ \src{W} := \src{prog\ C\ p} \end{aligned} $$

(** Symbols look like `fn foo x : τ := e` *)
Definition symbol : Type := vart * ty * ty * expr.

(** Viable contexts that can be linked against a symbol. *)
Inductive LinkageContext : Type :=
  (* basically models `let y = call foo e__pre in e__post` *)
| LContext (y : vart) (e__pre e__post : expr) : LinkageContext
.

(** Top-level programs *)
Inductive prog : Type := Cprog (C : LinkageContext) (foo : symbol) : prog.

Again, this is meant to be abstract syntax. The intuitive understanding for compontents is simply a function, in imaginative syntax, $\src{fn(x : \tau) : \tau := e}$. Similarily, contexts are simply a let-expression that "calls" into this function, again in imaginative syntax, $\src{let\ y\ =\ call\ p\ e_{pre}\ in\ e_{post}}$. We'll see the dynamic semantics of this soon, but first, let's take a look at the static semantics, i.e., what it means for a program to be considered valid/safe.

$\src{S}$ Static Semantics

$\vdash_v\src{true}:\src{\mathbb{B}}$ $\text{\footnotesize($\src{S}$-t-true)}$ $\vdash_v\src{false}:\src{\mathbb{B}}$ $\text{\footnotesize($\src{S}$-t-false)}$ $\src{n}\text{ is a number}$ $\vdash_v\src{n}:\src{\mathbb{N}}$ $\text{\footnotesize($\src{S}$-t-num)}$ $\vdash_v\src{v}:\src{\tau}$ $\src{\Gamma}\vdash_e\src{val\ v}:\src{\tau}$ $\text{\footnotesize($\src{S}$-t-val)}$ $(\src{x},\src{\tau})\in\src{\Gamma}$ $\src{\Gamma}\vdash_e\src{var\ x}:\src{\tau}$ $\text{\footnotesize($\src{S}$-t-var)}$ $\src{\Gamma}\vdash_e\src{abort}:\src{\tau}$ $\text{\footnotesize($\src{S}$-t-abort)}$ $\begin{array}{ rcl } \src{\Gamma}\vdash_e\src{e_1}:\src{\mathbb{N}} && \src{\Gamma}\vdash_e\src{e_2}:\src{\mathbb{N}} \\ \src{\oplus}&\text{ is not }&\src{<}\end{array}$ $\src{\Gamma}\vdash_e\src{e_1\oplus e_2}:\src{\mathbb{N}}$ $\text{\footnotesize($\src{S}$-t-binop)}$ $\begin{array}{ lr } \src{\Gamma}\vdash_e\src{e_1}:\src{\mathbb{N}} & \src{\Gamma}\vdash_e\src{e_2}:\src{\mathbb{N}}\end{array}$ $\src{\Gamma}\vdash_e\src{e_1 < e_2}:\src{\mathbb{B}}$ $\text{\footnotesize($\src{S}$-t-less)}$ $\begin{array}{ ccc } \src{\Gamma}\vdash_e \src{e_1}:\src{\mathbb{B}} & \src{\Gamma}\vdash_e \src{e_2}:\src{\tau} & \src{\Gamma}\vdash_e \src{e_3}:\src{\tau}\end{array}$ $\src{\Gamma}\vdash_e\src{if\ e_1\ then\ e_2\ else\ e_3}:\src{\tau}$ $\text{\footnotesize($\src{S}$-t-if)}$

(** Type system spec *)
Reserved Notation "'[' Γ '|-' e ':' τ  ']'" (at level 81, Γ at next level, e at next level, τ at next level).
Inductive check : Gamma -> expr -> ty -> Prop :=
| T_var : forall (Γ : Gamma) (x : vart) (τ : ty),
    nodupinv Γ ->
    Min Γ x τ ->
    [Γ |- Xvar x : τ]
| T_abort : forall (Γ : Gamma) (τ : ty),
    [Γ |- Xabort : τ]
| T_vnat : forall (Γ : Gamma) (n : nat),
    [Γ |- Xval(Vnat n) : Tℕ]
| T_vbool : forall (Γ : Gamma) (b : bool),
    [Γ |- Xval(Vbool b) : Tbool]
| T_binop : forall (Γ : Gamma) (b : binopsymb) (e1 e2 : expr),
    b <> Bless ->
    [Γ |- e1 : Tℕ] ->
    [Γ |- e2 : Tℕ] ->
    [Γ |- Xbinop b e1 e2 : Tℕ]
| T_binop_less : forall (Γ : Gamma) (e1 e2 : expr),
    [Γ |- e1 : Tℕ] ->
    [Γ |- e2 : Tℕ] ->
    [Γ |- Xbinop Bless e1 e2 : Tbool]
| T_if : forall (Γ : Gamma) (e0 e1 e2 : expr) (τ : ty),
    [Γ |- e0 : Tbool] ->
    [Γ |- e1 : τ] ->
    [Γ |- e2 : τ] ->
    [Γ |- Xif e0 e1 e2 : τ]
where "'[' Γ '|-' e ':' τ ']'" := (check Γ e τ)
.

The typing rules for expressions are rather straightforward. Typechecking binary operations is split into $\src{S}\text{-t-binop}$ and $\src{S}\text{-t-less}$ , since comparisons should return a boolean value and anything else a number.

Typechecking contexts, components, and whole programs closely follows the standard "let"-rule when typechecking more complicated programming languages.

$(\src{x};\src{\tau_1}),\src{[\cdot]}\vdash_e \src{e}:\src{\tau_2}$ $\vdash_p(\src{x};\src{\tau_1};\src{\tau_2};\src{e}):\src{\tau_1\to\tau_2}$ $\text{\footnotesize($\src{S}$-t-component)}$ $\begin{array}{ cc } \src{[\cdot]}\vdash_e\src{e_{pre}}:\src{\mathbb{N}} & (\src{y};\src{\mathbb{N}}),\src{[\cdot]}\vdash_e \src{e_{post}}:\src{\tau}\end{array}$ $\vdash_c(\src{y};\src{e_{pre}};\src{e_{post}}):\src{\tau}$ $\text{\footnotesize($\src{S}$-t-context)}$ $\begin{array}{ cc } \vdash_c\src{C}:\src{\tau} & \vdash_p\src{p}:\src{\mathbb{N}\to\mathbb{N}}\end{array}$ $\vdash\src{prog\ C\ p}:\src{\tau}$ $\text{\footnotesize($\src{S}$-t-whole)}$

(** Checking of symbols. A symbol may use the identifier x, because a symbol
    in this language is basically fn(x) ↦ e. *)
Inductive check_symb : symbol -> Prop :=
| check_symb_c : forall (e : expr) (x : vart),
    check (x ↦ Tℕ ◘ [⋅]) e Tℕ ->
    check_symb (x, Tℕ, Tℕ, e)
.
(** Similarily, contexts are like `let y = call foo e__pre in e__post` *)
Inductive check_ctx : LinkageContext -> Prop :=
| check_ctx_c : forall (y : vart) (e__pre e__post : expr),
    [[⋅] |- e__pre : Tℕ] ->
    [y ↦ Tℕ ◘ [⋅] |- e__post : Tℕ] ->
    check_ctx (LContext y e__pre e__post)
.
(** Checking a top-level program just means checking both context and component *)
Definition prog_check (p : prog) : Prop :=
  let '(Cprog C foo) := p in
  check_symb foo /\ check_ctx C
.

$\src{S}\text{-t-whole}$ restricts components to be of type $\src{\mathbb{N}\to\mathbb{N}}$.

$\src{S}$ Dynamic Semantics

Since the language is so simple, the dynamic semantics are also simple, like the typing rules. The semantics is done small-step style with evaluation contexts, thereby sidestepping the need for congruence rules. Evaluation does not happen on expressions themselves, but rather on "runtime expressions". Moreover, execution may give visible events (something like syscalls), which warrants a trace-model:

$$ \begin{aligned} \mi{(\text{Runtime Expressions})}\ \src{r} &:= \src{\triangleright\ e} \mid \src{\rtexpr{}{⚡}} \\ \mi{(\text{Events})}\ \src{a} &:= \src{\varepsilon} \mid \text{⚡} \mid \src{Call\ v} \mid \src{Ret\ v} \mid \src{Start} \mid \src{End\ v} \\ \mi{(\text{Traces})}\ \src{\trace} &:= \src{[\cdot]} \mid \src{\event\cdot\trace} \end{aligned} $$

(** A runtime program is just an expression or a crashed state. *)
Inductive rtexpr : Type :=
| RTerm (e : expr)
| RCrash
.
#[global]
Notation "'▷' e" := ((RTerm e) : rtexpr) (at level 81, e at next level).
#[global]
Notation "'▷↯'" := (RCrash).

With this, primitive reductions are defined as follows:

$\src{\rtexpr{}{abort}\psteparrow{⚡}\rtexpr{}{\text{⚡}}}$ $\text{\footnotesize($\src{S}$-e-crash)}$ $\src{\rtexpr{}{if\ val\ true\ then\ e_1\ else\ e_2\psteparrow{} e_1}}$ $\text{\footnotesize($\src{S}$-e-if-true)}$ $\src{\rtexpr{}{if\ val\ false\ then\ e_1\ else\ e_2\psteparrow{} e_1}}$ $\text{\footnotesize($\src{S}$-e-if-false)}$ $\operatorname{eval\_binop}(\src{\oplus};\src{n_1};\src{n_2})=\src{v}$ $\src{\rtexpr{}{val\ n_1\oplus val\ n_2\psteparrow{} val\ v}}$ $\text{\footnotesize($\src{S}$-e-binop)}$

(** Evaluation of binary expressions. Note that 0 means `true` in S, so `5 < 42` evals to `0`. *)
Definition eval_binop (b : binopsymb) (n0 n1 : nat) : option value :=
  Some((match b with
       | Bless => (if Nat.ltb n0 n1 then Vbool true else Vbool false)
       | Badd => Vnat(n0 + n1)
       | Bdiv => Vnat(Nat.div n0 n1)
       | Bsub => Vnat(n0 - n1)
       | Bmul => Vnat(n0 * n1)
       end))
.
(** Primitive evaluation steps. *)
Inductive pstep : (PrimStep rtexpr) :=
| e_binop : forall (n1 n2 : nat) v (b : binopsymb),
    Some(v) = eval_binop b n1 n2 ->
    ▷ Xbinop b n1 n2 --[]--> ▷ v
| e_if_true : forall (e1 e2 : expr),
    ▷ <( if true then (e1) else (e2) )> --[]--> ▷ e1
| e_if_false : forall (e1 e2 : expr),
    ▷ <( if false then (e1) else (e2) )> --[]--> ▷ e2
| e_abort :
    ▷ <( abort )> --[ Scrash ]--> ▷↯
.

All steps without an event above the arrow implicitly throw the empty event ($\src{\varepsilon}$). While these definitions allow us to evaluate a runtime expression like $\src{\rtexpr{}{val\ 1+val\ 2}}$, it fails to apply for stuff like $\src{\rtexpr{}{(val\ 1+val\ 2)+val\ 3}}$. In order to be able to reduce these more complicated expressions, we decompose them into evaluation contexts and primitive expressions. The idea of evaluation contexts is that of a continuation, i.e., "the remaining continuation". When evaluating something like $\src{\rtexpr{}{(val\ 1+val\ 2)+val\ 3}}$, there is an outermost part that stays the same when evaluating, in this case $\src{\square+val\ 3}$. So, let's define evaluation contexts:

$$ \begin{aligned} \mi{(\text{Evaluation Contexts})}\ \src{E} &:= \src{\square} \mid \src{E \oplus e} \mid \src{n \oplus E} \mid \src{if\ E\ then\ e_1\ else\ e_2} \\ \end{aligned} $$

(** We proceed to define the dynamic semantics via evaluation contexts/environments. *)
Inductive evalctx : Type :=
| Khole : evalctx
| KbinopL (b : binopsymb) (K : evalctx) (e : expr) : evalctx
| KbinopR (b : binopsymb) (n : nat) (K : evalctx) : evalctx
| Kif (K : evalctx) (e0 e1 : expr) : evalctx
.

Filling the hole ($\src{\square}$) of an evaluation context with some expression $\src{e}$ is written $\src{E[e]}$. With that, define context-based reduction as follows:

$\src{\rtexpr{}{e}\psteparrow{\event}\rtexpr{}{e'}}$ $\src{\rtexpr{}{E[e]}\ectxsteparrow{\event}\rtexpr{}{E[e']}}$ $\text{\footnotesize($\src{S}$-e-ectx)}$ $\src{\rtexpr{}{e}\psteparrow{\text{⚡}}\rtexpr{}{\text{⚡}}}$ $\src{\rtexpr{}{E[e]}\ectxsteparrow{\text{⚡}}\rtexpr{}{\text{⚡}}}$ $\text{\footnotesize($\src{S}$-e-ectx-crash)}$

(* Contextual steps may either have a subexpression that can do a primstep or
   have a subexpression that crashes. *)
Inductive estep : EctxStep rtexpr :=
| E_ectx : forall (e e' e0 e0' : expr) (o : option event) (K : evalctx),
    e0 = insert K e ->
    e0' = insert K e' ->
    ▷ e --[, o ]--> ▷ e' ->
    ▷ e0 ==[, o ]==> ▷ e0'
| E_stuck : forall (e e0 : expr) (K : evalctx),
    e0 = insert K e ->
    ▷ e --[ Scrash ]--> ▷↯ ->
    ▷ e0 ==[ Scrash ]==> ▷↯
.

To evaluate multiple steps, simply take the reflexive transitive closure $\src{\stepsarrow{\trace}}$. To evaluate multiple steps in $n$, simply count down $n$ for each transitive step and only if it is $0$ allow reflexive steps. When taking multiple steps, the trace $\src{\trace}$ only collects events that are non-empty - and, the way we have defined the relations, it's either a singleton list containing the crash event (⚡), or it is the empty list. The other events will become relevant for the top-level execution, which we define now.

$\begin{array}{ c } \vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} } \\ \src{ \triangleright e_{foo}}[\text{subst }\src{x}\text{ for }\src{val\ v_{pre}}]\src{ \nstepsarrow{}{n'} \rtexpr{}{val\ v_{foo}} } \\ \src{ \triangleright e_{post} }[\text{subst }\src{y}\text{ for }\src{val\ v_{foo}}]\src{\nstepsarrow{}{n''}\rtexpr{}{val\ v_{post}}}\end{array}$ $\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot End\ v_{post}}\rtexpr{}{val\ v_{post}} }$ $\text{\footnotesize($\src{S}$-eprog-success)}$ $\begin{array}{ c } \src{\rtexpr{}{e_{pre}} \nstepsarrow{⚡}{n}\rtexpr{}{⚡} }\end{array}$ $\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot ⚡}\rtexpr{}{⚡} }$ $\text{\footnotesize($\src{S}$-eprog-fail0)}$ $\begin{array}{ c } \vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} } \\ \src{ \triangleright e_{foo}}[\text{subst }\src{x}\text{ for }\src{val\ v_{pre}}]\src{ \nstepsarrow{⚡}{n'} \rtexpr{}{⚡} }\end{array}$ $\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot ⚡}\rtexpr{}{⚡} }$ $\text{\footnotesize($\src{S}$-eprog-fail1)}$ $\begin{array}{ c } \vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} } & \src{ \triangleright e_{foo}}[\text{subst }\src{x}\text{ for }\src{val\ v_{pre}}]\src{ \nstepsarrow{}{n'} \rtexpr{}{val\ v_{foo}} } \\ \src{ \triangleright e_{post} }[\text{subst }\src{y}\text{ for }\src{val\ v_{foo}}]\src{\nstepsarrow{⚡}{n''}\rtexpr{}{⚡}}\end{array}$ $\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot ⚡}\rtexpr{}{⚡} }$ $\text{\footnotesize($\src{S}$-eprog-fail2)}$ $\begin{array}{ c } \neg\left( \vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \right) \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} }\end{array}$ $\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot⚡}\rtexpr{}{⚡} }$ $\text{\footnotesize($\src{S}$-eprog-fail3)}$

Inductive progstep : prog -> tracepref -> rtexpr -> Prop :=
| prog_step_success : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v v__pre v__foo : value)
                        (n__pre n__foo n__post : nat),
    check_symb foo ->
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    x__foo = vart_of_symbol foo ->
    e__foo = expr_of_symbol foo ->
    (* setup the argument for the function call *)
    ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
    (* do the function call *)
    ▷ (subst x__foo e__foo (Xval v__pre)) =( n__foo )=[ nil ]==> ▷ Xval v__foo ->
    (* something may happen after the function call *)
    ▷ subst y e__post (Xval v__foo) =( n__post )=[ nil ]==> ▷ Xval v ->
    progstep (Cprog (LContext y e__pre e__post) foo)
             (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Send (sv v) :: nil)
             (RTerm (Xval v))
| prog_step_failure0 : forall (y : vart) (e__pre e__post : expr) (foo : symbol) (n__pre : nat),
    (* wether component is well-typed or not does not matter if the context yields a crash *)
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    (* setup the argument for the function call ; fails *)
    ▷ e__pre =( n__pre )=[ Scrash :: nil ]==> (▷↯) ->
    progstep (Cprog (LContext y e__pre e__post) foo)
             (Sstart :: Scrash :: nil)
             (▷↯)
| prog_step_failure1 : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v__pre : value)
                         (n__pre n__foo : nat),
    check_symb foo ->
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    x__foo = vart_of_symbol foo ->
    e__foo = expr_of_symbol foo ->
    (* setup the argument for the function call *)
    ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
    (* do the function call ; fails *)
    ▷ (subst x__foo e__foo (Xval v__pre)) =( n__foo )=[ Scrash :: nil ]==> (▷↯) ->
    progstep (Cprog (LContext y e__pre e__post) foo)
             (Sstart :: Scall Qctxtocomp (sv v__pre) :: Scrash :: nil)
             (▷↯)
| prog_step_failure2 : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v__pre v__foo : value)
                         (n__pre n__foo n__post : nat),
    check_symb foo ->
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    x__foo = vart_of_symbol foo ->
    e__foo = expr_of_symbol foo ->
    (* setup the argument for the function call *)
    ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
    (* do the function call *)
    ▷ (subst x__foo e__foo (Xval v__pre)) =( n__foo )=[ nil ]==> ▷ Xval v__foo ->
    (* something may happen after the function call; fails *)
    ▷ subst y e__post (Xval v__foo) =( n__post )=[ Scrash :: nil ]==> (▷↯) ->
    progstep (Cprog (LContext y e__pre e__post) foo)
             (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Scrash :: nil)
             (▷↯)
| prog_step_failure3 : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v__pre : value)
                         (n__pre : nat),
    (* this is a dirty hack to sidestep more formalization work. we want ill-typed components to "crash", but it's a hassle
       to extend small step and bigstep accordingly. So, instead, we do it at the top level execution relation of programs *)
    ~(check_symb foo) ->
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    x__foo = vart_of_symbol foo ->
    e__foo = expr_of_symbol foo ->
    (* setup the argument for the function call *)
    ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
    (* since the component is ill-typed, just crash immediately *)
    progstep (Cprog (LContext y e__pre e__post) foo)
             (Sstart :: Scall Qctxtocomp (sv v__pre) :: Scrash :: nil)
             (▷↯)
.

Since the language only has one call, it can be hardcoded in the semantics by simulating the reduction of a nested let-expression, as done in $\src{S}\text{-eprog-success}$ .

The rules $\src{S}\text{-eprog-fail0}$ , $\src{S}\text{-eprog-fail1}$ , and $\src{S}\text{-eprog-fail2}$ encode all possibilities where either the context or the component crashes: the component could fail or the context can fail prior to calling in $\src{e_{pre}}$ or it can fail in $\src{e_{post}}$ after the component returned. Lastly, $\src{S}\text{-eprog-fail3}$ is a rather nasty/lazy rule that exists for technical reasons. The rule itself is "wrong" in the sense that not all ill-typed programs are able to reduce the first part of the context $\src{e_{pre}}$. So, pedantically, there should be yet another rule for ill-typed programs where $\src{e_{pre}}$ may fail. Anyhow, this situation could/should also be a lemma that is proven somewhere, i.e. the component should fail if the whole program is ill-typed, and this is something that is preferrable, but not done here, out of lazyness. The main secure compilation proof is not really affected by this. 😅

Robust satisfaction for $\src{S}$-level programs is now defined as:

Def. $\src{S}$-level Robust Property Satisfaction
$$ \src{p}\text{ robustly satisfies }\pi \text{ iff } \forall \src{C},\text{if }\src{prog\ C\ p \progarrow{\trace} r}\text{, then }\trace\in\pi $$

$\trg{T}$ Definitions

$\trg{T}$ Syntax

To make compilation at least somewhat interesting, let's make $\trg{T}$ a stack-machine. It is worth noting that, while we do have control here over the (formal) definition of the target language of the soon-to-be-defined compiler, in practice, it's only $\src{S}$ that you have control over, if at all. More often than not for practically relevant languages, even $\src{S}$ is specified by, e.g., some ISO standard ( Citation: WG14/N1256 2007 WG14/N1256. 2007. Programming languages — C. ) . Anyways, the commands of $\trg{T}$ are as follows:

$$ \begin{aligned} \mi{(\text{Values})}\ \trg{v} &:= \trg{true} \mid \trg{false} \mid \trg{n} \ \ \ \ \ \ \ \ \ \ \text{, where } \trg{n} \text{ is a number} \\ \mi{(\text{Types})}\ \trg{\tau} &:= \trg{\nat} \mid \trg{\bool} \\ \mi{(\text{Commands})}\ \trg{c} &:= \trg{push\ v} \mid \trg{bop\ \oplus} \mid \trg{abort} \mid \trg{has\ \tau} \mid \trg{quote\ [\overline{c}]} \\ &\ \ \mid \trg{unquote} \mid \trg{if} \mid \trg{get\ x} \mid \trg{negb} \\ \mi{(\text{Components})}\ \trg{p} &:= (\trg{x};\trg{\overline{c}})\ \ \ \ \ \ \mi{(\text{Contexts})}\ \trg{C} := (\trg{y};\trg{\overline{c_1}};\trg{\overline{c_2}}) \\ \mi{(\text{Whole Programs})}\ \trg{W} &:= \trg{prog\ C\ p} \end{aligned} $$

(** Values are numbers or booleans. *)
Inductive value : Type :=
| Vnat : nat -> value
| Vbool : bool -> value
.
(** _Dynamic_ Types of T *)
Inductive ty : Type :=
| Tnat : ty
| Tbool : ty
.
(** List of commands available in T. The abstract syntax is just a list *)
Inductive cmd : Type :=
| Cpush : value -> cmd
| Cbop : binopsymb -> cmd
| Cabort : cmd
| Chas : ty -> cmd
| Cquote : list cmd -> cmd
| Cunquote : cmd
| Cif : cmd
| Cget : vart -> cmd
| Cnegb : cmd
.
(** Components *)
Definition symbol : Type := vart * list cmd.

(** Viable contexts that can be linked against a symbol. *)
Inductive LinkageContext : Type :=
  (* basically models `let y = call foo e__pre in e__post` *)
| LContext (y : vart) (cmds__pre cmds__post : list cmd) : LinkageContext
.
(** Top-level programs *)
Inductive prog : Type := Cprog (C : LinkageContext) (foo : symbol) : prog.

The notation $\trg{\overline{c}}$ means "list of $\trg{c}$".

$\trg{T}$ Dynamic Semantics

As before, we first define runtime programs. Contrary to $\src{S}$, runtime programs in $\trg{T}$ depend on a stack and a heap:

$$ \begin{aligned} \mi{(\text{Stack\ Element})}\ \trg{s} &:= \trg{v} \mid \trg{\overline{c}} & \mi{(\text{Stack})}\ \trg{\Psi} &:= \trg{[\cdot]} \mid \trg{s,\Psi} \\ \mi{(\text{Heap})}\ \trg{\Delta} &:= \trg{\emptyset} \mid \trg{x\mapsto v} & \mi{(\text{Runtime Programs})}\ \trg{r} &:= \trg{\rtexpr{\Psi;\Delta}{\overline{c}}} \mid \trg{\rtexpr{}{⚡}}& \end{aligned} $$

(** Stack elements are either values or code. *)
Inductive stack_el : Type :=
| Sval : value -> stack_el
| Scode : list cmd -> stack_el
.
Definition stack : Type := list stack_el.
Definition store : Type := option (vart * value).

(** A runtime program is a stack and heap with a list of commands or a crashed state. *)
Inductive rtprog : Type :=
| RPTerm (Ψ : stack) (Δ : store) (cmds : list cmd)
| RPCrash
.

The "heap" is a singleton, it can store at most one variable. Stacks may either contain a list of commands as one element or a value.

Since the trace-models of $\src{S}$ and $\trg{T}$ are exactly the same, they are not listed in the above definition. Just color everything related to the trace model in the previous definition red.

In terms of evaluation, the semantics does not need contexts, since it's just popping off commands off of the list of commands in a given runtime program. The context is simply the remaining list of commands, but this is being taken care of by the reflexive-transitive closure as well as the step-indexed version. Most commands are straightforward:

$\trg{\rtexpr{true,\Psi;\Delta}{negb,\overline{c}}\ectxsteparrow{}\rtexpr{false,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-negb-true)}$ $\trg{\rtexpr{false,\Psi;\Delta}{negb,\overline{c}}\ectxsteparrow{}\rtexpr{true,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-negb-false)}$ $\trg{\rtexpr{\Psi;x\mapsto v}{get\ x,\overline{c}}\ectxsteparrow{}\rtexpr{v,\Psi;x\mapsto v}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-get)}$ $\trg{\rtexpr{true,\overline{c_{bdy}},\Psi;\Delta}{if,\overline{c}}\ectxsteparrow{}\rtexpr{\Psi;\Delta}{\overline{c_{bdy}},\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-if-true)}$ $\trg{\rtexpr{false,\overline{c_{bdy}},\Psi;\Delta}{if,\overline{c}}\ectxsteparrow{}\rtexpr{\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-if-false)}$ $\trg{\rtexpr{\overline{c_{1}},\Psi;\Delta}{unquote,\overline{c_2}}\ectxsteparrow{}\rtexpr{\Psi;\Delta}{\overline{c_1},\overline{c_2}}}$ $\text{\footnotesize($\trg{T}$-e-unquote)}$ $\trg{\rtexpr{\Psi;\Delta}{quote\ [\overline{c_1}],\overline{c_2}}\ectxsteparrow{}\rtexpr{\overline{c_1},\Psi;\Delta}{\overline{c_2}}}$ $\text{\footnotesize($\trg{T}$-e-quote)}$ $\trg{\rtexpr{\Psi;\Delta}{push\ v,\overline{c}}\ectxsteparrow{}\rtexpr{v,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-push)}$ $\text{eval\_binop}(\trg{\oplus};\trg{n_1};\trg{n_2})=\trg{v}$ $\trg{\rtexpr{n_1,n_2,\Psi;\Delta}{bop\ \oplus,\overline{c}}\ectxsteparrow{}\rtexpr{v,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-bop)}$ $\trg{\rtexpr{\Psi;\Delta}{abort,\overline{c}}\ectxsteparrow{}\rtexpr{}{⚡}}$ $\text{\footnotesize($\trg{T}$-e-abort)}$ $\trg{\rtexpr{n,\Psi;\Delta}{has\ \nat,\overline{c}}\ectxsteparrow{}\rtexpr{true,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-has$\nat$-true)}$ $\trg{s}\not=\trg{n}$ $\trg{\rtexpr{s,\Psi;\Delta}{has\ \nat,\overline{c}}\ectxsteparrow{}\rtexpr{false,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-has$\nat$-false)}$ $\begin{array}{ rcl } \trg{s}\not=\trg{true} & \text{ and } & \trg{s}\not=\trg{false}\end{array}$ $\trg{\rtexpr{s,\Psi;\Delta}{has\ \bool,\overline{c}}\ectxsteparrow{}\rtexpr{false,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-has$\bool$-false)}$ $\begin{array}{ rcl } \trg{s}=\trg{true} & \text{ or } & \trg{s}=\trg{false}\end{array}$ $\trg{\rtexpr{s,\Psi;\Delta}{has\ \bool,\overline{c}}\ectxsteparrow{}\rtexpr{true,\Psi;\Delta}{\overline{c}}}$ $\text{\footnotesize($\trg{T}$-e-has$\bool$-true)}$

(* Contextual steps just eat off the top of the list, i.e., the context is the rest of the list. *)
Inductive estep : EctxStep rtprog :=
| e_negb : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (b : bool),
    s(Sval b :: Ψ ; Δ ▷ Cnegb :: cmds) ==[]==> s(Sval (negb b) :: Ψ ; Δ ▷ cmds)
| e_get : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (x : vart) (v : value),
    Some (x, v) = Δ ->
    s(Ψ ; Δ ▷ Cget x :: cmds) ==[]==> s(Sval v :: Ψ ; Δ ▷ cmds)
| e_if_true : forall (Ψ : stack) (Δ : store) (cmds thens : list cmd),
    s(Sval true :: Scode thens :: Ψ ; Δ ▷ Cif :: cmds) ==[]==> s(Ψ ; Δ ▷ thens ++ cmds)
| e_if_false : forall (Ψ : stack) (Δ : store) (cmds thens : list cmd),
    s(Sval false :: Scode thens :: Ψ ; Δ ▷ Cif :: cmds) ==[]==> s(Ψ ; Δ ▷ cmds)
| e_unquote : forall (Ψ : stack) (Δ : store) (cmds what : list cmd),
    s(Scode what :: Ψ ; Δ ▷ Cunquote :: cmds) ==[]==> s(Ψ ; Δ ▷ what ++ cmds)
| e_quote : forall (Ψ : stack) (Δ : store) (cmds what : list cmd),
    s(Ψ ; Δ ▷ Cquote what :: cmds) ==[]==> s(Scode what :: Ψ ; Δ ▷ cmds)
| e_hasnat_true : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (s : stack_el) (n : nat),
    s = Sval n ->
    s(s :: Ψ ; Δ ▷ Chas Tℕ :: cmds) ==[]==> s(Sval true :: Ψ ; Δ ▷ cmds)
| e_hasnat_false : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (s : stack_el),
    (forall (n : nat), s <> Sval n) ->
    s(s :: Ψ ; Δ ▷ Chas Tℕ :: cmds) ==[]==> s(Sval false :: Ψ ; Δ ▷ cmds)
| e_hasbool_true : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (s : stack_el) (b : bool),
    s = Sval b ->
    s(s :: Ψ ; Δ ▷ Chas T𝔹 :: cmds) ==[]==> s(Sval true :: Ψ ; Δ ▷ cmds)
| e_hasbool_false : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (s : stack_el),
    (forall (b : bool), s <> Sval b) ->
    s(s :: Ψ ; Δ ▷ Chas T𝔹 :: cmds) ==[]==> s(Sval false :: Ψ ; Δ ▷ cmds)
| e_abort : forall (Ψ : stack) (Δ : store) (cmds : list cmd),
    s(Ψ ; Δ ▷ Cabort :: cmds) ==[ Scrash ]==> ▷↯
| e_bop : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (n1 n2 : nat) (b : binopsymb) (v : value),
    Some v = eval_binop b n1 n2 ->
    s(Sval n2 :: Sval n1 :: Ψ ; Δ ▷ Cbop b :: cmds) ==[]==> s(Sval v :: Ψ ; Δ ▷ cmds)
| e_push : forall (Ψ : stack) (Δ : store) (cmds : list cmd) (v : value),
    s(Ψ ; Δ ▷ Cpush v :: cmds) ==[]==> s(Sval v :: Ψ ; Δ ▷ cmds)
.

Quotation can push lists of commands onto the stack and is a necessity for non-trivial control-flow via the $\trg{if}$ command. $\trg{if}$ always expects at least two elements on the stack, where the first is a bool and the second gives the consequence that should be run next if that bool is true. Typechecking can only happen dynamically by inspecting the shape of the top-element on the stack, which ultimately gets replaced by the typecheck with a boolean that indicates wether the stack element has said type or not. Lists of commands, that are on the stack, do not have any type and yield for both $\trg{has\ \nat}$ and $\trg{has\ \bool}$ the value $\trg{false}$.

Lastly, we need to define execution of whole programs, which is entirely similar to that of $\src{S}$, albeit without typechecks. Instead of substituting the variable for the value when simulating the single function call, the heap is populated with the value.

$\begin{array}{ c } \trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{}{n}\rtexpr{v_{pre},\Psi;\emptyset}{[\cdot]} } \\ \trg{ \rtexpr{\Psi;x\mapsto v_{pre}}{\overline{c_{foo}}} \nstepsarrow{}{n'} \rtexpr{v_{foo},\Psi;x\mapsto v_{pre}}{[\cdot]} } \\ \trg{ \rtexpr{v_{foo},\Psi;\emptyset}{\overline{c_{post}}} \nstepsarrow{}{n''}\rtexpr{v_{post},[\cdot];\emptyset}{[\cdot]}}\end{array}$ $\trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot End\ v_{post}}\rtexpr{v_{post},[\cdot];\emptyset}{[\cdot]} }$ $\text{\footnotesize($\trg{T}$-eprog-success)}$ $\begin{array}{ c } \trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{⚡}{n}\rtexpr{}{⚡} }\end{array}$ $\trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot ⚡}\rtexpr{}{⚡} }$ $\text{\footnotesize($\trg{T}$-eprog-fail0)}$ $\begin{array}{ c } \trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{}{n}\rtexpr{v_{pre},\Psi;\emptyset}{[\cdot]} } \\ \trg{ \rtexpr{\Psi;x\mapsto v_{pre}}{\overline{c_{foo}}} \nstepsarrow{⚡}{n'} \rtexpr{}{⚡} }\end{array}$ $\trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot Call\ v_{pre}\cdot ⚡}\rtexpr{}{⚡} }$ $\text{\footnotesize($\trg{T}$-eprog-fail1)}$ $\begin{array}{ c } \trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{}{n}\rtexpr{v_{pre},\Psi;\emptyset}{[\cdot]} } \\ \trg{ \rtexpr{\Psi;x\mapsto v_{pre}}{\overline{c_{foo}}} \nstepsarrow{}{n'} \rtexpr{v_{foo},\Psi;x\mapsto v_{pre}}{[\cdot]} } \\ \trg{ \rtexpr{v_{foo},\Psi;\emptyset}{\overline{c_{post}}} \nstepsarrow{⚡}{n''}\rtexpr{}{⚡}}\end{array}$ $\trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot ⚡}\rtexpr{}{⚡} }$ $\text{\footnotesize($\trg{T}$-eprog-fail2)}$

(** Defines top-level program execution. *)
Inductive progstep : prog -> tracepref -> rtprog -> Prop :=
| prog_step_success : forall (x__foo y : vart) (cmds__pre cmds__post cmds__foo : list cmd) (foo : symbol) (v v__pre v__foo : value)
                           (Ψ : stack) (n__pre n__foo n__post : nat),
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    x__foo = vart_of_symbol foo ->
    cmds__foo = body_of_symbol foo ->
    (* setup the argument for the function call *)
    s(nil ; None ▷ cmds__pre) =( n__pre )=[ nil ]==> (s(Sval v__pre :: Ψ ; None ▷ nil)) ->
    (* do the function call *)
    s(Ψ ; Some (x__foo, v__pre) ▷ cmds__foo) =( n__foo )=[ nil ]==> s(Sval v__foo :: Ψ ; Some (x__foo, v__pre) ▷ nil) ->
    (* something may happen after the function call *)
    s(Sval v__foo :: Ψ ; None ▷ cmds__post) =( n__post )=[ nil ]==> s(Sval v :: nil ; None ▷ nil) ->
    progstep (Cprog (LContext y cmds__pre cmds__post) foo)
             (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Send (sv v) :: nil)
             (RPTerm (Sval v :: nil) None nil)
| prog_step_failure0 : forall (y : vart) (cmds__pre cmds__post : list cmd) (foo : symbol) (n__pre : nat),
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    (* setup the argument for the function call ; fails *)
    s(nil ; None ▷ cmds__pre) =( n__pre )=[ Scrash :: nil ]==> (▷↯) ->
    progstep (Cprog (LContext y cmds__pre cmds__post) foo)
             (Sstart :: Scrash :: nil)
             (RPCrash)
| prog_step_failure1 : forall (x__foo y : vart) (cmds__pre cmds__post cmds__foo : list cmd) (foo : symbol) (v__pre : value)
                            (Ψ__pre : stack) (n__pre n__foo : nat),
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    x__foo = vart_of_symbol foo ->
    cmds__foo = body_of_symbol foo ->
    (* setup the argument for the function call *)
    s(nil ; None ▷ cmds__pre) =( n__pre )=[ nil ]==> (s(Sval v__pre :: Ψ__pre ; None ▷ nil)) ->
    (* do the function call ; fails *)
    s(Ψ__pre ; Some(x__foo, v__pre) ▷ cmds__foo) =( n__foo )=[ Scrash :: nil ]==> (▷↯) ->
    progstep (Cprog (LContext y cmds__pre cmds__post) foo)
             (Sstart :: Scall Qctxtocomp (sv v__pre) :: Scrash :: nil)
             (RPCrash)
| prog_step_failure2 : forall (x__foo y : vart) (cmds__pre cmds__post cmds__foo : list cmd) (foo : symbol) (v__pre v__foo : value)
                            (Ψ__pre : stack) (n__pre n__foo n__post : nat),
    (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
       To actually emit traces, we just hardcode them at the top-level... *)
    x__foo = vart_of_symbol foo ->
    cmds__foo = body_of_symbol foo ->
    (* setup the argument for the function call *)
    s(nil ; None ▷ cmds__pre) =( n__pre )=[ nil ]==> (s(Sval v__pre :: Ψ__pre ; None ▷ nil)) ->
    (* do the function call *)
    s(Ψ__pre ; Some(x__foo, v__pre) ▷ cmds__foo) =( n__foo )=[ nil ]==> s(Sval v__foo :: Ψ__pre ; Some(x__foo, v__pre) ▷ nil) ->
    (* something may happen after the function call; fails *)
    s(Sval v__foo :: Ψ__pre ; None ▷ cmds__post) =( n__post )=[ Scrash :: nil ]==> (▷↯) ->
    progstep (Cprog (LContext y cmds__pre cmds__post) foo)
             (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Scrash :: nil)
             (RPCrash)
.

Compiler from $\src{S}$ to $\trg{T}$

The compilation of values is a trivial recoloring, although one could map $\src{true}$ to $\trg{false}$ and $\src{false}$ to $\trg{true}$, or some number, to complicate matters (not done here, because why??).

$$ \begin{aligned} \mi{(\src{v}\text{ Compiler})}\ & \comp{\src{S}}{\trg{T}}{\src{true}}_v =\ \trg{true} \\ & \comp{\src{S}}{\trg{T}}{\src{false}}_v =\ \trg{false} \\ & \comp{\src{S}}{\trg{T}}{\src{n}}_v =\ \trg{n} \end{aligned} $$

(** Compiling values is just a "recoloring", nothing interesting happening here *)
Definition compile_v (v : S.value) : T.value :=
  match v with
  | S.Vnat n => T.Vnat n
  | S.Vbool b => T.Vbool b
  end
.

Now, for expressions, most parts are simple, since they do not involve any structurual recursion with regards to the expression syntax. By that metric, the binary operations as well as the conditionals are the interesting cases.

$$ \begin{array}{rrcl} &\mi{(\src{e}\text{ Compiler})}&&\\[1em] & \comp{\src{S}}{\trg{T}}{\src{val\ v}}_e &=&\ \trg{push}\ \comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{,[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{var\ x}}_e &=&\ \trg{x,[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{abort}}_e &=&\ \trg{abort,[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e &=&\ \comp{\src{S}}{\trg{T}}{\src{e_2}}_e,\comp{\src{S}}{\trg{T}}{\src{e_1}}_e,\trg{bop\ \oplus},\trg{[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{if\ e_1\ then\ e_2\ else\ e_3}}_e &=&\ \trg{quote\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,if,}\\ & && \ \trg{quote\ }\comp{\src{S}}{\trg{T}}{\src{e_3}}_e\trg{,negb,}\trg{if,[\cdot]} \\ \end{array} $$

(** Compiling expressions the more or less standard way when compiling to stack-based languages *)
Fixpoint compile_e (e : S.expr) : list T.cmd :=
  match e with
  | S.Xval v => T.Cpush (compile_v v) :: nil
  | S.Xvar x => T.Cget x :: nil
  | S.Xbinop symb e1 e2 => compile_e e1 ++ compile_e e2 ++ T.Cbop symb :: nil
  | S.Xif e1 e2 e3 => T.Cquote (compile_e e2) :: compile_e e1 ++ T.Cif
                    :: T.Cquote (compile_e e3) :: compile_e e1 ++ T.Cnegb :: T.Cif :: nil
  | S.Xabort => T.Cabort :: nil
  end
.

The compilation of binary expressions converts from infix-style to somewhat reverse polish. Importantly (and contrary to reverse polish), the second subexpression needs to be pushed on the stack prior to the first subexpression. Consider the compilation of $\src{val\ 1+val\ 2}$, which will result in the list of commands $\trg{push\ 2,push\ 1,bop\ +}$. Starting with the empty stack and empty heap, by transitivity of $\trg{\stepsarrow{\trace}}$ and applying $\trg{T}$-e-push twice and concluding with $\trg{T}$-e-bop , we have: $$ \begin{array}{lc} \trg{\rtexpr{[\cdot];\emptyset}{push\ 2,push\ 1,bop\ +,[\cdot]}}&\trg{\stepsarrow{}}\\ \trg{\rtexpr{2,[\cdot];\emptyset}{push\ 1,bop\ +,[\cdot]}}&\trg{\stepsarrow{}}\\ \trg{\rtexpr{1,2,[\cdot];\emptyset}{bop\ +,[\cdot]}}&\trg{\stepsarrow{}}\\ \trg{\rtexpr{3,[\cdot];\emptyset}{[\cdot]}}& \end{array} $$ Similarily for conditionals. In contrast to $\src{S}$, where conditionals always have both consequence and alternative cases, the conditional in $\trg{T}$ only has a consequence. Because of this, a single $\src{S}$ conditional compiles into two $\trg{T}$ conditionals, where the second one uses the negated version of the compilation of the conditional.

Finally, compilation of components relies on the $\src{S}$ components to be well-typed, i.e., there is some function $\operatorname{check}$ which satisfies the equivalence $\operatorname{check}(\src{p})=\src{\tau_1\to\tau_2}$ iff $\vdash_p\src{p}:\src{\tau_1\to\tau_2}$.

$$ \begin{aligned} \mi{(\text{Wrapper})}&\\ &\operatorname{wrap}(\src{x})=\trg{quote\ [abort,[\cdot]],get\ }\comp{\src{S}}{\trg{T}}{\src{x}}\trg{, has\ \nat, negb, if, [\cdot]}\\[1em] \mi{(\src{p}\text{ Compiler})}&\\[-2ex] &\comp{\src{S}}{\trg{T}}{(\src{x};\src{\tau_1};\src{\tau_2};\src{e_{bdy}})}_v = \ \begin{cases} \operatorname{wrap}(\src{x})\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_{bdy}}}\trg{,[\cdot]} & \text{if }\operatorname{check}((\src{x};\src{\tau_1};\src{\tau_2};\src{e_{bdy}}))=\src{\tau_1\to\tau_2} \\ \trg{abort,[\cdot]} & \text{otherwise} \end{cases} \\ \end{aligned} $$

(** Assume there exists an implementation of a typechecker for S. *)
(** This is a nice exercise to try and extend S/static.v with a `Fixpoint`
    defining the `check` axiom as well as a lemma proving `check_equiv_Scheck`. *)
Axiom check : S.symbol -> bool.
Axiom check_equiv_Scheck : forall (foo : S.symbol), check foo = true <-> S.check_symb foo.

(** To compile symbols, the compiler has to ensure that the target context cannot pass ill-typed arguments. *)
(** Furthermore, it requires that the component is well-typed. *)
Definition compile_symb' (foo : S.symbol) : T.symbol :=
  let x__foo := S.vart_of_symbol foo in
  let e__foo := S.expr_of_symbol foo in
  let wrapper := T.Cquote (T.Cabort :: nil) :: T.Cget x__foo
    :: T.Chas T.Tnat :: T.Cnegb :: T.Cif :: nil in
  (x__foo, wrapper ++ compile_e e__foo)
.
Definition compile_symb (foo : S.symbol) : T.symbol :=
  if check foo then
    compile_symb' foo
  else
    (S.vart_of_symbol foo, T.Cabort :: nil)
.

Crucially, the compiler inserts wrapper code to ensure that the component is robust with respect to well-typing. Consider an insecure compiler (that is nonetheless a correct compiler) which does not insert the wrapper code. Suppose it compiles the $\src{S}$-level component $(\src{x};\src{\nat};\src{\nat};\src{val\ 1 - var\ x})$. The compiled component of this is $(\trg{x};\trg{get\ x,push\ 1,bop\ -,[\cdot]})$. Now, note that there is no possible $\src{S}$-level context $\src{C}$ that when linked to this component renders the resulting whole program ill-typed (see $\src{S}$-t-whole and $\src{S}$-t-context ). However, consider the $\trg{T}$-level context $(\trg{y};\trg{true};\trg{y})$ and observe the component-level execution (see $\trg{T}$-eprog-success ) of the whole program that results from linking that context with the above compiled component: $$ \begin{array}{lc} \trg{\rtexpr{\Psi;x\mapsto true}{get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{true,\Psi;x\mapsto true}{push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{1,true,\Psi;x\mapsto true}{bop\ -,[\cdot]} } & \text{stuck!} \end{array} $$ The rule $\trg{T}$-e-bop is the only reasonable candidate to reduce this further, but it does not apply, since binary operations with mixed arguments (one a number, the other a boolean) are left undefined. Put differently, the values mismatch, i.e., there is a (dynamic) typing error! So, with this, it is easy to see that the compiler without the wrapper code is not enough to preserve type safety: While type safety does hold robustly in $\src{S}$, it does not in $\trg{T}$. So, the compiler has to do something to ensure the robust preservation of type safety.

Let's see what happens with the execution of the component when the wrapper code is prepended to the component, the bad context stays the same: $$ \begin{array}{lc} \trg{\rtexpr{\Psi;x\mapsto true}{quote\ [abort,[\cdot]],get\ x,has\ \nat,negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{[abort,[\cdot]],\Psi;x\mapsto true}{get\ x,has\ \nat,negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{true,[abort,[\cdot]],\Psi;x\mapsto true}{has\ \nat,negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{false,[abort,[\cdot]],\Psi;x\mapsto true}{negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{true,[abort,[\cdot]],\Psi;x\mapsto true}{if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{\Psi;x\mapsto true}{abort,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{⚡}} \\ \trg{\rtexpr{}{⚡} } & \\ \end{array} $$ With the wrapper code, the compiler prevents the execution of any component-level code past this wrapper if the precondition that the argument has type "number" is not met.

The rest of this post now attempts to sketch a proof that this compiler can indeed be considered "secure", because it robustly preserves our property of interest: well-typing.

Simulations

An incorrect, but secure, compiler is a questionable goal. $\src{S}$ components could always compile to $\trg{abort}$ and - heyoho - security! But it is immediate that such a compiler is undesirable. That's why we also take a small detour to compiler correctness, which is actually a spectrum ( Citation: Patterson et.al. 2019 Patterson, Daniel and Ahmed, Amal. 2019. The next 700 Compiler Correctness Theorems (Functional Pearl). ) and not a binary property (like, it's not just "yeah it is correct", it's also things like "it's kinda correct").

Intuitively, what is wanted from compiler correctness, is that "anything the source does, the compiled version also does". This correspondence cannot easily happen in lockstep: either source or target language have different features and the execution may differ for a bit. For example, consider a relation between runtime expressions and runtime programs ($\approx$) which, in whatever sense, shall mean "these two are the quote-on-quote same". Then, the following figure depicts a very simple situation in languages $\src{S}$ and $\trg{T}$ where the $\src{S}$ program takes less steps than the accompanying $\trg{T}$ program. Each downward arrow is a $\ectxsteparrow{\varepsilon}$-step in either $\src{S}$ or $\trg{T}$.

Because of these kinds of discrepancies, there are several different flavors of simulation diagrams in literature. For our purposes, we are only interested in a straightforward forward simulation (😅) as well as a backward simulation. The forward simulation intuitively captures "the source does steps, so does the compiled version" and the backward simulation is dual, i.e., "the compiled version does steps, so does the source". We'll see their lemma statements and a proofsketch for both of these in the following subsections.

Forward Simulation

Suppose we have an execution $\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}}$ that reaches the final runtime expression $\src{\rtexpr{}{v}}$ in $n$ steps. Morally, the result we want is that there is some $\trg{n'}$ such that $\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n'}}\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{v}}_e$. But this does not work out even in one of the most simple cases where $\src{e}=\src{val\ 1}$: $$ \begin{array}{lc|lc} \src{\triangleright\ val\ 1} & \src{\nstepsarrow{}{0}} & \trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{val\ 1}}_e & = \\ \src{\triangleright\ val\ 1} & & \trg{\rtexpr{[\cdot];\emptyset}{push\ 1}} & \trg{\nstepsarrow{}{1}} \\ &&\trg{\rtexpr{1,[\cdot];\emptyset}{[\cdot]}} & \trg{\nstepsarrow{}{0}} \\ &&\trg{\rtexpr{1,[\cdot];\emptyset}{[\cdot]}} & \\ \end{array} $$

While the $\src{S}$-level execution is reflexive on $\src{\rtexpr{}{1}}$, the corresponding, compiled version does not have a reflexive $\trg{T}$-level execution. There, the final state is actually $\trg{\rtexpr{1,[\cdot]}{[\cdot]}}$. This is easy to fix: what we actually want is that there exists an $\trg{n}$ such that $\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n}}\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$.

Let's try to prove:

Lem. Forward Simulation (1st try)$$\begin{array}{l}\text{If }\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}},\\\text{then }\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}\end{array}$$
Proof.
Induction on $\src{e}$. This section only lists a few cases.

Case $\src{e}=\src{val\ v_0}$:

Have Comment Want $\src{n}$ : $\text{Number}$ $\src{v,v_0}$ : $\src{S}\text{ value}$ $H_{\src{exec}}$ : $\src{\rtexpr{}{val\ v_0}\nstepsarrow{}{n}\rtexpr{}{val\ v}}$ $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{val\ v_0}}_e\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ $H_{\src{exec}}'$ : $\src{v_0}=\src{v}$ ('invert' $\ H_{\src{exec}}$) $H_{\src{exec}}''$ : $\src{n}=\src{0}$ (unfold $\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$) $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ push\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ (let $\ \trg{n'}=\trg{1}$) $\trg{[\cdot];\emptyset\ \triangleright\ push\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\ectxsteparrow{}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ (by $\ \trg{T}\text{-e-push}$)

Case $\src{e}=\src{e_1\oplus e_2}$:

Have Comment Want $\text{IH}_1$ : $\forall \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_1}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ $\text{IH}_2$ : $\forall \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_2}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ $\src{n}$ : $\text{Number}$ $\src{e_1,e_2}$ : $\src{S}\text{ exprs}$ $\src{v}$ : $\src{S}\text{ value}$ $H_{\src{exec}}$ : $\src{\rtexpr{}{e_1\oplus e_2}\nstepsarrow{}{n}\rtexpr{}{val\ v}}$ $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ $\src{m_1,m_2}$ : $\src{S}\text{ number/value}$ ('invert' $\ H_{\src{exec}}$) $\src{n_1,n_2}$ : $\text{Number}$ $H_{\src{exec_1}}$ : $\src{\rtexpr{}{e_1}\nstepsarrow{}{n_1}\rtexpr{}{val\ m_1}}$ $H_{\src{exec_2}}$ : $\src{\rtexpr{}{e_2}\nstepsarrow{}{n_2}\rtexpr{}{val\ m_2}}$ $H_{\src{exec}}'$ : $\src{n}=1+\src{n_1}+\src{n_2}$ $H_{\src{exec}}''$ : $\src{v}=\text{eval\_binop}(\src{\oplus};\src{m_1};\src{m_2})$ (unfold$\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$) $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ Now, as in the base-case, we would like to partially execute the runtime program $\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,[\cdot]}$. But, given we don't know much about $\src{e_2}$, the only hope is to make use of $\text{IH}_2$ . For this, we specialize $\text{IH}_2$ with $\src{n_2}$ , $\src{m_2}$ , and $H_{\src{exec_2}}$ to obtain $\text{IH}_2'$: $\trg{n_2'}$ : Number $\text{IH}_2'$ : $\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{\nstepsarrow{}{n_{2}'}}\comp{\src{S}}{\trg{T}}{\src{m_{2}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ Let's try and let $\trg{n'}=\trg{n_2'}+\trg{42}$. By "transitivity" of $\trg{\stepsarrow{}{n_0+m_0}}$ (for any $\trg{n_0,m_0}$), the goal can be split into two parts (given a suitable stack $\trg{\Psi}$):
$\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n_2'}\ }\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}$
$\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{42}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$
The interesting failure happens in the first goal . Notice how $\text{IH}_2'$ only tells us that a runtime program that has a list of commands $\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,[\cdot]}$ goes to $\trg{[\cdot]}$. Unfortunately, in the case of the goal , however, we want to show that a runtime program with $\comp{\src{S}}{\trg{T}}{\src{e_2}}_e$ and some remaining list of commands, namely $\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}$, goes to another runtime program with that same remaining list of commands. The proof is stuck.

The proof attempt goes by induction and suffers from the fact that the statement of the lemma is too specific! (lists of commands need to have exactly this shape...) A way to fix this is to just generalize the lemma.

Lem. Forward Simulation (2nd try)$$\begin{array}{l}\text{If }\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}},\\\text{then }\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{,\overline{c}}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c}}}\end{array}$$
Proof.
Induction on $\src{e}$. This section continues where the attempted proof earlier failed.

Case $\src{e}=\src{e_1\oplus e_2}$:

Have Comment Want $\text{IH}_1$ : $\forall \trg{\overline{c_{IH}}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_1}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c_{IH}},[\cdot]}}$ $\text{IH}_2$ : $\forall \trg{\overline{c_{IH}}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_2}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c_{IH}},[\cdot]}}$ Both $\text{IH}_1$ and $\text{IH}_2$ are more general this time. $\src{n}$ : $\text{Number}$ $\src{e_1,e_2}$ : $\src{S}\text{ exprs}$ $\src{v}$ : $\src{S}\text{ value}$ $H_{\src{exec}}$ : $\src{\rtexpr{}{e_1\oplus e_2}\nstepsarrow{}{n}\rtexpr{}{val\ v}}$ $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e\trg{,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}$ $\src{m_1,m_2}$ : $\src{S}\text{ number/value}$ ('invert' $\ H_{\src{exec}}$) $\src{n_1,n_2}$ : $\text{Number}$ $H_{\src{exec_1}}$ : $\src{\rtexpr{}{e_1}\nstepsarrow{}{n_1}\rtexpr{}{val\ m_1}}$ $H_{\src{exec_2}}$ : $\src{\rtexpr{}{e_2}\nstepsarrow{}{n_2}\rtexpr{}{val\ m_2}}$ $H_{\src{exec}}'$ : $\src{n}=1+\src{n_1}+\src{n_2}$ $H_{\src{exec}}''$ : $\src{v}=\text{eval\_binop}(\src{\oplus};\src{m_1};\src{m_2})$ (unfold$\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$) $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}$ As earlier, we specialize $\text{IH}_2$ , but this time with $\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}$, $\src{n_2}$ , $\src{m_2}$ , and $H_{\src{exec_2}}$ to obtain $\text{IH}_2'$. Specialize $\text{IH}_1'$ in similar fashion but with list of commands $\trg{bop\ \oplus,[\cdot]}$ and obtain: $\trg{n_1',n_2'}$ : Number $\text{IH}_2'$ : $\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{2}'}}\comp{\src{S}}{\trg{T}}{\src{m_{2}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{,}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}$ $\text{IH}_1'$ : $\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{1}'}}\comp{\src{S}}{\trg{T}}{\src{m_{1}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{bop\ \oplus,\overline{c},[\cdot]}}$ $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}$ The proof continues by setting $\trg{n'}=\trg{n_2'+(n_1'+1)}$ obtaining the goal $\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n_2'+(n_1'+1)}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$. By "transitivity" of $\trg{\nstepsarrow{}{n_0+m_0}}$ (for any $\trg{n_0,m_0}$), the goal can be split (as done in the other attempt, this time setting the stack to $\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot]}$):
$\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_2'}\ }\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}$
$\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_1'+1}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}$
The first goal has exactly the shape of $\text{IH}_2'$ , so this follows by assumption. In order to prove the other goal , split again by "transitivity" (using stack $\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_1}}\trg{,[\cdot]}$):
$\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_1'}\ }\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_1}}\trg{\rtexpr{,[\cdot];\emptyset}{bop\ \oplus,\overline{c},[\cdot]}}$
$\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_1}}\trg{,[\cdot];\emptyset\ \triangleright\ bop\ \oplus,\overline{c},[\cdot]}\trg{\ectxsteparrow{}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}$
Second goal goes by $\trg{T}\text{-e-bop}$ and $H_{\src{exec}}''$ . For the other goal , we would really like to apply $\text{IH}_1'$ . But, notice how the induction hypothesis requires an empty stack $\trg{[\cdot]}$, while the current state of the stack for this goal is $\comp{\src{S}}{\trg{T}}{\src{m_2}}_2\trg{,[\cdot]}$. So, the proof is again stuck, because the induction hypothesis is not general enough.

Let's try again, this time also generalizing the whole statement on stacks:

Lem. Forward Simulation (3rd try)$$\begin{array}{l}\text{If }\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}},\\\text{then }\exists\trg{n'},\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{,\overline{c}}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,\Psi;\emptyset}{\overline{c}}}\end{array}$$
Proof.
Induction on $\src{e}$.

Case $\src{e}=\src{e_1\oplus e_2}$:

Have Comment Want $\text{IH}_1$ : $\forall \trg{\overline{c_{IH}}}\ \trg{\Psi_{IH}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_1}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{\Psi_{IH};\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,\Psi_{IH};\emptyset}{\overline{c_{IH}},[\cdot]}}$ $\text{IH}_2$ : $\forall \trg{\overline{c_{IH}}}\ \trg{\Psi_{IH}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_2}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{\Psi_{IH};\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,\Psi_{IH};\emptyset}{\overline{c_{IH}},[\cdot]}}$ $\text{IH}_1$ and $\text{IH}_2$ are, again, more general. $\src{n}$ : $\text{Number}$ $\src{e_1,e_2}$ : $\src{S}\text{ exprs}$ $\src{v}$ : $\src{S}\text{ value}$ $H_{\src{exec}}$ : $\src{\rtexpr{}{e_1\oplus e_2}\nstepsarrow{}{n}\rtexpr{}{val\ v}}$ $\exists\trg{n'},\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e\trg{,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,\Psi;\emptyset}{\overline{c},[\cdot]}}$ $\src{m_1,m_2}$ : $\src{S}\text{ number/value}$ ('invert' $\ H_{\src{exec}}$) $\src{n_1,n_2}$ : $\text{Number}$ $H_{\src{exec_1}}$ : $\src{\rtexpr{}{e_1}\nstepsarrow{}{n_1}\rtexpr{}{val\ m_1}}$ $H_{\src{exec_2}}$ : $\src{\rtexpr{}{e_2}\nstepsarrow{}{n_2}\rtexpr{}{val\ m_2}}$ $H_{\src{exec}}'$ : $\src{n}=1+\src{n_1}+\src{n_2}$ $H_{\src{exec}}''$ : $\src{v}=\text{eval\_binop}(\src{\oplus};\src{m_1};\src{m_2})$ (unfold$\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$) $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}$ The specialization of $\text{IH}_2$ is as earlier, but with the stack $\trg{[\cdot]}$. As for $\text{IH}_1'$ , the specialization uses the stack $\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot]}$: $\trg{n_1',n_2'}$ : Number $\text{IH}_2'$ : $\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{2}'}}\comp{\src{S}}{\trg{T}}{\src{m_{2}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{,}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}$ $\text{IH}_1'$ : $\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{\rtexpr{,[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{1}'}}\comp{\src{S}}{\trg{T}}{\src{m_{1}}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{\rtexpr{,[\cdot];\emptyset}{bop\ \oplus,\overline{c},[\cdot]}}$ $\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}$ The rest now follows as expected. It's exactly the same as in the previous proof attempt, but this time, the assumptions apply, making the proof go through.

This statement is finally general enough so that we can prove it with an induction. A few issues are left from a "logical" perspective: First, the lemma statement is hardcoded to use an empty heap $\trg{\emptyset}$, so it is not applicable for cases where we want a non-empty heap. The solution to this is to reformulate the lemma, generalizing on heaps. The proof stays exactly the same up to instantiating the heap in the induction hypotheses. Second, the lemma statement is also hardcoded to executions that do not emit any events whatsoever, i.e., non-crashing executions. Likewise, the final runtime state should be generalized to be either $\src{\triangleright\ v}$ (as before) or $\src{\triangleright} \text{⚡}$ (crash). To do so, it is necessary to define a compiler on runtime states $\src{r}$ giving runtime programs $\trg{r}$.

These technical complications are not interesting enough to warrant inclusion in this blog-post, however, it should be at least mentioned that the given forward simulation statement is not the most general/usable one. For example, it can (and probably should) be further generalized by accounting for arbitrary substitutions into $\src{e}$. The interested reader is invited to take a look at the mechanization, where these details are being taken care of.

Backward Simulation

While the intuition for backward simulation is "the compiled version does steps, so does the source", we need to constrain programs to safe programs. This is a reasonable constraint, however, since practical compiler implementations include, e.g., a typechecker that ensures that only safe programs are compiled. As long as the compiler for values is injective and the execution of $\trg{T}$ deterministic (normalforms are unique), backward simulation is implied by forward simulation:

Lem. Backward Simulation$$\begin{array}{l}\text{If }\src{\Gamma\vdash e:\tau}\\\text{and }\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{,[\cdot]}\trg{\nstepsarrow{}{n}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}},\\\text{then }\exists\src{n'},\src{\rtexpr{}{e}\nstepsarrow{}{n'}\rtexpr{}{v}}\end{array}$$
Proof.
Given a well-typed program, there is an execution that either produces a value or crashes.

Normal forms are unique and the number of steps necessary to reach them is always the same.

Have Comment Want $H_{\src{S}-\text{checks}}$ : $\src{\Gamma}\vdash\src{e}:\src{\tau}$ $H_{\trg{T}-exec}$ : $\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n}}\comp{\src{S}}{\trg{T}}{\src{v}}\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ $\exists\src{n'},\src{\rtexpr{}{e}\nstepsarrow{}{n'}\rtexpr{}{v}}$ (apply 'typesafety' on $H_{\src{S}-\text{checks}}$) (only showing the case with value as result) $\src{v'}$ : $\src{S}$-value $H_{\src{S}-exec}$ : $\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v'}}$ $\trg{n'}$ : Number (apply forward sim. on $H_{\src{S}-exec}$) $H_{\trg{T}-exec}'$ : $\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n'}}\comp{\src{S}}{\trg{T}}{\src{v'}}\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}$ (by determinism on $H_{\trg{T}-exec}$ and $H_{\trg{T}-exec}'$) $H_{eq1}$ : $\trg{n}=\trg{n'}$ $H_{eq2}$ : $\comp{\src{S}}{\trg{T}}{\src{v}}_v=\comp{\src{S}}{\trg{T}}{\src{v'}}_v$ (apply 'value compiler is injective' on $H_{eq2}$) $H_{eq3}$ : $\src{v}=\src{v'}$ (let $\ \src{n'}=\src{n}$) $\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}}$ (goal is solved by $H_{\src{S}-exec}$) The other case left out (where the well-typed program yields a crash) will eventually contradict with the fact that it is assumed that it does not crash in the target. i.e., apply a general form of forward simulation that also accounts for crashing programs, obtaining that the program crashes in the target, contradicting (due to deterministic semantics of $\trg{T}$) the assumed fact that it does produce a value in the target.

Backtranslation of Traces

The goal is to complete the earlier proofsketch of RTP . For this, it is necessary to build a source-level context $\src{C}$ that generates the same trace as the given target-level execution. Given the simple setup, there are only four possible cases:

Nothing crashes.
After returning from the component, the context fails.
The component fails.
The context fails before calling the component.

With the information on the trace, it is relatively easy to define the backtranslation, generating an $\src{S}$-level context from the trace:

$$ \begin{aligned} \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ n_0} \cdot \trg{Ret\ !\ n_1} \cdot \trg{End\ v} \right) & = & \src{(y;}\uparrow_v(\trg{n_0})\src{;}\uparrow_v(\trg{v})\src{)} \\ \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ n_0} \cdot \trg{Ret\ !\ n_1} \cdot \trg{⚡} \right) & = & \src{(y;}\uparrow_v(\trg{n_0})\src{; abort())} \\ \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ n_0} \cdot \trg{⚡} \right) & = & \src{(y;}\uparrow_v(\trg{n_0})\src{; 42)} \\ \uparrow \left( \trg{Start} \cdot \trg{⚡} \right) & = & \src{(y; abort(); 42)} \\ \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ b} \cdot \trg{⚡} \right) & = & \src{(y;abort(); 42)} \\ \end{aligned} $$

(** The backtranslation generates an S level context from a trace. *)
Definition bt (As : tracepref) : option S.LinkageContext :=
  match As with
  | Sstart ::
    Scrash :: nil => Some(S.LContext "y" (S.Xabort) S.Xabort)
  | Sstart ::
    Scall Qctxtocomp (inl v__pre) ::
    Scrash :: nil => Some(S.LContext "y" (S.Xval(S.Vnat v__pre)) (S.Xval(S.Vnat 5)))
  | Sstart ::
    Scall Qctxtocomp (inl v__pre) ::
    Sret Qcomptoctx _ ::
    Scrash :: nil => Some(S.LContext "y"
                          ((S.Xval (S.Vnat v__pre)))
                          S.Xabort)
  | Sstart ::
    Scall Qctxtocomp (inl v__pre) ::
    Sret Qcomptoctx _ ::
    Send v__end :: nil => Some(S.LContext "y" (S.Xval(S.Vnat v__pre)) (S.Xval(S.vs v__end)))
  | Sstart ::
    Scall Qctxtocomp (inr v__pre) ::
    _ => Some(S.LContext "y" (S.Xabort) (S.Xval(S.Vnat 5)))
  | _ => None
  end%string
.

$\uparrow_v$ is just the inverse of the value compiler . Now, suppose we have the component $\src{(x;var\ x+1)}$. Because of the way the static typing of $\src{S}$ is setup, it is impossible to call it with a boolean value. As established earlier, the compiler inserts wrapper code to ensure this static property dynamically, since $\trg{T}$ does not have any static typing, but we still want to forbid certain $\trg{T}$-level contexts to call our component. The compiled version of the component is $\trg{(x;quote\ [abort(),[\cdot]],get\ x,has\ \mathbb{N},negb,if,get\ x,push\ 1,bop\ +,[\cdot])}$. Consider linking and running it with the context $\trg{(y;true;get\ y)}$. The resulting trace is $\trg{Start\cdot Call\ ?\ true\cdot ⚡}$. Now, we "want" a source-level context that gives "the same" trace in order to prove robust preservation . But, as (re-)established just now, language $\src{S}$ is statically typed and ensures type-safety that way, so we cannot construct a context that exhibits the same behavior without violating well-typing, i.e., the program resulting from linking the component and the context generated by the backtranslation is not well-typed. The backtranslation tries to detect these ill-typed contexts and generates an $\src{S}$-level context that crashes immediately, i.e., $\src{(y;abort();42)}$. Unfortunately, linking and running this gives a different trace, since the component is now not called at all: $\src{Start\cdot ⚡}$.

So, we've got a situation where we have a target execution giving trace $\trg{Start\cdot Call\ ?\ true\cdot ⚡}$ and a source execution giving trace $\src{Start\cdot ⚡}$. We want these traces to be considered equal, because in both of them, the component-level code does not run at all, either the wrapper or the context resulting from the backtranslation will halt program execution.

Moreover, this situation gives rise to another technicality. Consider the component $\src{(x;abort())}$ and compile, link, and run it with context $\trg{(y;push\ 1;get\ y)}$. Clearly, the resulting trace is $\trg{Start\cdot Call\ ?\ 1\cdot ⚡}$. But, this time, the backtranslation generates context $\src{(y;val\ 1;var\ y)}$, the resulting $\src{S}$-level program is obviously well-typed. In the previously described situtation, we also obtain a trace where the component crashes, but here, code of the component is actually being run!

Anyhow, we proceed to define what it means for traces to be considered equal:

$\begin{array}{ ccc } \src{v_0}=\uparrow_v(\trg{v_0}) & \src{v_1}=\uparrow_v(\trg{v_1}) & \src{v_2}=\uparrow_v(\trg{v_2})\end{array}$ $\trg{Start} \cdot \trg{Call\ ?\ v_0} \cdot \trg{Ret\ !\ v_1} \cdot \trg{End\ v_2} \approx \src{Start} \cdot \src{Call\ ?\ v_0} \cdot \src{Ret\ !\ v_1} \cdot \src{End\ v_2}$ $\text{\footnotesize(trace-eq-ok)}$ $\begin{array}{ cc } \src{v_0}=\uparrow_v(\trg{v_0}) & \src{v_1}=\uparrow_v(\trg{v_1})\end{array}$ $\trg{Start} \cdot \trg{Call\ ?\ v_0} \cdot \trg{Ret\ !\ v_1} \cdot \trg{⚡} \approx \src{Start} \cdot \src{Call\ ?\ v_0} \cdot \src{Ret\ !\ v_1} \cdot \src{⚡}$ $\text{\footnotesize(trace-eq-fail0)}$ $\begin{array}{ c } \src{v_0}=\uparrow_v(\trg{v_0})\end{array}$ $\trg{Start} \cdot \trg{Call\ ?\ v_0} \cdot \trg{⚡} \approx \src{Start} \cdot \src{Call\ ?\ v_0} \cdot \src{⚡}$ $\text{\footnotesize(trace-eq-fail1)}$ $\trg{Start} \cdot \trg{⚡} \approx \src{Start} \cdot \src{⚡}$ $\text{\footnotesize(trace-eq-fail2)}$ $\trg{Start} \cdot \trg{Call\ ?\ v} \cdot \trg{⚡} \approx \src{Start} \cdot \src{⚡}$ $\text{\footnotesize(trace-eq-fail3)}$

(** We need a different kind of trace equality than strict equality.
    The backtranslated context may fail early if the component fails, e.g.,
    a context tries to pass a `bool` and the compiler wrapper fails.
    This, in turn, would mean that the backtranslation cannot generated a well-typed S context,
    so the only option is to abort. However, the abort then happens prior to
    calling into the component, so there is mismatch in the generated traces
    that we "need" to account for by using a different kind of "equality".
*)
Inductive trace_eq : tracepref -> tracepref -> Prop :=
| empty_eq : trace_eq nil nil 
| nocrash_eq : forall (v0 v1 v2 : nat + bool),
    trace_eq (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Send v2 :: nil)
             (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Send v2 :: nil) 
| crashctx1_eq :
    trace_eq (Sstart :: Scrash :: nil)
             (Sstart :: Scrash :: nil) 
| crashctx2_eq : forall (v : nat + bool),
    trace_eq (Sstart :: Scall Qctxtocomp v :: Scrash :: nil)
             (Sstart :: Scrash :: nil) 
| crashcomp_eq : forall (v : nat + bool),
    trace_eq (Sstart :: Scall Qctxtocomp v :: Scrash :: nil)
             (Sstart :: Scall Qctxtocomp v :: Scrash :: nil) 
| crashctx3_eq : forall (v0 v1 : nat + bool),
    trace_eq (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Scrash :: nil)
             (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Scrash :: nil) 
.
#[local]
Infix "≂" := (trace_eq) (at level 81).
Local Hint Constructors trace_eq : core.

For technical reasons concerned with this blog-post, we'll need an axiom about this equality:

Def. Trace Equality Preserves
$$ \text{If }\trg{\trace}\approx\src{\trace},\text{ then }\trg{\trace}\in\pi\leftrightarrow\src{\trace}\in\pi $$

Extending Robust Trace Preservation with Trace Relation

Recall the definition of RTP and the first few proof steps we did. As of now, the definition does not account for custom trace equalities. That's why the definition needs an extension to account for this, as done in existing work ( Citation: Abate et.al. 2021 Abate, Carmine and Blanco, Roberto and Ciobâcă, Ştefan and Durier, Adrien and Garg, Deepak and Hriţcu, Cătălin and Patrignani, Marco and Tanter, Éric and Thibault, Jérémy. 2021. An Extended Account of Trace-Relating Compiler Correctness and Secure Compilation. ) .

For any property $\ \pi\ $ and $\ \src{S}$-level program $\ \src{p}\ $, if $\ \forall\src{C}\ \src{\trace}\ \src{r}, \src{prog\ C\ p}\progarrow{\src{\trace}}{}\ \src{r} \implies \src{\trace}\in\pi$, then $\ \forall\trg{C}\ \trg{\trace}\ \trg{r},\ \trg{prog\ C\ }\gamma(\src{p})\progarrow{\trg{\trace}}\trg{r}\implies \src{\trace}\approx\trg{\trace}\text{ and }\trg{\trace}\in\pi$.

(** We define a "weakened" version of RSC that has a different notion of trace equality. *)
Definition rscwt (cc : compiler__ST) :=
  forall (p : S.symbol) (ctx__T : T.LinkageContext) (As : tracepref),
    forall r, T.progstep (T.plug' ctx__T (cc p)) As r ->
    exists r' (ctx__S : S.LinkageContext) As',
      trace_eq As As' /\
      S.progstep (S.plug' ctx__S p) As' r'
.
Notation "'[' '|-' cc ':' 'rscwt' ']'" := (rscwt cc) (at level 81, cc at next level).

The given definition unfolds robust preservation , because it needs to relate the traces hidden "inside it", albeit this statement is to be taken with a grain of salt, since the qantifiers need to change. In the original definition of robust preservation, traces where only quantified inside the robust satisfaction parts, while now at least the scope of the source-level trace leaks into the target-level robust satisfaction in order to relate the source with the target trace. Ultimately, this is the definition that we will prove soon.

Back to Backtranslation

While we have seen the definition for the backtranslation , we do not - yet - know wether it is correct or not. Intuitively, a target-level execution with whatever context and some compiled component, we want to identify a source-level context (this is done with the backtranslation $\uparrow$) which successfully links with the component and yields a trace upon execution that relates to the given one with our custom trace equality $\approx$.

Lem. Backtranslation Correctness$$\begin{array}{l}\text{If }\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{\trace}}\trg{r},\text{ then }\exists \src{C}\ \src{r}\ \src{\trace}, \src{prog\ C\ p}\progarrow{\src{\trace}}\src{r} \text{ and } \trg{\trace}\approx\src{\trace}\end{array}$$
Proof Sketch.
First, perform a case-analysis on the compiler output, which either yields the compiled component code or a singleton $\trg{abort}$ commandlist. We focus in the following on the first. Here, there are four cases, i.e., (1) all ok, (2) context fails after component is done, (3) component fails, (4) or context fails. The failing cases are all somewhat similar, so we chose to sketch only case (1) and case (3).

1st Case. Note that we have assumed an execution that yields a trace of shape $\trg{Start\cdot Call\ ?\ v_0\cdot Ret\ !\ v_1\cdot End\ v_2}$. Instantiate existential with result of backtranslation and trace $\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2})$. Furthermore, split the goal into the execution and trace equality part, since it's a conjunction.

Have Comment Want ... : ... (equality part) (by definition) $\trg{Start\cdot Call\ ?\ v_0\cdot Ret\ !\ v_1\cdot End\ v_2}\approx\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2})$

Have Comment Want (execution part) $\src{p}$ : $\src{S}$-level Component $H_{\src{checks}}$ : $\src{x:\nat\vdash p:\tau}$ (compiler does not fail, so it must be well-typed) $\trg{v_0},\trg{v_1},\trg{v_2}$ : $\trg{T}$-level Values $\trg{C}$ : $\trg{T}$-level Context $\trg{r}$ : $\trg{T}$-level Runtime Program $H_{\trg{exec}}$ : $\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{Start}\cdot\trg{Call\ ?\ v_0}\cdot\trg{Ret\ !\ v_1}\cdot\trg{End\ v_2}}\trg{r}$ $\src{prog\ (y;}\uparrow_v(\trg{v_0})\src{;}\uparrow_v(\trg{v_2})\src{)\ p}\progarrow{\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2})}\comp{\src{S}}{\trg{T}}{\trg{r}}_r$ (invert $\ H_{\trg{exec}}$, i.e., compute the execution up to the call of the component) $n$ : Number $H_{wrap}$ : $\trg{\rtexpr{\Psi;x\mapsto v_0}{}}\operatorname{wrap}(\trg{x})\trg{,}\comp{\src{S}}{\trg{T}}{\src{p}},\trg{[\cdot]}\nstepsarrow{}{n}\trg{\rtexpr{v_1,\Psi;x\mapsto v_0}{[\cdot]}}$ (further computation requires case analysis on wether $\operatorname{wrap}(\trg{x})\ $ fails or not) (wrapper fails) (computation contradicts shape of the trace, which does not fail) $H_{\trg{p}}$ : $\trg{\rtexpr{\Psi;x\mapsto v_0}{}}\comp{\src{S}}{\trg{T}}{\src{p}},\trg{[\cdot]}\nstepsarrow{}{n-5}\trg{\rtexpr{v_1,\Psi;x\mapsto v_0}{[\cdot]}}$ (wrapper not failing) (back-sim. on $\ H_{\trg{p}}\ $ and $\ H_{\src{checks}}$) n' : Number $H_{\src{exec}}$ : $\src{\rtexpr{}{p}}[\text{subst}\ \src{x}\ \text{for}\ \uparrow_v(\trg{v_0})]\nstepsarrow{}{n'}\src{\rtexpr{}{}}\uparrow(\trg{v_1})$ $\src{prog\ (y;}\uparrow_v(\trg{v_0})\src{;}\uparrow_v(\trg{v_2})\src{)\ p}\progarrow{\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2})}\comp{\src{S}}{\trg{T}}{\trg{r}}_r$ (compute goal up to component call) $\exists n'_0,\src{\rtexpr{}{p}}[\text{subst}\ \src{x}\ \text{for}\ \uparrow_v(\trg{v_0})]\nstepsarrow{}{n'_0}\src{\rtexpr{}{}}\uparrow(\trg{v_1})$ (by $\ H_{\src{exec}}$) (the context executions are trivial)

3rd Case. Run the program right up to the wrapper. Then case-analysis on the wether the wrapper fails or not.

does not fail: proceed as in case 1.
does fail: follows by computation.

tedious exercise. see the details in the mechanization.

Final Proof

We conclude the post with the final proof.

Lem. RTP$$\begin{array}{l}\text{For any property } \pi \text{ and } \src{S}\text{-level program } \src{p}, \text{ if } \forall\src{C}\ \src{\trace}\ \src{r}, \src{prog\ C\ p}\progarrow{\src{\trace}}{}\ \src{r} \implies \src{\trace}\in\pi,\\\text{ then } \forall\trg{C}\ \trg{\trace}\ \trg{r}, \trg{prog\ C\ }\gamma(\src{p})\progarrow{\trg{\trace}}\trg{r}\implies \trg{\trace}\approx\src{\trace}\text{ and }\trg{\trace}\in\pi.\end{array}$$

Have Comment Want $\pi$ : $\text{Property}$ $\src{p}$ : $\src{S}\text{-level component}$ $H_1$ : $\begin{array}{l}\forall\src{\trace}\in\text{Trace},\\\forall\src{r}\in\text{Runtime Expressions},\\\forall\src{C}\in\src{S}\text{-level program},\\\text{if }\src{prog\ C\ p}\progarrow{\src{\trace}}\src{r},\\\text{then }\src{\trace}\in\pi\end{array}$ $\begin{array}{l}\forall\trg{C}\in\trg{T}\text{-level program},\\\forall\trg{\trace}\in\text{Trace},\\\forall\trg{r}\in\text{Runtime Program},\\\text{if }\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{\trace}}\trg{r},\\\text{then }\trg{\trace}\approx\src{\trace}\text{ and }\trg{\trace}\in\pi\end{array}$ $\trg{\trace}$ : Trace (assume from goal) $\trg{C}$ : $\trg{T}\text{-level context}$ $H_{\trg{T}\text{-exec}}$ : $\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{\trace}}\trg{r}$ $\trg{\trace}\approx\src{\trace}\wedge\trg{\trace}\in\pi$ (use backtranslation correctness on $H_{\trg{T}\text{-exec}}$) $\src{C}$ : $\src{S}$-level context $\src{r}$ : Runtime program $\src{\trace}$ : Trace $H_{\src{exec}}$ : $\src{prog\ C\ p}\progarrow{\src{\trace}}\src{r}$ $H_{eq}$ : $\trg{\trace}\approx\src{\trace}$ (follows from $H_{eq}$, $H_1$ and $H_{\src{exec}}$, ) (relying on the trace equality axiom)

As a last note, RTP variants usually also have a property-free characterization which can be more elegant to do proofs with. Essentially, you almost always just have to construct the source context and then argue that there exists some kind of source-level execution that behaves similar to the given target-level execution. One does not need the trace equality axiom in the property-free version. Also note that it may be possible to lower this axiom to a lemma whenever one is interested in a concrete property and wants to prove the compiler secure for that concrete property alone. I refrained from presenting it here for "pedagogical reasons" (the post is already long enough).

You can find the Coq development containing all the details here. Thanks for reading and feel free to ping me on socials (preferrably ) for questions! 😅

References

WG14/N1256 (2007)

WG14/N1256. (2007). Programming languages — C [Committee Draft]. (SO/IEC 9899:TC3) WG14/N1256.

Xu, Lu, Du, Ding, Li, Wu, Payer & Mao (2023)

Xu, J., Lu, K., Du, Z., Ding, Z., Li, L., Wu, Q., Payer, M. & Mao, B. (2023). Silent Bugs Matter: A Study of Compiler-Introduced Security Bugs. In 32nd USENIX Security Symposium (USENIX Security 23) (pp. 3655-3672).

Patterson & Ahmed (2019)

Patterson, D. & Ahmed, A. (2019). The next 700 Compiler Correctness Theorems (Functional Pearl). Proc. ACM Program. Lang., 3(ICFP). DOI: 10.1145/3341689

Abate, Blanco, Ciobâcă, Durier, Garg, Hriţcu, Patrignani, Tanter & Thibault (2021)

Abate, C., Blanco, R., Ciobâcă, Ş., Durier, A., Garg, D., Hriţcu, C., Patrignani, M., Tanter, É. & Thibault, J. (2021). An Extended Account of Trace-Relating Compiler Correctness and Secure Compilation. ACM Trans. Program. Lang. Syst., 43(4). DOI: 10.1145/3460860

Introduction to Formal Compiler Security

Table of Contents

Formal Compiler Security

Preliminaries

Why formal methods?

What are properties?

(Robust) Satisfaction

Secure Compilation

$\src{S}$ Definitions

$\src{S}$ Syntax

$\src{S}$ Static Semantics

$\src{S}$ Dynamic Semantics

$\trg{T}$ Definitions

$\trg{T}$ Syntax

$\trg{T}$ Dynamic Semantics

Compiler from $\src{S}$ to $\trg{T}$

Simulations

Forward Simulation

Backward Simulation

Backtranslation of Traces

Extending Robust Trace Preservation with Trace Relation

Back to Backtranslation

Final Proof

References