Secure Compilation
In this post, I want to shed some light on the what, why, and how of secure compilation. It's more on the light-weight side of things, so if you are already familiar with parts of the topic, you likely don't need to read this.
Computer programs are written in text form by programmers. This text needs to be processed in some way, which is referred to as "compilation". We can view this as translating from one language to another. For example, you may write down code in C, which the compiler subsequently translates to a data representation that your processor can understand, referred to as machine language. So, abstractly, compilers are just functions, which we'll write down more or less handwavingly as $\llbracket\src{\bullet}\rrbracket^{\src{S}\to\trg{T}}$. The idea is that $\llbracket\src{\bullet}\rrbracket^{\src{S}\to\trg{T}}$ is a compiler from language $\src{S}$ to language $\trg{T}$.
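To make "compilers are just functions" a bit more tangible, here is a toy sketch, entirely my own and not tied to any real compiler, of a compiler from a minimal source language $\src{S}$ of arithmetic expressions to a target language $\trg{T}$ of stack-machine instructions:

```c
#include <stdio.h>

/* Source language S: arithmetic expressions. */
typedef enum { LIT, ADD, MUL } Op;
typedef struct Expr {
    Op op;
    int value;                  /* used when op == LIT */
    struct Expr *left, *right;  /* used for ADD and MUL */
} Expr;

/* Target language T: a flat list of stack-machine instructions. */
typedef struct { Op op; int value; } Instr;

/* The "compiler": a function from S-programs to T-programs.
   It emits instructions into out and returns the new write position. */
int compile(const Expr *e, Instr *out, int pos) {
    if (e->op == LIT) {
        out[pos++] = (Instr){ LIT, e->value };
    } else {
        pos = compile(e->left, out, pos);
        pos = compile(e->right, out, pos);
        out[pos++] = (Instr){ e->op, 0 };
    }
    return pos;
}

int main(void) {
    /* Source program: 1 + (2 * 3) */
    Expr two = { LIT, 2, NULL, NULL }, three = { LIT, 3, NULL, NULL };
    Expr mul = { MUL, 0, &two, &three };
    Expr one = { LIT, 1, NULL, NULL };
    Expr sum = { ADD, 0, &one, &mul };

    Instr prog[8];
    int n = compile(&sum, prog, 0);
    printf("emitted %d instructions\n", n);  /* prints: emitted 5 instructions */
    return 0;
}
```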
Programming languages usually have some kind of formal or at least informal specification. This is necessary to ensure functional correctness of compilers, so that they faithfully translate from e.g. $\src{S}$ to $\trg{T}$. Formally, we can phrase this correctness in terms of whole programs as follows:
$$ \forall \src{w}\in\src{P}, \forall \bm{t}\in\text{Outputs}, \text{if } \src{w}\to\bm{t} \text{, then } \llbracket \src{w}\rrbracket^{\src{S}\to\trg{T}}\to\bm{t} $$
Here, $\bm{t}$ is some final "state" of the program, which may simply be a value, like `42`. So, given that an $\src{S}$ program evaluates under the semantics of $\src{S}$ to some output $\bm{t}$, we want its compiled version $\llbracket \src{w}\rrbracket^{\src{S}\to\trg{T}}$ to evaluate to the same output. You may object that the outputs look different, given that the programs run under different semantics, and you'd be correct. For the sake of simplicity, let us assume that the programming languages $\src{S}$ and $\trg{T}$ have semantics that yield outputs of the same shape.
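As a tiny concrete instance (my own example), the following is a whole C program that, under the C semantics, terminates with the output 42 (its exit status); a correct compiler must produce a target program that does the same.

```c
/* A whole program: every symbol it uses is defined. Its output t is the
   value 42, and compiler correctness demands that the compiled binary
   produces that very same output. */
int answer(void) { return 40 + 2; }

int main(void) {
    return answer();
}
```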
In our setting, we require the output to contain a trace, a record of the events the program performed during execution. A property is a set of all traces that fulfill a certain condition, like "no use after free" or "the letter `x` is printed exactly once". Thus, a program satisfies a property $\pi$ if the trace its output contains is part of $\pi$. Formally:

$$ \src{p}\text{ satisfies }\pi \text{ iff } \forall\bm{t}\in\text{Outputs}, \text{if }\src{p}\to\bm{t}\text{, then }\bm{t}\in\pi $$

With that, we can finally talk about security and why correctness is inadequate as a criterion for it. First of all, security is always about ruling out the violation of a particular kind of property: we never want something bad to happen. So, we may think to phrase a "secure" compiler as one that simply preserves property satisfaction:

$$ \forall \pi\in\text{Properties}, \forall \src{w}\in\src{P}, \text{if } \src{w} \text{ satisfies }\pi \text{, then } \llbracket \src{w}\rrbracket^{\src{S}\to\trg{T}}\text{ satisfies }\pi $$

In the setting of security, there is always a distinction between "the good people" and "the bad people", the latter commonly referred to as "attackers".
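Before we bring attackers into the picture, here is a minuscule example (my own) of plain, non-robust satisfaction: the single trace of the following whole program contains exactly one event printing the letter x, so the program satisfies that example property; a second print would make it violate the property.

```c
#include <stdio.h>

/* This program's only trace prints "x" exactly once, so it satisfies the
   property "the letter x is printed exactly once". */
int main(void) {
    printf("x\n");
    return 0;
}
```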
Security researchers will always have to consider a so-called threat model, which simply describes the power of an attacker. What could be the most powerful attacker in terms of programs? Considering a program $p$ written in some programming language, the most powerful attacker is able to interact with $p$ in any way that the programming language allows. For example, a C program may call some library function, like `strcmp`. `strcmp` may have been swapped out by some malicious attacker: instead of just comparing strings, it overwrites some parts of your memory by means of pointer arithmetic on the pointer you provide as an argument (a sketch of this follows below). In this simple example, the attacker even controls parts of our internal state, which is pretty difficult to protect against. Any attack carried out by an evil person will always be written as some kind of exploit, some other program that is linked, provides inputs, or overwrites whatever the processor's language allows it to overwrite. So, the most powerful attacker can run arbitrary code at the same language level. This is not something that the correctness criterion captures. It only talks about whole programs, programs without open definitions and, therefore, without any interaction with the outside world.
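Here is a minimal sketch of what such a malicious `strcmp` replacement could look like; the exact shape is made up, the point is only that library code handed a pointer can do whatever the language allows with it:

```c
#include <stddef.h>

/* A hypothetical, attacker-supplied replacement for strcmp: instead of
   only comparing, it clobbers the caller's buffer through the very
   pointer it was handed, then lies about the comparison result. */
int strcmp(const char *s1, const char *s2) {
    (void)s2;
    char *p = (char *)s1;            /* cast away const */
    for (size_t i = 0; p[i] != '\0'; ++i) {
        p[i] = 'X';                  /* overwrite the component's memory */
    }
    return 0;                        /* "the strings are equal" */
}
```

A real attacker would of course be subtler, but the component that called `strcmp` has no way to tell the difference from within the language.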
Programs are more often than not developed as one or multiple modules, which depend on other modules to ultimately yield a whole program. Whole programs do not have any open definition. For example, the following is a valid partial C program:
```c
int foo(void);

int main(int argc, char** argv) {
  return foo();
}
```
The program declares the symbol `foo`, but never ends up defining it. That is, unless we compile it together with another C file containing the following:
```c
int foo(void) {
  return 5;
}
```
Linking partial programs together to (eventually) build up a whole program usually happens at the target level. In the above example, you could compile both files with `gcc`, which will ultimately invoke the linker `ld` to glue the assembly of the compiled partial programs together. Another option would be to simply take the definition from the one file and move it to the other. We'll treat linking formally as a binary operator $\bowtie$. Now, we can say a program robustly satisfies a property of interest if it satisfies the property in the presence of an arbitrary attacker. We've already seen satisfaction, and we can similarly define this new notion of robust satisfaction:

$$ \text{In language }\src{S},\ \src{p}\src{\text{ robustly satisfies }}\pi \text{ iff } \forall\bm{t}\in\text{Outputs},\forall \src{K}\in\src{P}, \text{if }\src{K\bowtie p}\to\bm{t}\text{, then }\bm{t}\in\pi $$
With that, we simply take compiler correctness and amend the statement to make use of this more security-centric notion of property satisfaction: $$ \forall \pi\in\text{Properties}, \forall \src{p}\in\src{P}, \text{if } \src{p} \src{\text{ robustly satisfies }}\pi, $$ $$ \text{then } \llbracket \src{p}\rrbracket^{\src{S}\to\trg{T}}\trg{\text{ robustly satisfies }}\pi $$
At this point, we may ask ourselves: Why? And the question is relevant, given there exist several practical source-code instrumentations that will make sure a certain property of interest is not violated: Softbound ( Citation: Nagarakatte et.al. 2009 Nagarakatte, Santosh and Zhao, Jianzhou and Martin, Milo MK and Zdancewic, Steve. 2009. SoftBound: Highly compatible and complete spatial memory safety for C. ) , CETS ( Citation: Nagarakatte et.al. 2010 Nagarakatte, Santosh and Zhao, Jianzhou and Martin, Milo MK and Zdancewic, Steve. 2010. CETS: compiler enforced temporal safety for C. ) , BaggyBounds ( Citation: Akritidis et.al. 2009 Akritidis, Periklis and Costa, Manuel and Castro, Miguel and Hand, Steven. 2009. Baggy Bounds Checking: An Efficient and Backwards-Compatible Defense against Out-of-Bounds Errors. ) , (Low-)FatPointers ( Citation: Kwon et.al. 2013 Kwon, Albert and Dhawan, Udit and Smith, Jonathan M and Knight Jr, Thomas F and DeHon, Andre. 2013. Low-fat pointers: compact encoding and efficient gate-level implementation of fat pointers for spatial safety and capability-based security. ) , DangSan ( Citation: Van Der Kouwe et.al. 2017 Van Der Kouwe, Erik and Nigade, Vinod and Giuffrida, Cristiano. 2017. Dangsan: Scalable use-after-free detection. ) , to name a few. We can just use those to fix our code, right? The issue with these approaches is manifold. Either they fail to interact with other code at all, or they simply trust library code to behave correctly. So, their threat model is internal; the situation is similar to the security we get out of plain compiler correctness. Library code could, if the programming language allows it, just overwrite our component's memory with some arbitrary garbage. With a robustly preserving compiler, you have the reassurance that this cannot happen. Lastly, you have no guarantee that these approaches are correct. They are evaluated on a certain set of benchmarks and catch some percentage of the memory-safety issues there. For sure, this goes some part of the mile, but certainly not the whole. If we want a truly secure program, we better go the whole mile. Pretty much the only way to get to 100% is by means of a mathematical proof.
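To give a flavor of what such instrumentation conceptually does, here is a hand-written sketch of an inserted spatial-safety check; this is my own illustration, not the actual output of SoftBound or any of the other tools cited above:

```c
#include <stdio.h>
#include <stdlib.h>

/* Conceptually, the instrumentation pairs every pointer with bounds
   metadata and guards each access with a check like this one. */
void checked_write(int *base, size_t len, size_t idx, int val) {
    if (idx >= len) {                  /* inserted bounds check */
        fprintf(stderr, "out-of-bounds write detected\n");
        abort();
    }
    base[idx] = val;
}

int main(void) {
    int buf[4] = { 0 };
    checked_write(buf, 4, 2, 7);   /* fine */
    checked_write(buf, 4, 9, 7);   /* caught: aborts instead of corrupting memory */
    return 0;
}
```

The catch remains as described above: these checks live inside the instrumented component; a linked library that writes through a raw pointer never runs them.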
Now, another question we may still ask is: why all this formal nonsense, and why do we really need the proof? The answer is surprisingly simple, but also intriguing: we want to get this right! We want a formal guarantee, a mathematical proof, that we have full-blown security, maybe not for all kinds of properties, but at least for a particular property or class of properties of interest. If we write a secure program to begin with, the compiler will make goddamn sure it stays secure at the target level. And writing a secure program may sound harder than it is: Rust enforces certain security guarantees by means of its type system, and this renders any Rust program secure with respect to those properties. If the compiler robustly preserves them, any compiled Rust program will benefit from the original robust satisfaction of the properties of interest.
Ok, finally, the how. How does one prove this? Obviously, first of all, the source language, the target language, and the compiler itself need a mathematical specification that we can work with. I'm glossing over this as if it were easy, but formalizing these three things is already quite a feat. Given that, proving a compiler robustly preserving requires a fair amount of rather technical setup. I'll try to give an overview of how it works, but the precise details depend on the actual programming languages under consideration.
We look at the "robustly safe compiler" criterion. On the left, you can see the assumptions that we are allowed to work with, on the right, the current goal. Let's make things easier notationally by immediately assuming anything that is quantified with $\forall$ or in front of a $\text{then}$: $$ \begin{array}{rlll} \pi&:&\text{Properties}&\\ \src{p}&:&\src{P}&\\ H_1&:&\src{p\text{ robustly satisfies }}\pi&\\ &&& \llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}\trg{\text{ robustly satisfies }}\pi\\ \end{array} $$ Unfolding the defition of $\text{ robustly satisfies }$: $$ \begin{array}{rlll} \pi&:&\text{Properties}&\\ \src{p}&:&\src{P}&\\ H_1&:&\forall\bm{t}\in\text{Outputs},&\\ &&\forall\src{K}\in\src{P},&\\ &&\text{if }\src{K\bowtie p}\to\bm{t}\text{, then }\bm{t}\in\pi&\\ \bm{t}&:&\text{Trace}&\\ &&& \forall\bm{t}\in\text{Outputs},\forall\trg{K}\in\trg{P},\\ &&& \text{if }\trg{K\bowtie}\llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}\to\bm{t}\text{, then }\bm{t}\in\pi\\ \end{array} $$ We can again introduce anything that is universally quantified or a premise: $$ \begin{array}{rlll} \pi&:&\text{Properties}&\\ \src{p}&:&\src{P}&\\ H_1&:&\forall\bm{t}\in\text{Outputs},&\\ &&\forall\src{K}\in\src{P},&\\ &&\text{if }\src{K\bowtie p}\to\bm{t}\text{, then }\bm{t}\in\pi&\\ \bm{t}&:&\text{Trace}&\\ \trg{K}&:&\trg{P}&\\ H_2&:&\trg{K\bowtie}\llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}\to\bm{t}&\\ &&& \bm{t}\in\pi\\ \end{array} $$ Now it's easy to see that we can apply $H_1$ as long as we can construct a $\src{K}$ from either $\bm{t}$ or $\trg{K}$. This is a crucial part of this whole criterion:
The compiler inserts countermeasures so that $\trg{\text{target level}}$ attackers can only do the same attacks that $\src{\text{source level}}$ attackers can do.
If $\src{S}=\trg{T}$, we can simply use $\trg{K}$ for $\src{K}$, and the only thing left to argue is that $\src{K\bowtie p}$ and $\trg{K\bowtie}\llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}$ produce similar output. Even if the languages are different, it can still be a viable option to take $\trg{K}$, go over its syntactic structure, and construct a similarly behaving $\src{K}$. However, if $\src{S}$ and $\trg{T}$ are very different, pretty much the only option at hand is $\bm{t}$. Here, one would make use of a so-called backtranslation, which takes a $\trg{T}$-level trace and translates that trace into an $\src{S}$-level context.
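To make the shape of the argument concrete, here is a minimal Lean-style sketch; this is my own abstraction, with languages, linking, and evaluation kept fully abstract, properties modeled as predicates on traces, and the backtranslation plus its correctness property simply assumed:

```lean
section SecureCompilation

variable {SrcProg TrgProg Trace : Type}

-- "p robustly satisfies π", for an abstract language given by a linking
-- operator and a big-step evaluation relation.
def RobustlySat {Prog : Type} (link : Prog → Prog → Prog)
    (eval : Prog → Trace → Prop) (p : Prog) (π : Trace → Prop) : Prop :=
  ∀ (t : Trace) (K : Prog), eval (link K p) t → π t

-- Robust preservation: the proof skeleton from above, with the
-- backtranslation `backtr` and its correctness `backtr_ok` as assumptions.
theorem robust_preservation
    (srcLink : SrcProg → SrcProg → SrcProg)
    (trgLink : TrgProg → TrgProg → TrgProg)
    (srcEval : SrcProg → Trace → Prop)
    (trgEval : TrgProg → Trace → Prop)
    (compile : SrcProg → TrgProg)
    (backtr : TrgProg → Trace → SrcProg)
    (backtr_ok : ∀ p K t, trgEval (trgLink K (compile p)) t →
                          srcEval (srcLink (backtr K t) p) t)
    (π : Trace → Prop) (p : SrcProg)
    (H₁ : RobustlySat srcLink srcEval p π) :
    RobustlySat trgLink trgEval (compile p) π := by
  -- introduce t, K and the premise H₂, exactly as in the proof state above
  intro t K H₂
  -- apply H₁ with the backtranslated context
  exact H₁ t (backtr K t) (backtr_ok p K t H₂)

end SecureCompilation
```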
In practice, we also have some kind of filter operation hidden in there, which may mark actions on $\bm{t}$ as context actions or component actions. That way, a context that trivially violates a property of interest can be ignored, i.e., we only require the property to hold for the component's actions. A context that immediately tries to deallocate a random address is certainly ill-behaved with respect to memory safety. Moreover, such contexts are not meaningful attackers either: an attacker tries to trip up the component somehow, i.e., to make the component violate the property.
I hope this little intro has been helpful. That being said, I'm sure you can get quite a lot more out of the following papers:
- "Formal approaches to secure compilation: A survey of fully abstract compilation and related work" ( Citation: Patrignani et.al. 2019 Patrignani, Marco and Ahmed, Amal and Clarke, Dave. 2019. Formal approaches to secure compilation: A survey of fully abstract compilation and related work. )
- "Robustly safe compilation, an efficient form of secure compilation" ( Citation: Patrignani et.al. 2021 Patrignani, Marco and Garg, Deepak. 2021. Robustly safe compilation, an efficient form of secure compilation. )