The CompCert verified compiler

Commented Coq development

Version 2.1, 2013-10-28

Introduction

CompCert is a compiler that generates PowerPC, ARM and x86 assembly code from CompCert C, a large subset of the C programming language. The particularity of this compiler is that it is written mostly within the specification language of the Coq proof assistant, and its correctness --- the fact that the generated assembly code is semantically equivalent to its source program --- was entirely proved within the Coq proof assistant.

High-level descriptions of the CompCert compiler and its proof of correctness can be found in the following papers (in increasing order of technical details):

This Web site gives a commented listing of the underlying Coq specifications and proofs. Proof scripts are folded by default, but can be viewed by clicking on "Proof". Some modules (written in italics below) differ between the three supported target architectures. The PowerPC versions of these modules are shown below; the ARM and x86 versions can be found in the source distribution.

This development is a work in progress; some parts have substantially changed since the overview papers above were written.

The complete sources for CompCert can be downloaded from the CompCert Web site.

This document and the CompCert sources are copyright 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 Institut National de Recherche en Informatique et en Automatique (INRIA) and distributed under the terms of the following license.

Table of contents

General-purpose libraries, data structures and algorithms

Definitions and theorems used in many parts of the development

Source, intermediate and target languages: syntax and semantics

Compiler passes

Pass Source & target Compiler code Correctness proof
Pulling side-effects out of expressions;
fixing an evaluation order
CompCert C to Clight SimplExpr SimplExprspec
SimplExprproof
Pulling non-adressable scalar local variables out of memory Clight to Clight SimplLocals SimplLocalsproof
Simplification of control structures;
explication of type-dependent computations
Clight to Csharpminor Cshmgen Cshmgenproof
Stack allocation of local variables
whose address is taken;
simplification of switch statements
Csharpminor to Cminor Cminorgen Cminorgenproof
Recognition of operators
and addressing modes
Cminor to CminorSel Selection
SelectOp
SelectLong
Selectionproof
SelectOpproof
SelectLongproof
Construction of the CFG,
3-address code generation
CminorSel to RTL RTLgen RTLgenspec
RTLgenproof
Recognition of tail calls RTL to RTL Tailcall Tailcallproof
Function inlining RTL to RTL Inlining Inliningspec
Inliningproof
Postorder renumbering of the CFG RTL to RTL Renumber Renumberproof
Constant propagation RTL to RTL Constprop
ConstpropOp
Liveness
Constpropproof
ConstproppOproof
Common subexpression elimination RTL to RTL CSE
CombineOp
CSEproof
CombineOpproof
Register allocation (validation a posteriori) RTL to LTL Allocation Allocproof
Branch tunneling LTL to LTL Tunneling Tunnelingproof
Linearization of the CFG LTL to Linear Linearize Linearizeproof
Removal of unreferenced labels Linear to Linear CleanupLabels CleanupLabelsproof
Laying out the activation records Linear to Mach Stacking
Bounds
Stacklayout
Stackingproof
Emission of assembly code Mach to Asm Asmgen Asmgenproof0
Asmgenproof1
Asmgenproof

Type systems

Trivial type systems are used to statically capture well-formedness conditions on some intermediate languages.

All together


Xavier.Leroy@inria.fr