Release 2.4, 2014-09-17
=======================

Language features:
- Support C99 compound literals  (ISO C99 section 6.5.2.5).
- Support "switch" statements over an argument of type "long long".
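As an illustration of the two language features above, here is a minimal sketch in standard C99. The names (struct point, sum_point, classify, and so on) are hypothetical and chosen for this example only; nothing here is taken from the CompCert sources or test suite.

```c
struct point { int x, y; };

/* Sums the coordinates of a point passed by address. */
static int sum_point(const struct point *p) {
    return p->x + p->y;
}

/* Passes an unnamed struct object built by a compound literal. */
int compound_literal_demo(void) {
    return sum_point(&(struct point){ .x = 3, .y = 4 });
}

/* Sums an unnamed array built by a compound literal. */
int array_literal_demo(void) {
    int *a = (int[]){ 10, 20, 30 };
    return a[0] + a[1] + a[2];
}

/* Classifies a 64-bit value with a "switch" whose controlling
   expression has type "long long". */
int classify(long long v) {
    switch (v) {
    case 0LL:          return 0;
    case 1LL << 40:    return 1;   /* does not fit in 32 bits */
    case -(1LL << 40): return 2;
    default:           return -1;
    }
}
```

Calling classify(1LL << 40) selects the second case, which a "switch" restricted to 32-bit values could not express.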

Code generation and optimization:
- Revised and improved support for single-precision floating-point
  arithmetic.  Earlier, all FP arithmetic was performed at double
  precision, with conversions to/from single precision as needed,
  in particular when loading/storing a single-precision FP number
  from/to memory.  Now, FP operations whose arguments are of type
  "float" are performed in single precision, using the processor's
  single-precision FP instructions.  Fewer conversions between double and
  single precision are generated.
- Value analysis and constant propagation: more precise treatment of
  comparisons against an integer constant.

Improvements in confidence:
- Full correctness proofs for the algorithms used in the runtime
  support library for conversions between 64-bit integers and
  floating-point numbers.

ARM port:
- Added support for Thumb2 instruction encoding (option -mthumb).
  Thumb2 is supported on ARMv7 and up, and produces more compact
  machine code.
- Exploit some VFPv3 instructions when available.
- Built-in function '__builtin_cntlz' (count leading zeros)
  renamed '__builtin_clz' for GCC / Clang compatibility.

PowerPC port:
- Refactored the expansion of built-in functions and
  pseudo-instructions so that it does not need to be re-done in
  cchecklink.
- Updated the cchecklink validator accordingly.
- More efficient code generated for volatile accesses to small data areas.
- Built-in function '__builtin_cntlz' (count leading zeros)
  renamed '__builtin_clz' for GCC / Clang compatibility.

IA32 port:
- Added built-in functions __builtin_clz and __builtin_ctz
  (count leading / trailing zeros).
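A short sketch of how the count-leading-zeros and count-trailing-zeros built-ins are typically used. It assumes a compiler that provides __builtin_clz and __builtin_ctz with the GCC / Clang conventions (argument of type "unsigned int", result undefined for 0) and a 32-bit "unsigned int". The wrapper names are hypothetical.

```c
/* Position of the highest set bit (floor of log2), for x != 0.
   Assumes "unsigned int" is 32 bits wide. */
unsigned floor_log2(unsigned x) {
    return 31u - (unsigned)__builtin_clz(x);
}

/* Number of times x is divisible by 2, for x != 0. */
unsigned trailing_zero_count(unsigned x) {
    return (unsigned)__builtin_ctz(x);
}
```

Both wrappers leave the x == 0 case to the caller, since the built-ins give no defined result for it under the GCC / Clang conventions.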

Coq development:
- The memory model was extended with two new "chunks", Many32 and Many64,
  that enable storing any 32-bit value or 64-bit value using
  an abstract, not bit-based encoding, and reloading these values exactly.
  These new chunks are used to implement saving and restoring callee-save
  registers that can contain data of unknown types (e.g. float32 or float64)
  but known sizes.
- Refactored the library of FP arithmetic (lib/Floats.v) to support
  both 64- and 32-bit floats.


Release 2.3pl2, 2014-05-15
==========================

Usability:
- Re-added support for "__func__" identifier as per ISO C99.
- Re-added some popular GCC extensions to ISO C99:
     . alternate keywords __restrict, __inline__, etc.
     . support for empty structs and unions
     . support '\e' escape in char and string literals, meaning ESC
- Do not assume that the preprocessor removed all comments.
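For reference, the "__func__" identifier mentioned above behaves as sketched below in ISO C99. The function name is hypothetical.

```c
#include <string.h>

/* __func__ is a predefined identifier naming the enclosing function
   (ISO C99 section 6.4.2.2); it has static storage duration, so the
   pointer remains valid after the function returns. */
const char *current_function_name(void) {
    return __func__;
}
```

A typical use is in diagnostic messages, where __func__ avoids repeating the function name by hand.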

Bug fixing:
- Fixed regression on initializers of the form  T x[N] = "literal";
  where T is a typedef for a character type.
- "asm" statements were causing syntax errors.
- Better handling of "extern" and "extern inline" function definitions.
- Internal error on some octal escape sequences in string literals.
- Parsing of "#pragma section" directives made more robust and
  with better error reporting.


Release 2.3, 2014-05-05
=======================

Language features:
- Support for C99 designated initializers. (ISO C99 section 6.7.8.)
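A minimal sketch of designated initializers as defined by ISO C99 section 6.7.8. The struct and function names are hypothetical.

```c
struct config {
    int width;
    int height;
    int depth;
};

/* Fields not named in the initializer are zero-initialized, and
   designators may appear in any order. */
int unnamed_field_is_zero(void) {
    struct config c = { .height = 2, .width = 5 };
    return c.depth == 0 && c.width == 5 && c.height == 2;
}

/* Array designators select elements by index; the rest are zero. */
int sparse_table_sum(void) {
    int table[8] = { [1] = 10, [6] = 60 };
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += table[i];
    return sum;
}
```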

Improvements in confidence:
- The parser is now formally verified against the ISO C99 grammar plus
  CompCert's extensions.  The verification proves that the parser
  recognizes exactly the language specified by the grammar, and that
  the grammar has no ambiguities.  For more details, see the paper
  "Validating LR(1) parsers" by Jacques-Henri Jourdan, François Pottier,
  and Xavier Leroy, ESOP 2012, http://dx.doi.org/10.1007/978-3-642-28869-2_20
- More theorems proved about float<->integer conversions.

Optimizations:
- Optimize "x != 0", "x == 0", "x != 1", and "x == 1" when x is known
  to be a boolean already, ranging over {0, 1, undef}.
- More systematic constant propagation in pass Selection, which lightens
  the work of later RTL optimizations.
- IA32: recognize and use the "not" instruction.

Usability:
- Option "-timings" to print compilation times for various passes.
- Various tweaks in IRC graph coloring to reduce compilation time.
- IA32: add built-in functions for fused multiply-add
  (require a recent processor with FMA3 extensions).

Improvements in ABI conformance:
- New target platform: ARM with EABI "hard float" calling conventions
  (armhf in Debian's classification).
- IA32 and ARM: revised handling of "common" variables to conform with ABI.

Bug fixing:
- In -fbitfields emulation: "a->f" was not properly rewritten if "a"
  had "array of structs" type instead of "pointer to struct".
- Moved analysis of single-precision floats from RTLtyping to Machtyping.
  (RTLtyping was incorrectly rejecting some functions involving
  single-precision floats.)  Simplified LTL semantics and Allocation
  pass accordingly.
- Assignment to an l-value of "volatile float" type could cause
  an internal error in RTLtyping/Machtyping.
- The case __builtin_fabs applied to integers was missing in the
  C semantics and in C#minor generation.
- Fixed some type annotations on CompCert C expressions.  These
  annotations were incorrect but not in a way that impacted code
  generation.


Release 2.2, 2014-02-24
=======================

Major improvements:

- Two new static analyses are performed on the RTL intermediate form:
    . Value analysis, tracking constants, some integer range information,
      and pointer aliasing information.
    . Neededness analysis, generalizing liveness analysis to individual
      bits of integer values and to stack memory locations.

- Improved RTL optimizations, exploiting the results of these analyses:
    . Constant propagation can track constants across memory stores and loads.
    . Common subexpression elimination exploits nonaliasing information.
    . Dead code elimination can eliminate useless memory writes and
      block copies, as well as integer operations that do not change
      the needed bits.
    . Redundant cast elimination is now performed globally (at
      function level) rather than locally on individual expressions.

- Experimental support for defining and calling variable-argument functions,
  including support for the <stdarg.h> interface.
  (Option -fvararg-calls, "on" by default.)
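The <stdarg.h> interface mentioned above is used as in the following sketch, which is plain ISO C and not specific to CompCert. The function name sum_ints is hypothetical.

```c
#include <stdarg.h>

/* Sums "count" int arguments passed after the fixed parameter. */
int sum_ints(int count, ...) {
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

For instance, sum_ints(3, 1, 2, 3) walks the three variable arguments and adds them up.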

Language features:
- In "switch" statements, "default" cases can now appear anywhere, not
  just as the last case.
- Support for incomplete array as last field of a struct,
  as specified in ISO C 99.
- Support for declarations within 'for' loops, as specified in ISO C 99.
  (E.g. "for (int i = 0; i < 4; i++) ...")
- Revised semantics and implementation of _Alignas(N) attribute
  to better match those of GCC and Clang.
- Better tolerance for functions declared without prototypes
  (option -funprototyped, "on" by default).
- On PowerPC, support "far-data" sections
  (register-relative addressing with 32-bit offsets).
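Three of the language features listed above ("default" cases anywhere, an incomplete array as last struct field, and declarations within "for" loops) can be sketched together in standard C99. All names are hypothetical.

```c
#include <stdlib.h>

/* A "default" case placed before the other cases. */
int sign_code(int x) {
    switch (x) {
    default: return 2;
    case 0:  return 0;
    case 1:  return 1;
    }
}

/* A struct whose last field is an incomplete ("flexible") array. */
struct buffer {
    size_t len;
    int data[];
};

/* Allocates a buffer holding 0, 1, ..., n-1; returns NULL on failure.
   The caller releases it with free(). */
struct buffer *make_iota(size_t n) {
    struct buffer *b = malloc(sizeof *b + n * sizeof b->data[0]);
    if (b == NULL)
        return NULL;
    b->len = n;
    for (size_t i = 0; i < n; i++)   /* declaration inside the "for" */
        b->data[i] = (int)i;
    return b;
}
```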

Improvements in ABI conformance:
- For x86/IA32, align struct fields of types "double" or "long long" to 4
  instead of 8, as prescribed by the x86 ELF ABI.
- For PowerPC and ARM, structs and unions returned as function results
  are now passed in integer registers if their sizes are small enough
  (<= 8 bytes for PowerPC, <= 4 bytes for ARM).

Usability:
- Revised parsing of command-line arguments to be closer to GCC and Clang.
  In particular, "ccomp -c foo.c -o obj/foo.o" now works as expected,
  instead of ignoring the "-o" option as in earlier CompCert versions.
- Recognize input files ending in .i and .p as C source files that
  must not be preprocessed.
- Warn for uses of the following GCC extensions to ISO C:
  zero-sized arrays, empty structs/unions, empty initializer braces.
- Option "-fno-fpu" to prevent the use of FP registers for some
  integer operations such as block copies.  (Replaces the previous
  "-fno-sse" option which was x86/IA32-specific, and extends it to
  PowerPC and ARM.)
- Option "-drtl" to record the RTL intermediate representation
  at every stage of optimization.  (Replaces "-dtailcall", "-dinlining",
  "-dconstprop", and "-dcse".)
- Add CompCert version number and command-line arguments as comments
  in the generated assembly files.

Other performance improvements:
- Recognize __builtin_fabs as a primitive unary operator instead of
  a built-in function, enabling more optimizations.
- PowerPC: shorter code generated for "&global_variable + expr".

Improvements in compilation times:
- More efficient implementation of Kildall's dataflow equation solver,
  reduces size of worklist and number of times a node is visited.
- Better OCaml GC settings significantly reduce compilation times
  for very large source functions.

Bug fixing:
- Fixed incorrect hypothesis on __builtin_write{16,32}_reversed.
- Fixed syntax error in __attribute__((__packed__)).
- Emit clean compile-time error for 'switch' over a value of 64-bit
  integer type (currently not supported).
- Recognize source files with .i or .p extension as C sources that
  should not be preprocessed.

Coq development:
- Removed propositional extensionality axiom (prop_ext).
- Suppressed the Mfloat64al32 memory_chunk, no longer needed.


Release 2.1, 2013-10-28
=======================

Language semantics:
- More precise modeling of not-a-numbers (NaNs) in floating-point
  arithmetic.
- The CompCert C language is now defined with reference to ISO C99
  instead of ISO C90 ("ANSI C") as before.  This affects mostly the
  wording of the reference manual.  However, the parsing of integer
  constants and character constants was revised to follow the ISO C99
  standard.

Language features:
- Support for _Alignas(N) attribute from ISO C 2011.
- Revised implementation of packed structs, taking advantage of _Alignas.
- Suppressed the pragma "packed", replaced by a struct-level attribute
  __packed__(params) or __attribute__((packed(params))).
- Fixed typing rules for __builtin_annot() to avoid casting arguments
  of small integer or FP types.
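A small sketch of the ISO C 2011 _Alignas specifier referred to above. It shows standard C11 usage only and makes no claim about CompCert's internal treatment of packed structs; the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* A buffer aligned on a 16-byte boundary. */
static _Alignas(16) unsigned char aligned_buf[64];

/* Returns 1 if the buffer address is a multiple of 16. */
int buffer_is_aligned(void) {
    return ((uintptr_t)aligned_buf % 16) == 0;
}

/* A struct member with a stricter alignment than its type requires. */
struct padded {
    char tag;
    _Alignas(8) char payload;
};

/* Offset of the over-aligned member within the struct. */
size_t payload_offset(void) {
    return offsetof(struct padded, payload);
}
```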

Performance improvements:
- Optimize integer divisions by positive constants, turning them into
  multiply-high and shifts.
- Optimize floating-point divisions by powers of 2, turning them
  into multiplications.
- Optimize "x * 2.0" and "2.0 * x" into "x + x".
- PowerPC: more efficient implementation of division on 64-bit integers.
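To illustrate the first two optimizations above, here is a hand-written sketch of the kind of rewriting involved, for the unsigned 32-bit case only. The constant 0xAAAAAAAB is ceil(2^33 / 3); this is the textbook technique, not a transcription of the code CompCert emits, and the function names are hypothetical.

```c
#include <stdint.h>

/* x / 3 for unsigned 32-bit x, computed as the high part of a
   multiplication by ceil(2^33 / 3) followed by a shift. */
uint32_t div3(uint32_t x) {
    uint64_t product = (uint64_t)x * 0xAAAAAAABu;
    return (uint32_t)(product >> 33);
}

/* Division by a power of 2 as a multiplication by its inverse.
   The inverse 0.25 is exactly representable in binary FP. */
double div4(double x) {
    return x * 0.25;
}
```

The floating-point rewriting is restricted to powers of 2 because only then is the inverse exactly representable, so that the product equals the quotient.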

Bug fixing:
- Fixed compile-time error when assigning a long long RHS to a bitfield.
- Avoid double rounding issues in conversion from 64-bit integers
  to single-precision floats.
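The double rounding issue mentioned above can be observed with the following sketch, assuming IEEE-754 arithmetic in round-to-nearest-even mode and a platform whose integer-to-float conversions are correctly rounded. The function names are hypothetical.

```c
/* Converts directly from a 64-bit integer to float: one rounding. */
float to_float_direct(long long x) {
    return (float)x;
}

/* Converts through double first: two successive roundings. */
float to_float_via_double(long long x) {
    return (float)(double)x;
}
```

Take x = 2^60 + 2^36 + 1. It lies just above the midpoint of the two neighboring floats 2^60 and 2^60 + 2^37, so a single rounding goes up. Rounding to double first drops the final "+ 1", leaving an exact midpoint that the second rounding resolves downward to 2^60.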

Miscellaneous:
- Minor simplifications in the generic solvers for dataflow analysis.
- Small improvements in compilation times for the register allocation pass.
- MacOS X port updated to the latest XCode (version 5.0).


Release 2.0, 2013-06-21
=======================

Major improvements:

- Support for C types "long long" and "unsigned long long", that is,
  64-bit integers.  Regarding arithmetic operations on 64-bit integers,
  . simple operations are expanded in-line and proved correct;
  . more complex operations (division, modulus, conversions to/from floats)
    call into library functions written in assembly, heavily tested
    but not yet proved correct.

- The register allocator was completely rewritten to use an "end-to-end"
  translation validation approach, using a validation algorithm
  described in the paper "Validating register allocation and spilling"
  by Silvain Rideau and Xavier Leroy, Compiler Construction 2010,
  http://dx.doi.org/10.1007/978-3-642-11970-5_13
  This validation-based approach enables better register allocation, esp:
  . live-range splitting is implemented
  . two-address operations are treated more efficiently
  . no need to reserve processor registers for spilling and reloading.
  The improvement in the quality of generated code is significant for
  IA32 (because of its paucity of registers) but less so for ARM and PowerPC.

- Preliminary support for debugging information.  The "-g" flag
  causes DWARF debugging information to be generated for line numbers
  and stack structure (Call Frame Information).  With a debugger like
  GDB, this makes it possible to step through the code, put breakpoints
  by line number, and print stack backtraces.  However, no information
  is generated yet for C type definitions nor for variables; therefore,
  it is not possible to print the values of variables.

Improvements in ABI conformance:
- For IA32 and ARM, function arguments of type "float"
  (single-precision FP) were incorrectly passed as "double".
- For PowerPC, fixed alignment of "double" and "long long" arguments
  passed on stack.

Improvements in code generation:
- More aggressive common subexpression elimination across some builtin
  function calls, esp. annotations.

Improvements in compiler usability:
- Option -fno-tailcalls to turn off tail-call elimination.
  (Some static analysis tools are confused by this optimization.)
- Reduced stack usage of the compiler by rewriting some key functions
  in tail-recursive style.
- Reduced memory requirements of constant propagation pass by forgetting
  compile-time approximations of dead variables.
- More careful elaboration of C struct and union types into CompCert C
  types, avoiding behaviors exponential in the nesting of structs.

Bug fixing:
- Fixed parsing of labeled statements inside "switch" constructs,
  which were causing syntax errors.
- The "->" operator applied to an array type was causing a type error.
- Nested conditional expressions "a ? (b ? c : d) : e" were causing
  a compile-time error if "c", "d" and "e" had different types.

Coq development:
- Adapted the memory model to the needs of the VST project at Princeton:
  . Memory block identifiers are now of type "positive" instead of "Z"
  . Strengthened invariants in the definition of memory injections
    and the specification of external calls.
- The LTL intermediate language is now a CFG of basic blocks.
- Suppressed the LTLin intermediate language, no longer used.


Release 1.13, 2013-03-12
========================

Language semantics:
- Comparisons involving pointers "one past" the end of a block are
  now defined.  (They used to be undefined behavior.)
  (Contributed by Robbert Krebbers).
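The common idiom that relies on comparing against a "one past the end" pointer looks like the sketch below. The function name is hypothetical.

```c
#include <stddef.h>

/* Counts nonzero elements by walking the array until the pointer
   equals the "one past the end" pointer, which is compared but
   never dereferenced. */
size_t count_nonzero(const int *a, size_t n) {
    const int *end = a + n;        /* one past the last element */
    size_t count = 0;
    for (const int *p = a; p != end; p++)
        if (*p != 0)
            count++;
    return count;
}
```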

Language features:
- Arguments to __builtin_annot() that are compile-time constants
  are now replaced by their (integer or float) value in the annotation
  generated in the assembly file.

Improvements in performance:
- ARM and PowerPC ports: more efficient access to function parameters
  that are passed on the call stack.
- ARM port: slightly better code generated for some indirect memory
  accesses.

Bug fixing:
- Fixed a bug in the reference interpreter in -all mode causing some
  reductions to be incorrectly merged.
- Wrong parsing of hexadecimal floating-point literals 0xMMMMpEEE.
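As a reminder of the C99 notation involved in the last fix, a hexadecimal floating-point literal 0xMMMMpEEE denotes the hexadecimal mantissa MMMM multiplied by 2 raised to the decimal exponent EEE. The function names below are hypothetical.

```c
/* 0x1.8p1 is (1 + 8/16) * 2^1 = 3.0 */
double three(void) {
    return 0x1.8p1;
}

/* 0x1p-3 is 1 * 2^-3 = 0.125 */
double one_eighth(void) {
    return 0x1p-3;
}

/* 0xAp0 is 10 * 2^0 = 10.0 */
double ten(void) {
    return 0xAp0;
}
```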

Improvements in usability:
- Better error and warning messages for declarations of variables
  of size >= 2^32 bits.
- Reference interpreter: more efficient exploration of states in -all mode.

Coq development:
- More efficient implementation of machine integers (module Integers)
  taking advantage of bitwise operations defined in ZArith in Coq 8.4.
- Revised handling of return addresses in the Mach language
  and the Stacking and Asmgen passes.
- A number of definitions that were opaque for no good reason are now
  properly transparent.


Release 1.12.1, 2013-01-29
==========================

Ported to Coq 8.4pl1.  Otherwise functionally identical to release 1.12.


Release 1.12, 2013-01-11
========================

Improvements in confidence:
- Floating-point literals are now parsed and converted to IEEE-754 binary
  FP numbers using a provably-correct conversion function implemented on
  top of the Flocq library.

Language semantics:
- Comparison between function pointers is now correctly defined
  in the semantics of CompCert C (it was previously undefined behavior,
  by mistake).
- Bit-fields of 'enum' type are now treated as either unsigned or signed,
  whichever is able to represent all values of the enum.
  (Previously: always signed.)
- The "&&" and "||" operators are now primitive in CompCert C and are
  given explicit semantic rules, instead of being expressed in terms
  of "_ ? _ : _" as in previous CompCert releases.
- Added a "Ebuiltin" expression form (invocation of built-in function)
  to CompCert C, and a "Sbuiltin" statement form to Clight.
  Used it to simplify the encoding of annotations, memcpy, and volatile
  memory accesses.

Performance improvements:
- Better code generated for "&&" and "||" operators.
- More aggressive elimination of conditional branches during constant
  propagation, taking better advantage of inferred constants.

Language features:
- By popular demand, "asm" statements for inline assembly are now supported
  if the flag -finline-asm is set.  Use with extreme caution, as the
  semantic preservation proof assumes these statements have no effect
  on the processor state.

Internal simplifications and reorganization:
- Clight, Csharpminor, Cminor: suppressed the "Econdition" conditional
  expressions, no longer useful.
- Clight: a single loop form, the three C loops are derived forms.
- Clight: volatile memory accesses are materialized as builtin operations.
- Clight: removed dependencies on CompCert C syntax and semantics.
- New pass SimplLocals over Clight that replaces local scalar variables
  whose address is never taken by temporary, nonaddressable variables.
  (This used to be done in Cminorgen.)
- Csharpminor: simplified semantics.
- Cminor: suppressed the "Oboolval" and "Onotbool" operators,
  which can be expressed in terms of "Ocmpu" at no performance cost.
- All languages: programs are now presented as a list of global definitions
  (of functions or variables) instead of two lists, one for functions
  and the other for variables.

Other changes:
- For compatibility with other C compilers, output files are now generated
  in the current directory, by default; output file name can be controlled
  with the -o option, somewhat like with GCC.
- Reference interpreter: better handling of volatile memory accesses.
- IA32/MacOS X: now supports referencing global variables defined in shared
  libraries; old hack for stdio is no longer needed.
- PowerPC/MacOS X: this port was removed, as recent versions of MacOS X
  no longer support PowerPC.


Release 1.11, 2012-07-13
========================

Improvements in confidence:
- Floating-point numbers and arithmetic operations, previously axiomatized,
  are now implemented and proved correct in Coq, using the Flocq library
  of S. Boldo and G. Melquiond.

Language semantics:
- In accordance with ISO C standards, the signed division min_int / -1
  and the signed remainder min_int % -1 (where min_int is the smallest
  representable signed integer) now have undefined semantics and are
  treated as "going wrong" behaviors.
  (Previously, they were defined with results min_int and 0 respectively,
  but this behavior requires unnatural code to be generated on IA32 and
  PowerPC.)
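Since min_int / -1 is now undefined, source code that may encounter this case has to guard against it explicitly. One possible guard is sketched below; the function name checked_div is hypothetical.

```c
#include <limits.h>

/* Stores a / b in *q and returns 1 when the division is defined;
   returns 0, leaving *q untouched, for division by zero and for
   INT_MIN / -1. */
int checked_div(int a, int b, int *q) {
    if (b == 0 || (a == INT_MIN && b == -1))
        return 0;
    *q = a / b;
    return 1;
}
```

The same test protects the remainder operation, since INT_MIN % -1 is undefined for the same reason.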

Performance improvements:
- Function inlining is now implemented.  The functions that are inlined
  are those declared "inline" in the C source, provided they are not
  recursive.
- Constant propagation is now able to propagate the initial values of
  "const" global variables.
- Added option -ffloat-const-prop to control the propagation of
  floating-point constants; see user's manual for documentation.
- Common subexpression elimination can now eliminate memory loads
  following a memory store at the same location.
- ARM: make use of the "fcmpzd" and "fmdrr" instructions.

New tool:
- The "cchecklink" tool performs a posteriori validation of the
  assembling and linking phases.  It is available for PowerPC-EABI
  only.  It takes as inputs an ELF-PowerPC executable as produced
  by the linker, as well as .sdump files (abstract assembly) as
  produced by "ccomp -sdump", and checks that the executable contains
  properly-assembled and linked code and data corresponding to those
  produced by CompCert.

Other changes:
- Elimination of "static" functions and "static" global variables that
  are not referenced in the generated code.
- The memory model was enriched with "max" permissions in addition to
  "current" permissions, to better reason over "const" blocks and
  already-deallocated blocks.
- More efficient implementation of the memory model, resulting
  in faster interpretation of source files by "ccomp -interp".
- Added option "-falign-functions" to control alignment of function code.


Release 1.10, 2012-03-13
========================

Improvements in confidence:
- CompCert C now natively supports volatile types.  Its semantics fully
  specifies the meaning of volatile memory accesses.  The translation
  of volatile accesses to built-in function invocations is now proved correct.
- CompCert C now natively supports assignment between composite types
  (structs or unions), passing composite types by value as function
  parameters, and other instances of using composites as r-values, with
  the exception of returning composites by value from a function.
  (The latter remains emulated, using the -fstruct-return option.)
- PowerPC: removed the -fmadd option, not semantically-preserving
  in the strict sense.

Language features:
- Support for _Bool type from ISO C99.
- Support for _Alignof(ty) operator from ISO C 2011
  and __alignof__(ty), __alignof__(expr) from GCC.
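A brief sketch of the two features above, using only the standard spellings _Bool (ISO C99) and _Alignof (ISO C 2011). The function names are hypothetical.

```c
#include <stddef.h>

/* Conversion to _Bool yields 0 or 1: any nonzero value becomes 1. */
int bool_of(int x) {
    _Bool b = x;
    return b;
}

/* By contrast, conversion to a narrower unsigned integer type keeps
   only the low-order bits. */
unsigned char low_byte_of(int x) {
    return (unsigned char)x;
}

/* _Alignof gives the alignment requirement of a type, in bytes. */
size_t char_alignment(void) {
    return _Alignof(char);
}
```

The value 256 shows the difference: it converts to 1 as a _Bool but to 0 as an 8-bit "unsigned char".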

Performance improvements:
- Improvements in instruction selection, especially for integer casts
  and their combinations with bitwise operations.
- Shorter, more efficient code generated for accessing volatile global
  variables.
- Better code generated for the && and || operators.
- More aggressive common subexpression elimination (CSE) of memory loads.
- Improved register allocation for invocations of built-ins,
  especially for annotations.
- In Cminor and down, make safe operators non-strict: they return Vundef
  instead of getting stuck.  This enables more optimizations.
- Cast optimization is no longer performed by a separate pass over
  RTL, but equivalent optimization is done during Cminor generation
  and during instruction selection.

Other improvements:
- PowerPC/EABI: uninitialized global variables now go in common (bss) section.
- PowerPC: work around limited excursion of conditional branch instructions.
- PowerPC: added __builtin_fnmadd() and __builtin_fnmsub().
- Reference interpreter: better printing of pointer values and locations.
- Added command-line options -Wp,<opt> -Wa,<opt> -Wl,<opt> to pass
  specific options to the preprocessor, assembler, or linker, respectively.
- More complete Cminor parser and printer (contributed by Andrew Tolmach).


Release 1.9.1, 2011-11-28
=========================

Bug fixes:
- Initialization of a char array by a short string literal was wrongly rejected.
- Incorrect handling of volatile arrays.
- IA32 code generator: make sure that min_int / -1 does not cause a
  machine trap.

Improvements:
- Added language option -flongdouble to treat "long double" like "double".
- The reference interpreter (ccomp -interp) now supports 2-argument main
  functions (int main(int, char **)).
- Improved but still very experimental emulation of packed structs
  (-fpacked-structs).
- Coq->Caml extraction: extract Coq pairs to Caml pairs and Coq
  characters to Caml "char" type.


Release 1.9, 2011-08-22
=======================

- The reduction semantics of CompCert C was made executable and turned
  into a reference interpreter for CompCert C, enabling animation of
  the semantics.  (Thanks to Brian Campbell for suggesting this approach.)
  Usage is:  ccomp -interp [options] source.c
  Options include:
    -trace      to print a detailed trace of reduction steps
    -random     to randomize execution order
    -all        to explore all possible execution orders in parallel

- Revised and strengthened the top-level statements of semantic preservation.
  In particular, we now show:
  . backward simulation for the whole compiler without assuming
    a deterministic external world;
  . if the source program goes wrong after performing some I/O,
    the compiled code performs at least these I/O before continuing
    with an arbitrary behavior.

- Fixed two omissions in the semantics of CompCert C
  (reported by Brian Campbell):
  . Function calls through a function pointer had undefined semantics.
  . Conditional expressions "e1 ? e2 : e3" where e2 and e3 have different
    types were missing a cast to their common type.

- Support for "read-modify-write" operations over volatiles
  (such as e++ or --e or e |= 1 where e has volatile type)
  through a new presimplification (flag -fvolatile-rmw, "on" by default).

- New optimization pass: Redundant Reload Elimination, which fixes up
  inefficiencies introduced during the Reload pass.  On x86, it increases
  performance by up to 10%.  On PowerPC and ARM, the effect is negligible.

- Revised handling of annotation statements.  Now they come in two forms:
    1. __builtin_annot("format", x1, ..., xN)
       (arbitrarily many arguments; no code generated, even if some
        of the xi's were spilled; no return value)
    2. __builtin_annot_intval("format", x1)
       (one integer argument, reloaded in a register if needed,
        returned as result).

- Related clean-ups in the handling of external functions and
  compiler built-ins.  In particular, __builtin_memcpy is now
  fully specified.

- ARM code generator was ported to the new ABI (EABI in ARM parlance,
  armel in Debian parlance), using VFP instructions for floating-point.
  (Successfully tested on a Trimslice platform running Ubuntu 11.04.)

- IA32 code generator:
    . Added -fno-sse option to prevent generation of SSE instructions
      for memory copy operations.
    . More realistic modeling of the ST0 (top-of-FP-stack) register
      and of floating-point compare and branch.

- PowerPC code generator: more efficient instruction sequences generated
  for insertion in a bit field and for some comparisons against 0.


Release 1.8.2, 2011-05-24
=========================

- Support for "aligned" and "section" attributes on global variables, e.g.
    __attribute__((aligned(16))) int x;

- Experimental emulation of packed structs (flag -fpacked-structs).

- Pointer comparisons now treated as unsigned comparisons (previously: signed).
  This fixes an issue with arrays straddling the 0x8000_0000 boundary.
  Consequently, the "ofs" part of pointer values "Vptr b ofs" is
  now treated as unsigned (previously: signed).

- Elimination of unreferenced labels now performed by a separate pass
  (backend/CleanupLabels.v) and proved correct.

- Stacking pass revised: supports more flexible layout of the stack
  frame; two-step proof (Stackingproof + Machabstr2concr) merged
  into one single proof (Stackingproof).

- The requirement that pointers be valid in pointer comparisons
  was pushed through all intermediate languages of the back-end
  (previously: requirement present only up to Csharpminor).

- Emulation of assignment between structs and between unions was
  simplified and made more efficient, thanks to a better implementation
  of __builtin_memcpy.

- Improvements to the compiler driver:
    .  -E option now prints preprocessed result to standard output
       instead of saving it in a .i file
    .  support for .s (assembly) and .S (assembly to be preprocessed)
       input files


Release 1.8.1, 2011-03-14
=========================

- Adapted to Coq 8.3pl1.

- Reduced compilation times through several algorithmic improvements
  (contributed by A. Pilkiewicz).

- In the various semantics, allow float-to-int conversions to fail
  (if the float argument is outside the range of representable ints).

- Initialization of global C variables made more robust and proved correct.

- ABI conformance improved:
     . the "char" type is now signed for x86, remains unsigned for PowerPC and ARM
     . placement of bit-fields now follows SVR4 conventions (affects PowerPC)

- Bug fixes in the C pre-simplifier:
     . nontermination with some recursive struct types
     . issues with zero-width bit fields
     . elimination of struct assignments duplicating some volatile accesses


Release 1.8, 2010-09-21
=======================

- The input language to the proved part of the compiler is no longer
  Clight but CompCert C: a larger subset of the C language supporting
  in particular side-effects within expressions.  The transformations
  that pull side effects out of expressions and materialize implicit
  casts, formerly performed by untrusted Caml code, are now fully
  proved in Coq.

- New port targeting Intel/AMD x86 processors.  Generates 32-bit x86 code
  using SSE2 extensions for floating-point arithmetic.  Works under
  Linux, MacOS X, and the Cygwin environment for Windows.
  CompCert's compilation strategy is not a very good match for the
  x86 architecture; therefore, the performance of the generated code
  is not as good as for the PowerPC port, but still usable.
  (About 75% of the performance of gcc -O1 for x86, compared with
   > 90% for PowerPC.)

- More faithful semantics for volatile accesses:
  . volatile reads and writes from a volatile global variable are treated
    like input and output system calls (respectively), bypassing
    the memory model entirely;
  . volatile reads and writes from other locations are treated like
    regular loads and stores.

- Introduced __builtin_memcpy() and __builtin_memcpy_words(), use them
  instead of memcpy() to compile struct and union assignments.

- Introduced __builtin_annotation() to transmit assertions from
  the source program all the way to the generated assembly code.

- Elimination of some useless casts around "&", "|" and "^" bitwise operators.

- Produce fewer "moves" during RTL generation.  This speeds up the
  rest of compilation and slightly improves the result of register
  allocation when register pressure is high.

- Improvements in register allocation:
  . Implemented a spilling heuristic during register allocation.
    This heuristic reduces significantly the amount of spill code
    generated when register pressure is high.
  . More coalescing between low-pressure and high-pressure variables.
  . Aggressive coalescing between pairs of spilled variables.

- Fixed some bugs in the emulation of bit fields.


Release 1.7.1, 2010-04-13
=========================

Bug fixes in the new C pre-simplifier:
- Missing cast on return value for some functions
- Incorrect simplification of some uses of || and &&
- Nontermination in the presence of a bit field of size exactly 32 bits.
- Global initializers for structs containing bit fields.
- Wrong type in volatile reads from variables of type 'unsigned int'.

Small improvements to the PowerPC port:
- Added __builtin_trap() built-in function.
- Support for '#pragma reserve_register' (EABI)
- Less aggressive alignment of global variables.
- Generate '.type' and '.size' directives (EABI).


Release 1.7, 2010-03-31
=======================

- New implementation of the C type-checker, simplifier, and translation to
  Clight.  Compared with the previous CIL-based solution, the new
  implementation is more modular and supports more optional simplifications.

- More features of the C language are handled by expansion during
  translation to Clight:
    . assignment between structs and unions (option -fstruct-assign)
    . passing structs and union by value (option -fstruct-passing)
    . bit-fields in structs (option -fbitfields)

- The "volatile" modifier is now honored.  Volatile accesses are represented
  in Clight by calls to built-in functions, which are preserved throughout
  the compilation chain, then turned into processor loads and stores
  at the end.

- Generic support for C built-in functions.  These predefined external
  functions give access to special instructions of the processor.  See
  powerpc/CBuiltins.ml for the list of PowerPC built-in functions.

- The memory model now exposes the bit-level in-memory representation
  of integers and floats.  This strengthens the semantic preservation
  theorem: we now prove that C code that directly manipulates these
  bit-level representations (e.g. via a union between floats and integers)
  is correctly compiled.
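
  The kind of code covered by the strengthened theorem can be sketched
  as follows.  The names float_bits, bits_of_float and float_of_bits are
  hypothetical, and the example assumes that "float" is an IEEE 754
  single-precision number and that "unsigned int" is 32 bits wide.

```c
/* Hypothetical union overlaying a float and a 32-bit integer. */
union float_bits {
    float f;
    unsigned int i;
};

/* Return the bit-level representation of x. */
unsigned int bits_of_float(float x)
{
    union float_bits u;
    u.f = x;
    return u.i;
}

/* Return the float whose bit-level representation is b. */
float float_of_bits(unsigned int b)
{
    union float_bits u;
    u.i = b;
    return u.f;
}
```

  Under these assumptions, bits_of_float(1.0f) is 0x3F800000 and
  float_of_bits(0x40000000) is 2.0f.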

- The memory model now supports fine-grained access control to individual
  bytes of a memory block.  This feature is currently unused in the
  compiler proofs, but will facilitate connections with separation logics
  later.

- External functions are now allowed to read and modify memory.
  The semantic preservation proofs were strengthened accordingly.
  In particular, this enables the malloc() and free() C library functions
  to be modeled as external functions in a provably correct manner.

- Minor improvements in the handling of global environments and the
  construction of the initial memory state.

- Bug fixes in the handling of '#pragma section' and '#pragma set_section'.

- The C test suite was enriched and restructured.


Release 1.6, 2010-01-13
=======================

- Support Clight initializers of the form "int * x = &y;".

- Fixed spurious compile-time error on Clight initializers of the form
  "const enum E x[2] = { E_1, E_2 };".

- Produce an informative error message if a 'return' without argument
  occurs in a non-void function, or if a 'return' with an argument
  occurs in a void function.

- Preliminary support for '#pragma section' and '#pragma set_section'.

- Preliminary support for small data areas in PowerPC code generator.

- Back-end: added support for jump tables; used them to compile
  dense 'switch' statements.
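
  For illustration, a 'switch' whose case labels cover a contiguous range,
  as in the hypothetical function below, is the typical shape of a dense
  'switch'.  Whether a jump table is actually chosen for a given
  statement depends on compiler heuristics not described in this
  changelog.

```c
/* Hypothetical example: case labels 1..12 form a dense range. */
int days_in_month(int month)
{
    switch (month) {
    case 1:  return 31;
    case 2:  return 28;   /* leap years ignored for brevity */
    case 3:  return 31;
    case 4:  return 30;
    case 5:  return 31;
    case 6:  return 30;
    case 7:  return 31;
    case 8:  return 31;
    case 9:  return 30;
    case 10: return 31;
    case 11: return 30;
    case 12: return 31;
    default: return 0;    /* not a valid month */
    }
}
```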

- PowerPC code generator: force conversion to single precision before
  doing a "store single float" instruction.


Release 1.5, 2009-08-28
=======================

- Support for "goto" in the source language Clight.

- Added small-step semantics for Clight.

- Traces for diverging executions are now uniquely defined;
  tightened semantic preservation results accordingly.

- Emulated assignments between structures
  (during the C to Clight initial translation).

- Fixed spurious compile-time error on Clight statements of the form
  "x = f(...);" where x is a global variable.

- Fixed spurious compile-time error on Clight initializers where
  the initial value is the result of a floating-point computation
  (e.g. "double x = 3.14159 / 2;").

- Simplified the interface of the generic dataflow solver.

- Reduced running time and memory requirements for the constant propagation
  pass.

- Improved the implementation of George and Appel's graph coloring heuristic:
  runs faster, produces better results.

- Revised the implementation of branch tunneling.

- Improved modularization between processor-dependent and
  processor-independent parts.


Release 1.4.1, 2009-06-05
=========================

- Adapted to Coq 8.2-1.  No changes in functionality.


Release 1.4, 2009-04-20
=======================

- Modularized the processor dependencies in the back-end.

- Three target architectures are now supported:
       PowerPC / MacOS X       (most mature)
       PowerPC / EABI & Linux  (getting stable)
       ARM / Linux EABI        (still experimental)

- Added alignment constraints to the memory model.

- Clight: added support for conditional expressions (a ? b : c);
  removed support for array accesses a[i], now a derived form.
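
  A small sketch of both points, using hypothetical names (sample,
  max_elem, index_equiv): a conditional expression, and an array access
  a[i] written next to the pointer form *(a + i) that standard C defines
  it to be equivalent to.

```c
/* Hypothetical data for the example. */
static const int sample[4] = { 3, 9, 2, 7 };

/* Largest of the first n elements of a (n >= 1). */
int max_elem(const int *a, int n)
{
    int best = a[0];
    int i;
    for (i = 1; i < n; i++)
        best = (a[i] > best) ? a[i] : best;   /* conditional expression */
    return best;
}

/* a[i] and *(a + i) denote the same element. */
int index_equiv(const int *a, int i)
{
    return a[i] == *(a + i);
}
```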

- C front-end: honor "static" modifiers on globals.

- New optimization over RTL: turning calls into tail calls when possible.
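
  As an illustration, the recursive call in the hypothetical function
  below is in tail position: its result is returned directly, with no
  further computation in the caller.  Calls of this shape are candidates
  for the optimization; the exact conditions under which it applies are
  not stated in this changelog.

```c
/* Hypothetical example: Euclid's algorithm with a call in tail position. */
unsigned int gcd(unsigned int a, unsigned int b)
{
    if (b == 0)
        return a;
    return gcd(b, a % b);   /* nothing remains to be done after the call */
}
```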

- Instruction selection pass: elimination of redundant casts following
  a memory load of a "small" memory quantity.
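
  A sketch of the kind of source code concerned, under the assumption
  that "small" refers to 8-bit and 16-bit quantities: loading through an
  "unsigned char" pointer already produces a value in the range 0..255,
  so the explicit cast below adds nothing.  The names sample_bytes,
  load_byte and first_sample are hypothetical.

```c
/* Hypothetical data for the example. */
static const unsigned char sample_bytes[2] = { 200, 7 };

unsigned int load_byte(const unsigned char *p)
{
    /* The loaded value is already an 8-bit unsigned quantity,
       so this cast is redundant. */
    return (unsigned char) *p;
}

unsigned int first_sample(void)
{
    return load_byte(sample_bytes);
}
```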

- Linearization pass: improved the linearization heuristic.

- Reloading pass: more economical use of temporaries.

- Back-end: removed "alloc heap" instruction; removed pointer validity
  checks in pointer comparisons.


Release 1.3, 2008-08-11
=======================

- Added "goto" and labeled statements to Cminor.  Extended RTLgen and
    its proof accordingly.

- Introduced small-step transition semantics for Cminor; used it in
    proof of RTLgen pass; proved consistency of Cminor big-step semantics
    w.r.t. transition semantics.

- Revised division of labor between the Allocation pass and the Reload pass.
    The semantics of LTL and LTLin no longer need to anticipate the passing
    of arguments through the conventional locations.

- Cleaned up Stacking pass: the positions of the back link and of
    the return address in the stack frame are no longer hard-wired
    in the Mach semantics.

- Added operator to convert from float to unsigned int; used it in C front-end.

- Added flag -fmadd to control recognition of fused multiply-add and
    multiply-subtract.
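
  For illustration, the expressions in the hypothetical functions below
  have the multiply-then-add and multiply-then-subtract shapes that a
  fused instruction can compute in one step.  A fused operation rounds
  once instead of twice, so results may differ in the last bit from the
  unfused computation on some inputs.

```c
/* Hypothetical examples of multiply-add and multiply-subtract shapes. */
double mul_add(double a, double x, double y)
{
    return a * x + y;
}

double mul_sub(double a, double x, double y)
{
    return a * x - y;
}
```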

- Semantics of pointer-pointer comparison in Clight was incomplete:
    pointers within different blocks can now be compared using == or !=.

- Addition integer + pointer is now supported in Clight.
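
  The two pointer-related items above can be sketched as follows, with
  hypothetical names: x and y are distinct global variables, hence
  distinct memory blocks, and third_entry writes the integer operand on
  the left of the addition.

```c
/* Hypothetical globals: x and y live in different memory blocks. */
static int x, y;
static const int table[3] = { 10, 20, 30 };

/* Equality test between two pointers, possibly into different blocks. */
int ptr_equal(const int *p, const int *q)
{
    return p == q;
}

int distinct_globals(void)
{
    return ptr_equal(&x, &y);   /* different blocks: not equal */
}

int third_entry(void)
{
    return *(2 + table);        /* integer + pointer */
}
```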

- Improved instruction selection for complex conditions involving || and &&.

- Improved translation of Cminor "switch" statements to RTL decision trees.

- Fixed error in C parser and simplifier related to "for" loops with
    complex expressions as condition.

- More benchmark programs in test/.


Release 1.2, 2008-04-03
=======================

- First public release