aboutsummaryrefslogtreecommitdiff
path: root/README.md
blob: 4aeb68c337d1a585a455a6f196cc97fa8739b4ad (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
Fiat-Crypto: Synthesizing Correct-by-Construction Code for Cryptographic Primitives
=====

Building
-----
[![Build Status](https://api.travis-ci.org/mit-plv/fiat-crypto.png?branch=master)](https://travis-ci.org/mit-plv/fiat-crypto)

This repository requires coq 8.8 or later. 8.7 may work, but we don't use it ourselves.

Git submodules are used for some dependencies. If you did not clone with `--recursive`, run

    git submodule update --init --recursive

To build (if your COQPATH variable is empty):

       make

To build:

       export COQPATH="$(pwd)/coqprime${COQPATH:+:}$COQPATH"
       make

Usage
-----

The Coq development builds binary compilers that generate code using some implementation strategy.
The parameters (modulus, hardware multiplication input bitwidth, etc.) are are specified on the command line of the compiler.
The generated C code is written to standard output.

A collection of C files for popular curves can be made with

    make c-files

The C files will appear in the top-level directory.

Just the compilers generating these C files can be made with

    make standalone

or `make standalone-haskell` or `make standalone-ocaml` for binaries generated with just one compiler.
The binaries are located in `src/ExtractionOcaml/` and `src/ExtractionHaskell` respectively.

There is a separate compiler binary for each implementation strategy:

 - `saturated_solinas`
 - `unsaturated_solinas`
 - `word_by_word_montgomery`

Passing no arguments, or passing `-h` or `--help` (or any other invalid arguments) will result in a usage message being printed.  These binaries output C code on stdout.

Here are some examples of ways to invoke the binaries (from the directories that they live in):

    # Generate code for 2^255-19
    ./unsaturated_solinas '25519' '5' '2^255' '1,19' '64' carry_mul carry_square carry_scmul121666 carry add sub opp selectznz to_bytes from_bytes > curve25519_64.c
    ./unsaturated_solinas '25519' '10' '2^255' '1,19' '32' carry_mul carry_square carry_scmul121666 carry add sub opp selectznz to_bytes from_bytes > curve25519_32.c

    # Generate code for NIST-P256 (2^256 - 2^224 + 2^192 + 2^96 - 1)
    ./word_by_word_montgomery 'p256' '2^256' '2^224,1;2^192,-1;2^96,-1;1,1' '32' > p256_32.c
    ./word_by_word_montgomery 'p256' '2^256' '2^224,1;2^192,-1;2^96,-1;1,1' '64' > p256_64.c
    
Reading About The Code
----------------------

- [Andres Erbsen, Jade Philipoom, Jason Gross, Robert Sloan, Adam Chlipala. Simple High-Level Code For Cryptographic Arithmetic -- With Proofs, Without Compromises. To Appear in Proceedings of the IEEE Symposium on Security & Privacy 2019 (S&P'19). May 2019.](http://adam.chlipala.net/papers/FiatCryptoSP19/FiatCryptoSP19.pdf). This paper describes multiple field arithmetic implementations, and an older version of the compilation pipeline (preserved [here](https://github.com/mit-plv/fiat-crypto/tree/sp2019latest)). It is somewhat space-constrained, so some details are best read about in theses below.
- [Jade Philipoom. Correct-by-Construction Finite Field Arithmetic in Coq. MIT Master's Thesis. February 2018.](http://adam.chlipala.net/theses/jadep_meng.pdf) Chapters 3 and 4 contain a detailed walkthrough of the field arithmetic implementations (again, targeting the previous compilation pipeline).
- [Andres Erbsen. Crafting Certified Elliptic CurveCryptography Implementations in Coq. MIT Master's Thesis. June 2017.](
http://adam.chlipala.net/theses/andreser_meng.pdf) Section 3 contains a whirlwind introduction to synthesizing field arithmetic code using coq, without assuming Coq skills, but covering a tiny fraction of the overall library. Sections 5 and 6 contain the only write-up on the ellitpic-curve library in this repository.
- The newest compilation pipeline does not have a separate document yet, but this README does go over it in some detail.


Reading The Code
----------------

### Demo of Synthesis

The idea of the synthesis process is demoed in [`src/Demo.v`](./src/Demo.v).
We strongly recommend reading this before studying the full-scale system.

### Actual Synthesis Pipeline

The ordering of files (eliding `*Proofs.v` files) is:

```
                     Language.v ←──────────────────────────────────────────────────┬───────────────────────────────┐
                     ↗           ↖                                                 │                               │
                  ↗                  ↖                                             │                               │
                 UnderLets.v          GENERATEDIdentifiersWithoutTypes.v     MiscCompilerPasses.v  PushButtonSynthesis/ReificationCache.v
                    ↑       ↖        ↗                                             ↑                               ↑
AbstractInterpretation.v    Rewriter.v                                             │                               │
         ↑                  ↑ ┌────────────────────────────────────────────────────┘                               │
CStringification.v          │ │               Arithmetic.v                                                         │
   ↑    ┌───────────────────┴─┘                   ↑                                                                │
BoundsPipeline.v                        COperationSpecifications.v                                                 │
      ↑                                           ↑                                                                │
      │  ┌────────────────────────────────────────┴────────────────────────────────────────────────────────────────┘
PushButtonSynthesis.v ←── Toplevel2.v ←───────────┐
   ↑                                              │
CLI.v                         SlowPrimeSynthesisExamples.v
↑  ↑
│  └────────────────────────────┐
StandaloneHaskellMain.v   StandaloneOCamlMain.v
        ↑                           ↑
ExtractionHaskell.v          ExtractionOCaml.v
```

The files contain:

- Arithmetic.v: All of the high-level field arithmetic stuff

- Language.v:
  + PHOAS
  + reification
  + denotation/intepretation
  + utilities for inverting PHOAS exprs
  + default/dummy values of PHOAS exprs
  + default instantiation of generic PHOAS types
  + gallina reification of ground terms
  + Flat/indexed syntax trees, and conversions to and from PHOAS
  Defines the passes:
  + ToFlat
  + FromFlat
  + GeneralizeVar

- UnderLets.v: the UnderLets monad, a pass that does substitution of var-like
  things, a pass that inserts let-binders in the next-to-last line of code,
  substituting away var-like things (this is used to ensure that when we output
  C code, aliasing the input and the output arrays doesn't cause issues).
  Defines the passes:
  + SubstVarFstSndPairOppCast

- AbstractInterpretation.v: type-code-based ZRange definitions, abstract
  interpretation of identifiers (which does let-lifting, for historical reasons,
  and the dependency on UnderLets should probably be removed), defines the
  passes:
  + PartialEvaluateWithBounds
  + PartialEvaluateWithListInfoFromBounds
  + CheckPartialEvaluateWithBounds

- GENERATEDIdentifiersWithoutTypes.v: generated by a python script which is
  included in a comment in the file, this is an untyped version of identifiers
  for the rewriter

- Rewriter.v: rewrite rules, rewriting.  Defines the passes:
  + Rewrite
  + RewriteToFancy
  + PartialEvaluate (which is just a synonym for Rewrite)

- MiscCompilerPasses.v: Defines the passes:
  + EliminateDead (dead code elimination)
  + Subst01 (substitute let-binders used 0 or 1 times)

- CStringification.v: conversion to C code as strings.  (Depends on
  AbstractInterpretation.v for ZRange utilities.)  Defines the passes:
  + ToString.ToFunctionLines
  + ToString.ToFunctionString
  + ToString.LinesToString

- CompilersTestCases.v: Various test cases to ensure everything is working

- BoundsPipeline.v: Assemble the various compiler passes together into
  a composed pipeline.  It is the final interface for the compiler.
  Also contains some tactics for applying the BoundsPipeline
  correctness lemma.

- COperationSpecifications.v: The specifications for the various
  operations to be synthesized.

- PushButtonSynthesis/ReificationCache.v: Defines the cache that holds
  reified versions of operations, as well as the tactics that reify
  and apply things from the cache.

- PushButtonSynthesis.v: Reifies the various operations from
  Arithmetic.v, definies the compositions of the BoundsPipeline with
  these operations, proves that their interpretations satisfies the
  specs from COperationSpecifications.v, assembles the reified
  post-bounds operations into synthesis targets.  This is the file
  that CLI.v depends on.

- Toplevel2.v: Some not-quite-finished-but-kind-of-slow pipeline stuff
  + all the stuff that uses compilers + arithmetic, together with more
  examples.  Also has semi-broken fancy-machine stuff.  This should
  probably be merged into Toplevel1.v when working on the pipeline.

- SlowPrimeSynthesisExamples.v: Additional uses of the pipeline for
  primes that are kind-of slow, which I don't want extraction blocking
  on.

- CLI.v: Setting up all of the language-independent parts of extraction; relies
  on having a list of strings-or-error-messages for each pipeline, and on the
  arguments to that pipeline, and builds a parser for command line arguments for
  that.

- StandaloneHaskellMain.v, StandaloneOCamlMain.v, ExtractionHaskell.v,
  ExtractionOCaml.v: Extraction of pipeline to various languages


Proofs files:
For Language.v, there is a semi-arbitrary split between two files
`LanguageInversion` and `LanguageWf`.
- LanguageInversion.v:
  + classifies equality of type codes and exprs
  + type codes have decidable equality
  + correctness of the various type-transport definitions
  + correctness lemmas for the various `expr.invert_*` definitions
  + correctness lemmas for the various `reify_*` definitions in Language.v
  + inversion_type, which inverts equality of type codes
  + type_beq_to_eq, which converts boolean equality of types to
    Leibniz equality
  + rewrite_type_transport_correct, which rewrites with the
    correctness lemmas of the various type-transport definitions
  + `type.invert_one e` which does case analysis on any inductive type
     indexed over type codes, in a way that preserves information
     about the type of `e`, and generally works even when the goal is
     dependently typed over `e` and/or its type
  + ident.invert, which does case-anaylsis on idents whose type has
    structure (i.e., is not a var)
  + ident.invert_match, which does case-analysis on idents appearing as
    the discriminee of a `match` in the goal or in any hypothesis
  + expr.invert, which does case-anaylsis on exprs whose type has
    structure (i.e., is not a var)
  + expr.invert_match, which does case-analysis on exprs appearing as
    the discriminee of a `match` in the goal or in any hypothesis
  + expr.invert_subst, which does case-analysis on exprs which show up
    in hypotheses of the form `expr.invert_* _ = Some _`
  + expr.inversion_expr, which inverts equalities of exprs

- LanguageWf.v: Depends on LanguageInversion.v
  Defines:
  + expr.wf, expr.Wf, expr.wf3, expr.Wf3
  + GeneralizeVar.Flat.wf
  + expr.inversion_wf (and variants), which invert `wf` hypotheses
  + expr.wf_t (and variants wf_unsafe_t and wf_safe_t) which make
     progress on `wf` goals; `wf_safe_t` should never turn a provable
     goal into an unprovable one, while `wf_unsafe_t` might.
  + expr.interp_t (and variants), which should make progress on
    equivalence-of-interp hypotheses and goals, but is not used much
    (mainly because I forgot I had defined it)
  + prove_Wf, which proves wf goals on concrete syntax trees in a more
    optimized way than `repeat constructor`
  Proves:
  + funext → (type.eqv ↔ Logic.eq) (`eqv_iff_eq_of_funext`)
  + type.related and type.eqv are PERs
  + Proper instances for type.app_curried, type.and_for_each_lhs_of_arrow
  + type.is_not_higher_order → Reflexive (type.and_for_each_lhs_of_arrow type.eqv)
  + iff between type.related{,_hetero} and related of type.app_curried
  + various properties of type.and{,b_bool}for_each_lhs_of_arrow
  + various properties of type.eqv and ident.{gen_,}interp
  + various properties of ident.cast
  + various properties of expr.wf (particularly of things defined in Language.v)
  + interp and wf proofs for the passes to/from Flat

- UnderLetsProofs.v: wf and interp lemmas for the various passes defined in UnderLets.v
- MiscCompilerPassesProofs.v: wf and interp lemmas for the various passes defined in MiscCompilerPasses.v
- GENERATEDIdentifiersWithoutTypesProofs.v: various lemmas about the definitions in GENERATEDIdentifiersWithoutTypes.v
- AbstractInterpretationZRangeProofs.v: Proves correctness lemmas of the per-operation zrange-bounds-analysis functions
- AbstractInterpretationWf.v: wf lemmas for the AbstractInterpretation pass
- AbstractInterpretationProofs.v: interp lemmas for the
  AbstractInterpretation pass, and also correctness lemmas that
  combine Wf and interp
- RewriterWf1.v: a semi-arbitrary collection of proofs and definitions
  that are mostly shared between the wf proofs and the interp proofs.
  Importantly, this file defines everything needed to state and prove
  that specific rewrite rules are correct.
- RewriterRulesGood.v: Proves that all specific rewriter rules are
  good enough for wf proofs (as specified in RewriterWf1.v)
- RewriterRulesInterpGood.v: Proves that all specific rewriter rules
  are good enough for interp proofs (as specified in RewriterWf1.v)
- RewriterWf2.v: Proves wf of the generic rewriter pass, taking in
  specific-rewriter-rule-correctness as a hypothesis
- RewriterInterpProofs1.v: Proves interp-correctness of the generic
  rewriter pass, taking in specific-rewriter-rule-correctness as a
  hypothesis
- RewriterProofs.v: assembles the proofs from RewriterRulesGood.v,
  RewriterRulesInterpGood.v, RewriterWf2.v, and
  RewriterInterpProofs1.v into correctness lemmas for the overall
  rewriter pass, specialized and pre-evaluated for specific rewrite
  rules

There are also proofs about elliptic curves, for example:

- [`src/Curves/Edwards/AffineProofs.v`](./src/Curves/Edwards/AffineProofs.v),
- [`src/Curves/Edwards/XYZT/Basic.v`](./src/Curves/Edwards/XYZT/Basic.v),
- [`src/Curves/Montgomery/AffineProofs.v`](./src/Curves/Montgomery/AffineProofs.v),
- [`src/Curves/Montgomery/XZProofs.v`](src/Curves/Montgomery/XZProofs.v).