author    Andres Erbsen <andreser@mit.edu>  2019-01-08 04:21:38 -0500
committer Andres Erbsen <andreser@mit.edu>  2019-01-09 22:49:02 -0500
commit 3ca227f1137e6a3b65bc33f5689e1c230d591595 (patch)
tree   e1e5a2dd2a2f34f239d3276227ddbdc69eeeb667 /README.md
parent 3ec21c64b3682465ca8e159a187689b207c71de4 (diff)
remove old pipeline
Diffstat (limited to 'README.md')
-rw-r--r--  README.md  339
1 file changed, 224 insertions(+), 115 deletions(-)
diff --git a/README.md b/README.md
index 0215e2e2c..5417ee120 100644
--- a/README.md
+++ b/README.md
@@ -1,59 +1,45 @@
-Fiat-Crypto: Synthesizing Correct-by-Construction Code for Cryptographic Primitives
+Fiat-Crypto: Synthesizing Correct-by-Construction Code for Cryptographic Primitives
=====
-[![Build Status](https://api.travis-ci.org/mit-plv/fiat-crypto.png?branch=master)](https://travis-ci.org/mit-plv/fiat-crypto)
-
-See [src/README.md](src/README.md).
-
-OLD PIPELINE README BELOW
-=========================
-
-
-Build Requirements
+Building
-----
-This repository requires:
-- To build the proofs and almost-C-like outputs: [Coq 8.7](https://github.com/coq/coq/releases/tag/V8.7.2) or [Coq 8.8](https://github.com/coq/coq/releases/tag/V8.8.0) (tested with 8.7.2, 8.8.0)
-If you checked out this repository via `git`, you should run
+This repository requires Coq 8.8 or later. Coq 8.7 may work, but we don't use it ourselves.
+
+Git submodules are used for some dependencies. If you did not clone with `--recursive`, run
git submodule update --init --recursive
To build (if your COQPATH variable is empty):
- make
+ make
To build:
- export COQPATH="$(pwd)/coqprime${COQPATH:+:}$COQPATH"
- make
-
-You may get non-fatal errors such as `make: execvp: /bin/sh: Argument list too long`; this is an artifact of the [old pipeline](#old-pipeline) and is a [bug in Coq](https://github.com/coq/coq/issues/7584) that will prevent `make install` from succeeding, but will not impact the build itself.
+ export COQPATH="$(pwd)/coqprime${COQPATH:+:}$COQPATH"
+ make
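The `${COQPATH:+:}` in the export above is shell parameter expansion: `${VAR:+word}` expands to `word` only when `VAR` is set and non-empty, so the `:` separator is inserted only when there is an existing path to append. A small sketch of the idiom (using a made-up path in place of `$(pwd)/coqprime`):

```shell
# ${VAR:+word} expands to "word" only when VAR is set and non-empty, so the
# ":" separator appears only when there is an existing path to append.
COQPATH=""
COQPATH="/repo/coqprime${COQPATH:+:}$COQPATH"
echo "$COQPATH"    # no stray trailing colon

COQPATH="/existing/path"
COQPATH="/repo/coqprime${COQPATH:+:}$COQPATH"
echo "$COQPATH"    # colon-separated, existing path preserved
```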
-New Pipeline
+Usage
-----
-The new pipeline (nearly finished) generates binaries that take in arguments on the command line and print out C code on stdout.
-A collection of C files can be made with
+The Coq development builds binary compilers that generate code using some implementation strategy.
+The parameters (modulus, hardware multiplication input bitwidth, etc.) are specified on the command line of the compiler.
+The generated C code is written to standard output.
+
+A collection of C files for popular curves can be made with
make c-files
The C files will appear in the top-level directory.
-The binaries generating these C files can be made with
+Just the compilers generating these C files can be made with
make standalone
or `make standalone-haskell` or `make standalone-ocaml` for binaries generated with just one compiler.
+The binaries are located in `src/ExtractionOCaml/` and `src/ExtractionHaskell/`.
-The binaries are located in
-
- src/Experiments/NewPipeline/ExtractionOCaml/
-
-or
-
- src/Experiments/NewPipeline/ExtractionHaskell/
-
-The binaries are:
+There is a separate compiler binary for each implementation strategy:
- `saturated_solinas`
- `unsaturated_solinas`
@@ -71,95 +57,218 @@ Here are some examples of ways to invoke the binaries (from the directories that
./word_by_word_montgomery 'p256' '2^256' '2^224,1;2^192,-1;2^96,-1;1,1' '32' > p256_32.c
./word_by_word_montgomery 'p256' '2^256' '2^224,1;2^192,-1;2^96,-1;1,1' '64' > p256_64.c
-Old Pipeline
-----
-
-To build a representative subset, instead of `make`, run
-
- make selected-specific-display non-specific
-
-The core library (target `nonautogenerated-specific-display non-specific`) is expected to build within about one hour (in serial, on a particularly fast machine), within about 3.5--4 GB of RAM, plus an additional 1-5 minutes to run `coqdep` on all of our files. Of the additional curves, 90% finish within 5 minutes per file, and 99% finish within 15 minutes per file (though some take upwards of 50 GB of RAM). These 99% of curves together take about 60 hours.
-
-- To build the C outputs: Python (2 or 3)
-
-To build:
-
- make c
-
-or, for a representative subset,
-
- make selected-c
-
-- To build and run the binaries, gcc (7.1.2; or 7.1.1 with `-fno-peephole2`) or clang of a new enough version to support the appropriate compiler intrinsics.
-
-To build and run:
-
- make test bench
-or, for a representative subset,
+Reading The Code
+----------------
- make selected-test selected-bench
-
-Exploring the code
------
-
-### Push-button synthesis
-
-To add a new prime, add a new line to [`primes.txt`](https://github.com/mit-plv/fiat-crypto/blob/master/primes.txt) with your prime. Then run
-
- ./generate_parameters.py; ./src/Specific/CurveParameters/remake_curves.sh
-
-You will see a bunch of lines starting with `git add`; any `.v` files that show up in these lines must be added to `_CoqProject`. (If you are working in the git repository, you may simply run the `git add` lines, and then run `make update-_CoqProject`.) Note that the directory structure involves a textual representation of your prime, call it `${REPRESENTATION_OF_YOUR_PRIME}`. To build, you can run
-
- make src/Specific/{montgomery,solinas}{32,64}_${REPRESENTATION_OF_YOUR_PRIME}/fibe
-
-This will build all the necessary `.vo` and `.c` files, as well as the `fibe` binary, which is automatically run when you `make bench` or `make test`.
+### Demo of Synthesis
-### Curve Correctness Theorems
+The idea of the synthesis process is demoed in [`src/Demo.v`](./src/Demo.v).
+We strongly recommend reading this before studying the full-scale system.
+
+### Actual Synthesis Pipeline
+
+The ordering of files (eliding `*Proofs.v` files) is:
+
+```
+ Language.v ←──────────────────────────────────────────────────┬───────────────────────────────┐
+ ↗ ↖ │ │
+ ↗ ↖ │ │
+ UnderLets.v GENERATEDIdentifiersWithoutTypes.v MiscCompilerPasses.v PushButtonSynthesis/ReificationCache.v
+ ↑ ↖ ↗ ↑ ↑
+AbstractInterpretation.v Rewriter.v │ │
+ ↑ ↑ ┌────────────────────────────────────────────────────┘ │
+CStringification.v │ │ Arithmetic.v │
+ ↑ ┌───────────────────┴─┘ ↑ │
+BoundsPipeline.v COperationSpecifications.v │
+ ↑ ↑ │
+ │ ┌────────────────────────────────────────┴────────────────────────────────────────────────────────────────┘
+PushButtonSynthesis.v ←── Toplevel2.v ←───────────┐
+ ↑ │
+CLI.v SlowPrimeSynthesisExamples.v
+↑ ↑
+│ └────────────────────────────┐
+StandaloneHaskellMain.v StandaloneOCamlMain.v
+ ↑ ↑
+ExtractionHaskell.v ExtractionOCaml.v
+```
+
+The files contain:
+
+- Arithmetic.v: All of the high-level field arithmetic stuff
+
+- Language.v:
+ + PHOAS
+ + reification
+ + denotation/interpretation
+ + utilities for inverting PHOAS exprs
+ + default/dummy values of PHOAS exprs
+ + default instantiation of generic PHOAS types
+ + Gallina reification of ground terms
+ + Flat/indexed syntax trees, and conversions to and from PHOAS
+ Defines the passes:
+ + ToFlat
+ + FromFlat
+ + GeneralizeVar
+
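As a rough illustration of the PHOAS idea used in Language.v, here is a hypothetical Python sketch (not the Coq definitions): binders are represented as host-language functions, so the interpreter needs no environment and substitution is capture-avoiding by construction.

```python
# Hypothetical mini-PHOAS sketch, not the actual Coq development's types.
class Const:
    def __init__(self, n): self.n = n

class Var:
    def __init__(self, v): self.v = v

class Add:
    def __init__(self, a, b): self.a, self.b = a, b

class LetIn:
    # body is a host-language function from a bound "var" to an expression
    def __init__(self, rhs, body): self.rhs, self.body = rhs, body

def interp(e):
    """Denotation: no variable environment is needed, because binders are
    functions that we simply apply to the interpreted right-hand side."""
    if isinstance(e, Const): return e.n
    if isinstance(e, Var):   return e.v
    if isinstance(e, Add):   return interp(e.a) + interp(e.b)
    if isinstance(e, LetIn): return interp(e.body(Var(interp(e.rhs))))

# let x := 2 + 3 in x + x
expr = LetIn(Add(Const(2), Const(3)), lambda x: Add(x, x))
print(interp(expr))  # 10
```

The ToFlat/FromFlat passes mentioned above go between this function-based representation and an indexed ("flat") one that is easier to compute with syntactically.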
+- UnderLets.v: the UnderLets monad; a pass that substitutes away var-like
+  things; and a pass that inserts let-binders in the next-to-last line of
+  code while substituting away var-like things (this is used to ensure that,
+  when we output C code, aliasing between the input and output arrays doesn't
+  cause issues).
+ Defines the passes:
+ + SubstVarFstSndPairOppCast
+
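The UnderLets idea can be sketched as follows (a hypothetical Python analogy, not the Coq monad): a computation carries the let-binders it wants placed above its result, and `bind` concatenates them in order.

```python
# Hypothetical analogy for the UnderLets monad: a value together with the
# let-binders that must be emitted above it; bind sequences the binders.
class UnderLets:
    def __init__(self, lets, value):
        self.lets, self.value = lets, value   # lets: list of (name, rhs)

    def bind(self, f):
        r = f(self.value)
        return UnderLets(self.lets + r.lets, r.value)

def let_in(name, rhs):
    """Emit one let-binder and return its bound name as the value."""
    return UnderLets([(name, rhs)], name)

prog = let_in('x1', '2 * a').bind(lambda x: let_in('x2', f'{x} + 1'))
print(prog.lets, prog.value)
```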
+- AbstractInterpretation.v: type-code-based ZRange definitions, abstract
+ interpretation of identifiers (which does let-lifting, for historical reasons,
+ and the dependency on UnderLets should probably be removed), defines the
+ passes:
+ + PartialEvaluateWithBounds
+ + PartialEvaluateWithListInfoFromBounds
+ + CheckPartialEvaluateWithBounds
+
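To give a flavor of what the bounds analysis computes, here is a toy interval ("ZRange"-style) sketch in Python. It is a hypothetical illustration of the kind of reasoning behind PartialEvaluateWithBounds and CheckPartialEvaluateWithBounds, not the Coq implementation.

```python
# Toy interval arithmetic: each value is tracked as an inclusive (lo, hi)
# range, and a final check asks whether the result fits a target range
# (e.g. a machine word).
def add_bounds(x, y):
    (xl, xh), (yl, yh) = x, y
    return (xl + yl, xh + yh)

def mul_bounds(x, y):
    # With possibly-negative endpoints, the extremes are among the four
    # endpoint products.
    (xl, xh), (yl, yh) = x, y
    prods = [xl * yl, xl * yh, xh * yl, xh * yh]
    return (min(prods), max(prods))

def fits(r, target):
    return target[0] <= r[0] and r[1] <= target[1]

# Do limbs bounded by 2^26 - 1 overflow a 64-bit word under a*b + b?
a = b = (0, 2**26 - 1)
r = add_bounds(mul_bounds(a, b), b)
print(r, fits(r, (0, 2**64 - 1)))
```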
+- GENERATEDIdentifiersWithoutTypes.v: generated by a Python script which is
+  included in a comment in the file; this is an untyped version of the
+  identifiers, for use by the rewriter
+
+- Rewriter.v: rewrite rules, rewriting. Defines the passes:
+ + Rewrite
+ + RewriteToFancy
+ + PartialEvaluate (which is just a synonym for Rewrite)
+
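A bottom-up rewriting pass of this kind can be sketched as follows (a hypothetical Python illustration; the actual rule set and rewriting engine live in Rewriter.v). Constant folding is what makes PartialEvaluate a sensible synonym for Rewrite.

```python
# Hypothetical sketch of a bottom-up rewrite pass: rewrite subterms first,
# then try constant folding and algebraic simplifications at each node.
# Expressions are ints, variable names, or ('add'|'mul', arg, arg) tuples.
def rewrite(e):
    if isinstance(e, tuple):
        op, *args = e
        args = [rewrite(a) for a in args]
        # constant folding (partial evaluation)
        if op == 'add' and all(isinstance(a, int) for a in args):
            return args[0] + args[1]
        if op == 'mul' and all(isinstance(a, int) for a in args):
            return args[0] * args[1]
        # algebraic identity rules: x*1 -> x, x+0 -> x
        if op == 'mul' and 1 in args:
            return args[0] if args[1] == 1 else args[1]
        if op == 'add' and 0 in args:
            return args[0] if args[1] == 0 else args[1]
        return (op, *args)
    return e

print(rewrite(('add', ('mul', 'x', 1), ('mul', 2, 3))))
```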
+- MiscCompilerPasses.v: Defines the passes:
+ + EliminateDead (dead code elimination)
+ + Subst01 (substitute let-binders used 0 or 1 times)
+
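The Subst01 idea can be sketched like this (a hypothetical Python illustration, ignoring variable shadowing; the real pass is in MiscCompilerPasses.v). Note that inlining a binder used zero times also drops it, which is the EliminateDead behavior for let-binders.

```python
# Hypothetical Subst01 sketch: inline let-binders used 0 or 1 times.
# Expressions: ints, ('var', name), ('add', a, b), ('let', name, rhs, body).
def count_uses(e, name):
    if isinstance(e, tuple):
        if e[0] == 'var':
            return 1 if e[1] == name else 0
        return sum(count_uses(a, name) for a in e[1:])
    return 0

def subst(e, name, rhs):
    if isinstance(e, tuple):
        if e[0] == 'var':
            return rhs if e[1] == name else e
        # skip str slots (binder names); no shadowing in this toy language
        return (e[0], *[a if isinstance(a, str) else subst(a, name, rhs)
                        for a in e[1:]])
    return e

def subst01(e):
    if not isinstance(e, tuple) or e[0] == 'var':
        return e
    if e[0] == 'let':
        name, rhs, body = e[1], subst01(e[2]), subst01(e[3])
        if count_uses(body, name) <= 1:   # 0 uses: dead code; 1 use: inline
            return subst(body, name, rhs)
        return ('let', name, rhs, body)
    return (e[0], *[subst01(a) for a in e[1:]])

print(subst01(('let', 'x', ('add', 1, 2), ('add', ('var', 'x'), 5))))
```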
+- CStringification.v: conversion to C code as strings. (Depends on
+ AbstractInterpretation.v for ZRange utilities.) Defines the passes:
+ + ToString.ToFunctionLines
+ + ToString.ToFunctionString
+ + ToString.LinesToString
+
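In the spirit of ToString.ToFunctionLines, here is a hypothetical Python sketch of turning a tiny expression into a list of C source lines. The real pass handles casts, zranges, declarations, and much more; this only shows the expression-to-lines shape.

```python
# Hypothetical sketch of stringifying an expression into C source lines:
# each intermediate result gets a fresh temporary, emitted in order.
def to_c_lines(name, expr):
    lines = [f"static void {name}(uint64_t* out1, uint64_t arg1) {{"]
    tmp = 0
    def emit(e):
        nonlocal tmp
        if isinstance(e, tuple):
            op = {'add': '+', 'mul': '*'}[e[0]]
            a, b = emit(e[1]), emit(e[2])
            tmp += 1
            lines.append(f"  uint64_t x{tmp} = {a} {op} {b};")
            return f"x{tmp}"
        return str(e)  # constant or argument name
    result = emit(expr)
    lines.append(f"  *out1 = {result};")
    lines.append("}")
    return lines

# out1 = 2*arg1 + 1
for line in to_c_lines("double_plus_one", ('add', ('mul', 2, 'arg1'), 1)):
    print(line)
```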
+- CompilersTestCases.v: Various test cases to ensure everything is working
+
+- BoundsPipeline.v: Assemble the various compiler passes together into
+ a composed pipeline. It is the final interface for the compiler.
+ Also contains some tactics for applying the BoundsPipeline
+ correctness lemma.
+
+- COperationSpecifications.v: The specifications for the various
+ operations to be synthesized.
+
+- PushButtonSynthesis/ReificationCache.v: Defines the cache that holds
+ reified versions of operations, as well as the tactics that reify
+ and apply things from the cache.
+
+- PushButtonSynthesis.v: Reifies the various operations from
+  Arithmetic.v, defines the compositions of the BoundsPipeline with
+  these operations, proves that their interpretations satisfy the
+  specs from COperationSpecifications.v, and assembles the reified
+  post-bounds operations into synthesis targets. This is the file
+  that CLI.v depends on.
+
+- Toplevel2.v: Some not-quite-finished-but-kind-of-slow pipeline stuff
+ + all the stuff that uses compilers + arithmetic, together with more
+ examples. Also has semi-broken fancy-machine stuff. This should
+ probably be merged into Toplevel1.v when working on the pipeline.
+
+- SlowPrimeSynthesisExamples.v: Additional uses of the pipeline for
+ primes that are kind-of slow, which I don't want extraction blocking
+ on.
+
+- CLI.v: Sets up all of the language-independent parts of extraction; relies
+  on having a list of strings-or-error-messages for each pipeline, and on the
+  arguments to that pipeline, and builds a parser for those command-line
+  arguments.
+
+- StandaloneHaskellMain.v, StandaloneOCamlMain.v, ExtractionHaskell.v,
+ ExtractionOCaml.v: Extraction of pipeline to various languages
+
+
+Proofs files:
+For Language.v, there is a semi-arbitrary split between two files
+`LanguageInversion` and `LanguageWf`.
+- LanguageInversion.v:
+ + classifies equality of type codes and exprs
+ + type codes have decidable equality
+ + correctness of the various type-transport definitions
+ + correctness lemmas for the various `expr.invert_*` definitions
+ + correctness lemmas for the various `reify_*` definitions in Language.v
+ + inversion_type, which inverts equality of type codes
+ + type_beq_to_eq, which converts boolean equality of types to
+ Leibniz equality
+ + rewrite_type_transport_correct, which rewrites with the
+ correctness lemmas of the various type-transport definitions
+ + `type.invert_one e` which does case analysis on any inductive type
+ indexed over type codes, in a way that preserves information
+ about the type of `e`, and generally works even when the goal is
+ dependently typed over `e` and/or its type
+ + ident.invert, which does case-analysis on idents whose type has
+ structure (i.e., is not a var)
+ + ident.invert_match, which does case-analysis on idents appearing as
+ the discriminee of a `match` in the goal or in any hypothesis
+ + expr.invert, which does case-analysis on exprs whose type has
+ structure (i.e., is not a var)
+ + expr.invert_match, which does case-analysis on exprs appearing as
+ the discriminee of a `match` in the goal or in any hypothesis
+ + expr.invert_subst, which does case-analysis on exprs which show up
+ in hypotheses of the form `expr.invert_* _ = Some _`
+ + expr.inversion_expr, which inverts equalities of exprs
+
+- LanguageWf.v: Depends on LanguageInversion.v
+ Defines:
+ + expr.wf, expr.Wf, expr.wf3, expr.Wf3
+ + GeneralizeVar.Flat.wf
+ + expr.inversion_wf (and variants), which invert `wf` hypotheses
+ + expr.wf_t (and variants wf_unsafe_t and wf_safe_t) which make
+ progress on `wf` goals; `wf_safe_t` should never turn a provable
+ goal into an unprovable one, while `wf_unsafe_t` might.
+ + expr.interp_t (and variants), which should make progress on
+ equivalence-of-interp hypotheses and goals, but is not used much
+ (mainly because I forgot I had defined it)
+ + prove_Wf, which proves wf goals on concrete syntax trees in a more
+ optimized way than `repeat constructor`
+ Proves:
+ + funext → (type.eqv ↔ Logic.eq) (`eqv_iff_eq_of_funext`)
+ + type.related and type.eqv are PERs
+ + Proper instances for type.app_curried, type.and_for_each_lhs_of_arrow
+ + type.is_not_higher_order → Reflexive (type.and_for_each_lhs_of_arrow type.eqv)
+ + iff between type.related{,_hetero} and related of type.app_curried
+ + various properties of type.and{,b_bool}_for_each_lhs_of_arrow
+ + various properties of type.eqv and ident.{gen_,}interp
+ + various properties of ident.cast
+ + various properties of expr.wf (particularly of things defined in Language.v)
+ + interp and wf proofs for the passes to/from Flat
+
+- UnderLetsProofs.v: wf and interp lemmas for the various passes defined in UnderLets.v
+- MiscCompilerPassesProofs.v: wf and interp lemmas for the various passes defined in MiscCompilerPasses.v
+- GENERATEDIdentifiersWithoutTypesProofs.v: various lemmas about the definitions in GENERATEDIdentifiersWithoutTypes.v
+- AbstractInterpretationZRangeProofs.v: Proves correctness lemmas of the per-operation zrange-bounds-analysis functions
+- AbstractInterpretationWf.v: wf lemmas for the AbstractInterpretation pass
+- AbstractInterpretationProofs.v: interp lemmas for the
+ AbstractInterpretation pass, and also correctness lemmas that
+ combine Wf and interp
+- RewriterWf1.v: a semi-arbitrary collection of proofs and definitions
+ that are mostly shared between the wf proofs and the interp proofs.
+ Importantly, this file defines everything needed to state and prove
+ that specific rewrite rules are correct.
+- RewriterRulesGood.v: Proves that all specific rewriter rules are
+ good enough for wf proofs (as specified in RewriterWf1.v)
+- RewriterRulesInterpGood.v: Proves that all specific rewriter rules
+ are good enough for interp proofs (as specified in RewriterWf1.v)
+- RewriterWf2.v: Proves wf of the generic rewriter pass, taking in
+ specific-rewriter-rule-correctness as a hypothesis
+- RewriterInterpProofs1.v: Proves interp-correctness of the generic
+ rewriter pass, taking in specific-rewriter-rule-correctness as a
+ hypothesis
+- RewriterProofs.v: assembles the proofs from RewriterRulesGood.v,
+ RewriterRulesInterpGood.v, RewriterWf2.v, and
+ RewriterInterpProofs1.v into correctness lemmas for the overall
+ rewriter pass, specialized and pre-evaluated for specific rewrite
+ rules
+
+There are also proofs about elliptic curves, for example:
-Specifications live in [`src/Spec`](./src/Spec), e.g., [`src/Spec/Ed25519.v`](./src/Spec/Ed25519.v).
-Some selected cuve-level proofs live in
- [`src/Curves/Edwards/AffineProofs.v`](./src/Curves/Edwards/AffineProofs.v),
- [`src/Curves/Edwards/XYZT/Basic.v`](./src/Curves/Edwards/XYZT/Basic.v),
- [`src/Curves/Montgomery/AffineProofs.v`](./src/Curves/Montgomery/AffineProofs.v),
- [`src/Curves/Montgomery/XZProofs.v`](src/Curves/Montgomery/XZProofs.v).
-
-### Arithmetic Core
-
-Generic mathematical optimization strategies live in [`src/Arithmetic`](./src/Arithmetic).
-Good examples include
-- [`src/Arithmetic/Core.v`](src/Arithmetic/Core.v),
-- [`src/Arithmetic/Karatsuba.v`](src/Arithmetic/Karatsuba.v),
-- [`src/Arithmetic/Saturated/Core.v`](./src/Arithmetic/Saturated/Core.v),
-- [`src/Arithmetic/BarrettReduction/HAC.v`](./src/Arithmetic/BarrettReduction/HAC.v),
-- [`src/Arithmetic/MontgomeryReduction/Definition.v`](./src/Arithmetic/MontgomeryReduction/Definition.v),
-- [`src/Arithmetic/MontgomeryReduction/WordByWord/Abstract/Definition.v`](./src/Arithmetic/MontgomeryReduction/WordByWord/Abstract/Definition.v).
-
-### Demo of Synthesis
-
-The idea of the synthesis process is demoed in [`src/Demo.v`](./src/Demo.v).
-
-### Actual Synthesis
-
-- The curve-non-specific synthesis framework lives in [`src/Compilers/Z/Bounds/Pipeline`](./src/Compilers/Z/Bounds/Pipeline).
-- The API for the reflective simplification and bounds analysis pipeline is in [`src/Compilers/Z/Bounds/Pipeline.v`](./src/Compilers/Z/Bounds/Pipeline.v).
-- The definition of the pipeline and the correctness proof are in [`src/Compilers/Z/Bounds/Pipeline/Definition.v`](./src/Compilers/Z/Bounds/Pipeline/Definition.v).
-- Generic reflective syntax trees are defined in [`src/Compilers/Syntax.v`](./src/Compilers/Syntax.v); the specialization to the arithmetic operations on Z that we use occurs in [`src/Compilers/Z/Syntax.v`](./src/Compilers/Z/Syntax.v).
-- The trusted pretty-printer lives in [`src/Compilers/Z/CNotations.v`](./src/Compilers/Z/CNotations.v) and [`src/Compilers/Z/HexNotationConstants.v`](./src/Compilers/Z/HexNotationConstants.v), which were generated by the couple-hundred-line Python scripts in a comment near the top of each file.
-- The core of the bounds-analysis transformation lives in [`src/Compilers/Z/Bounds/Interpretation.v`](./src/Compilers/Z/Bounds/Interpretation.v); the corresponding proofs live in [`src/Compilers/Z/Bounds/InterpretationLemmas`](./src/Compilers/Z/Bounds/InterpretationLemmas), and the instantation of the bounds-analysis transformation with the core lives in [`src/Compilers/Z/Bounds/MapCastByDeBruijn.v`](./src/Compilers/Z/Bounds/MapCastByDeBruijn.v).
-
-
-The curve-specific synthesis framework lives in [`src/Specific/Framework`](./src/Specific/Framework).
-- The input `.json` files live in [`src/Specific/CurveParameters`](./src/Specific/CurveParameters), and the processor for those files lives in [`src/Specific/Framework/make_curve.py`](./src/Specific/Framework/make_curve.py).
-- The record of curve specific parameters is defined in [`src/Specific/Framework/RawCurveParameters.v`](./src/Specific/Framework/RawCurveParameters.v), and defaults are filled in [`src/Specific/Framework/CurveParameters.v`](src/Specific/Framework/CurveParameters.v).
-- The tactics that specialize the various arithmetic optimizations to specific curves live in [`src/Specific/Framework/ArithmeticSynthesis`](./src/Specific/Framework/ArithmeticSynthesis), e.g., [`src/Specific/Framework/ArithmeticSynthesis/Montgomery.v`](./src/Specific/Framework/ArithmeticSynthesis/Montgomery.v).
-- The script at [`src/Specific/Framework/ArithmeticSynthesis/remake_packages.py`](./src/Specific/Framework/ArithmeticSynthesis/remake_packages.py) processes these files to add relevant boilerplate, and gets run after significantly changing an optimization, or after adding a new optimization.
-- The synthesis tactics are assembled in [`src/Specific/Framework/SynthesisFramework.v`](./src/Specific/Framework/SynthesisFramework.v).
-Note that magic<a name="magic-ref">[¹](#magic)</a> is used to create identifiers refering to the curve-specific parameters, from tactics, so that they can be used in other curve-specific files.
-
-### Example Output
-
-The c output lives in the various subfolders of [`src/Specific`](./src/Specific). For example, C-like code for X25519 multiplication on 64-bit machines lives in [`src/Specific/X25519/C64/femulDisplay.log`](./src/Specific/X25519/C64/femulDisplay.log), which is generated from [`src/Specific/X25519/C64/femulDisplay.v`](./src/Specific/X25519/C64/femulDisplay.v), which in turn uses synthesis run in [`src/Specific/X25519/C64/femul.v`](./src/Specific/X25519/C64/femul.v). The `c` target turns this into `src/Specific/X25519/C64/femul.c`.
-
-## Footnotes
-<a name="magic">[¹](#magic-ref)</a> This magic comes in the form of calls to `transparent_abstract`, packaged up in some tactics in [`src/Util/Tactics/CacheTerm.v`](./src/Util/Tactics/CacheTerm.v).