From 3ca227f1137e6a3b65bc33f5689e1c230d591595 Mon Sep 17 00:00:00 2001
From: Andres Erbsen
Date: Tue, 8 Jan 2019 04:21:38 -0500
Subject: remove old pipeline

---
 README.md | 339 +++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 224 insertions(+), 115 deletions(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index 0215e2e2c..5417ee120 100644
--- a/README.md
+++ b/README.md
@@ -1,59 +1,45 @@
 Fiat-Crypto: Synthesizing Correct-by-Construction Code for Cryptographic Primitives
 =====

-[![Build Status](https://api.travis-ci.org/mit-plv/fiat-crypto.png?branch=master)](https://travis-ci.org/mit-plv/fiat-crypto)
-
-See [src/README.md](src/README.md).
-
-OLD PIPELINE README BELOW
-=========================
-
-Build Requirements
+Building
 -----

-This repository requires:
-- To build the proofs and almost-C-like outputs: [Coq 8.7](https://github.com/coq/coq/releases/tag/V8.7.2) or [Coq 8.8](https://github.com/coq/coq/releases/tag/V8.8.0) (tested with 8.7.2, 8.8.0)
-If you checked out this repository via `git`, you should run
+This repository requires Coq 8.8 or later. 8.7 may work, but we don't use it ourselves.
+
+Git submodules are used for some dependencies. If you did not clone with `--recursive`, run

    git submodule update --init --recursive

To build (if your COQPATH variable is empty):

    make

To build:

-    export COQPATH="$(pwd)/coqprime${COQPATH:+:}$COQPATH"
-    make
-
-You may get non-fatal errors such as `make: execvp: /bin/sh: Argument list too long`; this is an artifact of the [old pipeline](#old-pipeline) and is a [bug in Coq](https://github.com/coq/coq/issues/7584) that will prevent `make install` from succeeding, but will not impact the build itself.
+    export COQPATH="$(pwd)/coqprime${COQPATH:+:}$COQPATH"
+    make

-New Pipeline
+Usage
-----

-The new pipeline (nearly finished) generates binaries that take in arguments on the command line and print out C code on stdout.
-A collection of C files can be made with
+The Coq development builds binary compilers that generate code using some implementation strategy.
+The parameters (modulus, hardware multiplication input bitwidth, etc.) are specified on the command line of the compiler.
+The generated C code is written to standard output.
+
+A collection of C files for popular curves can be made with

    make c-files

The C files will appear in the top-level directory.

-The binaries generating these C files can be made with
+Just the compilers generating these C files can be made with

    make standalone

or `make standalone-haskell` or `make standalone-ocaml` for binaries generated with just one compiler.
+The binaries are located in `src/ExtractionHaskell/` and `src/ExtractionOCaml/`, respectively.

-The binaries are located in
-
-    src/Experiments/NewPipeline/ExtractionOCaml/
-
-or
-
-    src/Experiments/NewPipeline/ExtractionHaskell/
-
-The binaries are:
+There is a separate compiler binary for each implementation strategy:

- `saturated_solinas`
- `unsaturated_solinas`

@@ -71,95 +57,218 @@
Here are some examples of ways to invoke the binaries (from the directories that contain them):

    ./word_by_word_montgomery 'p256' '2^256' '2^224,1;2^192,-1;2^96,-1;1,1' '32' > p256_32.c
    ./word_by_word_montgomery 'p256' '2^256' '2^224,1;2^192,-1;2^96,-1;1,1' '64' > p256_64.c

-Old Pipeline
-----
-
-To build a representative subset, instead of `make`, run
-
-    make selected-specific-display non-specific
-
-The core library (target `nonautogenerated-specific-display non-specific`) is expected to build within about one hour (in serial, on a particularly fast machine), within about 3.5--4 GB of RAM, plus an additional 1-5 minutes to run `coqdep` on all of our files.
-Of the additional curves, 90% finish within 5 minutes per file, and 99% finish within 15 minutes per file (though some take upwards of 50 GB of RAM). These 99% of curves together take about 60 hours.
-
-- To build the C outputs: Python (2 or 3)
-
-To build:
-
-    make c
-
-or, for a representative subset,
-
-    make selected-c
-
-- To build and run the binaries, gcc (7.1.2; or 7.1.1 with `-fno-peephole2`) or clang of a new enough version to support the appropriate compiler intrinsics.
-
-To build and run:
-
-    make test bench
-
-or, for a representative subset,
+Reading The Code
+----------------
-
-    make selected-test selected-bench
-
-Exploring the code
------
-
-### Push-button synthesis
-
-To add a new prime, add a new line to [`primes.txt`](https://github.com/mit-plv/fiat-crypto/blob/master/primes.txt) with your prime. Then run
-
-    ./generate_parameters.py; ./src/Specific/CurveParameters/remake_curves.sh
-
-You will see a bunch of lines starting with `git add`; any `.v` files that show up in these lines must be added to `_CoqProject`. (If you are working in the git repository, you may simply run the `git add` lines, and then run `make update-_CoqProject`.) Note that the directory structure involves a textual representation of your prime, call it `${REPRESENTATION_OF_YOUR_PRIME}`. To build, you can run
-
-    make src/Specific/{montgomery,solinas}{32,64}_${REPRESENTATION_OF_YOUR_PRIME}/fibe
-
-This will build all the necessary `.vo` and `.c` files, as well as the `fibe` binary, which is automatically run when you `make bench` or `make test`.

+### Demo of Synthesis

-### Curve Correctness Theorems

+The idea of the synthesis process is demoed in [`src/Demo.v`](./src/Demo.v).
+We strongly recommend reading this before studying the full-scale system.
### Actual Synthesis Pipeline

The ordering of files (eliding `*Proofs.v` files) is:

```
 Language.v ←──────────────────────────────────────────────────┬───────────────────────────────┐
 ↗ ↖ │ │
 ↗ ↖ │ │
 UnderLets.v GENERATEDIdentifiersWithoutTypes.v MiscCompilerPasses.v PushButtonSynthesis/ReificationCache.v
 ↑ ↖ ↗ ↑ ↑
AbstractInterpretation.v Rewriter.v │ │
 ↑ ↑ ┌────────────────────────────────────────────────────┘ │
CStringification.v │ │ Arithmetic.v │
 ↑ ┌───────────────────┴─┘ ↑ │
BoundsPipeline.v COperationSpecifications.v │
 ↑ ↑ │
 │ ┌────────────────────────────────────────┴────────────────────────────────────────────────────────────────┘
PushButtonSynthesis.v ←── Toplevel2.v ←───────────┐
 ↑ │
CLI.v SlowPrimeSynthesisExamples.v
↑ ↑
│ └────────────────────────────┐
StandaloneHaskellMain.v StandaloneOCamlMain.v
 ↑ ↑
ExtractionHaskell.v ExtractionOCaml.v
```

The files contain:

- Arithmetic.v: All of the high-level field arithmetic stuff

- Language.v:
  + PHOAS
  + reification
  + denotation/interpretation
  + utilities for inverting PHOAS exprs
  + default/dummy values of PHOAS exprs
  + default instantiation of generic PHOAS types
  + gallina reification of ground terms
  + Flat/indexed syntax trees, and conversions to and from PHOAS
  Defines the passes:
  + ToFlat
  + FromFlat
  + GeneralizeVar

- UnderLets.v: the UnderLets monad, a pass that does substitution of var-like
  things, and a pass that inserts let-binders in the next-to-last line of code,
  substituting away var-like things (this is used to ensure that when we output
  C code, aliasing the input and the output arrays doesn't cause issues).
  Defines the passes:
  + SubstVarFstSndPairOppCast

- AbstractInterpretation.v: type-code-based ZRange definitions, abstract
  interpretation of identifiers (which does let-lifting, for historical reasons,
  and the dependency on UnderLets should probably be removed). Defines the
  passes:
  + PartialEvaluateWithBounds
  + PartialEvaluateWithListInfoFromBounds
  + CheckPartialEvaluateWithBounds

- GENERATEDIdentifiersWithoutTypes.v: generated by a Python script which is
  included in a comment in the file; this is an untyped version of identifiers
  for the rewriter

- Rewriter.v: rewrite rules, rewriting. Defines the passes:
  + Rewrite
  + RewriteToFancy
  + PartialEvaluate (which is just a synonym for Rewrite)

- MiscCompilerPasses.v: Defines the passes:
  + EliminateDead (dead code elimination)
  + Subst01 (substitute let-binders used 0 or 1 times)

- CStringification.v: conversion to C code as strings. (Depends on
  AbstractInterpretation.v for ZRange utilities.) Defines the passes:
  + ToString.ToFunctionLines
  + ToString.ToFunctionString
  + ToString.LinesToString

- CompilersTestCases.v: Various test cases to ensure everything is working

- BoundsPipeline.v: Assembles the various compiler passes together into
  a composed pipeline. It is the final interface for the compiler.
  Also contains some tactics for applying the BoundsPipeline
  correctness lemma.

- COperationSpecifications.v: The specifications for the various
  operations to be synthesized.

- PushButtonSynthesis/ReificationCache.v: Defines the cache that holds
  reified versions of operations, as well as the tactics that reify
  and apply things from the cache.

- PushButtonSynthesis.v: Reifies the various operations from
  Arithmetic.v, defines the compositions of the BoundsPipeline with
  these operations, proves that their interpretations satisfy the
  specs from COperationSpecifications.v, and assembles the reified
  post-bounds operations into synthesis targets.
  This is the file that CLI.v depends on.

- Toplevel2.v: Some not-quite-finished-but-kind-of-slow pipeline stuff
  + all the stuff that uses compilers + arithmetic, together with more
  examples. Also has semi-broken fancy-machine stuff. This should
  probably be merged into Toplevel1.v when working on the pipeline.

- SlowPrimeSynthesisExamples.v: Additional uses of the pipeline for
  primes that are kind-of slow, which I don't want extraction blocking
  on.

- CLI.v: Sets up all of the language-independent parts of extraction; it
  relies on having a list of strings-or-error-messages for each pipeline
  and on the arguments to that pipeline, and builds a command-line
  argument parser from them.

- StandaloneHaskellMain.v, StandaloneOCamlMain.v, ExtractionHaskell.v,
  ExtractionOCaml.v: Extraction of the pipeline to various languages


Proof files:
For Language.v, there is a semi-arbitrary split between two files,
`LanguageInversion` and `LanguageWf`.

- LanguageInversion.v:
  + classifies equality of type codes and exprs
  + type codes have decidable equality
  + correctness of the various type-transport definitions
  + correctness lemmas for the various `expr.invert_*` definitions
  + correctness lemmas for the various `reify_*` definitions in Language.v
  + inversion_type, which inverts equality of type codes
  + type_beq_to_eq, which converts boolean equality of types to
    Leibniz equality
  + rewrite_type_transport_correct, which rewrites with the
    correctness lemmas of the various type-transport definitions
  + `type.invert_one e`, which does case analysis on any inductive type
    indexed over type codes, in a way that preserves information
    about the type of `e`, and generally works even when the goal is
    dependently typed over `e` and/or its type
  + ident.invert, which does case-analysis on idents whose type has
    structure (i.e., is not a var)
  + ident.invert_match, which does case-analysis on idents appearing as
    the discriminee of a
    `match` in the goal or in any hypothesis
  + expr.invert, which does case-analysis on exprs whose type has
    structure (i.e., is not a var)
  + expr.invert_match, which does case-analysis on exprs appearing as
    the discriminee of a `match` in the goal or in any hypothesis
  + expr.invert_subst, which does case-analysis on exprs which show up
    in hypotheses of the form `expr.invert_* _ = Some _`
  + expr.inversion_expr, which inverts equalities of exprs

- LanguageWf.v: Depends on LanguageInversion.v
  Defines:
  + expr.wf, expr.Wf, expr.wf3, expr.Wf3
  + GeneralizeVar.Flat.wf
  + expr.inversion_wf (and variants), which invert `wf` hypotheses
  + expr.wf_t (and variants wf_unsafe_t and wf_safe_t), which make
    progress on `wf` goals; `wf_safe_t` should never turn a provable
    goal into an unprovable one, while `wf_unsafe_t` might.
  + expr.interp_t (and variants), which should make progress on
    equivalence-of-interp hypotheses and goals, but is not used much
    (mainly because I forgot I had defined it)
  + prove_Wf, which proves wf goals on concrete syntax trees in a more
    optimized way than `repeat constructor`
  Proves:
  + funext → (type.eqv ↔ Logic.eq) (`eqv_iff_eq_of_funext`)
  + type.related and type.eqv are PERs
  + Proper instances for type.app_curried, type.and_for_each_lhs_of_arrow
  + type.is_not_higher_order → Reflexive (type.and_for_each_lhs_of_arrow type.eqv)
  + iff between type.related{,_hetero} and related of type.app_curried
  + various properties of type.and{,b_bool}_for_each_lhs_of_arrow
  + various properties of type.eqv and ident.{gen_,}interp
  + various properties of ident.cast
  + various properties of expr.wf (particularly of things defined in Language.v)
  + interp and wf proofs for the passes to/from Flat

- UnderLetsProofs.v: wf and interp lemmas for the various passes defined in UnderLets.v
- MiscCompilerPassesProofs.v: wf and interp lemmas for the various passes defined in MiscCompilerPasses.v
- GENERATEDIdentifiersWithoutTypesProofs.v: various lemmas about the definitions in GENERATEDIdentifiersWithoutTypes.v
- AbstractInterpretationZRangeProofs.v: Proves correctness lemmas of the per-operation zrange-bounds-analysis functions
- AbstractInterpretationWf.v: wf lemmas for the AbstractInterpretation pass
- AbstractInterpretationProofs.v: interp lemmas for the
  AbstractInterpretation pass, and also correctness lemmas that
  combine Wf and interp
- RewriterWf1.v: a semi-arbitrary collection of proofs and definitions
  that are mostly shared between the wf proofs and the interp proofs.
  Importantly, this file defines everything needed to state and prove
  that specific rewrite rules are correct.
- RewriterRulesGood.v: Proves that all specific rewriter rules are
  good enough for wf proofs (as specified in RewriterWf1.v)
- RewriterRulesInterpGood.v: Proves that all specific rewriter rules
  are good enough for interp proofs (as specified in RewriterWf1.v)
- RewriterWf2.v: Proves wf of the generic rewriter pass, taking in
  specific-rewriter-rule-correctness as a hypothesis
- RewriterInterpProofs1.v: Proves interp-correctness of the generic
  rewriter pass, taking in specific-rewriter-rule-correctness as a
  hypothesis
- RewriterProofs.v: assembles the proofs from RewriterRulesGood.v,
  RewriterRulesInterpGood.v, RewriterWf2.v, and
  RewriterInterpProofs1.v into correctness lemmas for the overall
  rewriter pass, specialized and pre-evaluated for specific rewrite
  rules

There are also proofs about elliptic curves, for example:

-Specifications live in [`src/Spec`](./src/Spec), e.g., [`src/Spec/Ed25519.v`](./src/Spec/Ed25519.v).
-Some selected curve-level proofs live in
- [`src/Curves/Edwards/AffineProofs.v`](./src/Curves/Edwards/AffineProofs.v),
- [`src/Curves/Edwards/XYZT/Basic.v`](./src/Curves/Edwards/XYZT/Basic.v),
- [`src/Curves/Montgomery/AffineProofs.v`](./src/Curves/Montgomery/AffineProofs.v),
- [`src/Curves/Montgomery/XZProofs.v`](src/Curves/Montgomery/XZProofs.v).
-
-### Arithmetic Core
-
-Generic mathematical optimization strategies live in [`src/Arithmetic`](./src/Arithmetic).
-Good examples include
-- [`src/Arithmetic/Core.v`](src/Arithmetic/Core.v),
-- [`src/Arithmetic/Karatsuba.v`](src/Arithmetic/Karatsuba.v),
-- [`src/Arithmetic/Saturated/Core.v`](./src/Arithmetic/Saturated/Core.v),
-- [`src/Arithmetic/BarrettReduction/HAC.v`](./src/Arithmetic/BarrettReduction/HAC.v),
-- [`src/Arithmetic/MontgomeryReduction/Definition.v`](./src/Arithmetic/MontgomeryReduction/Definition.v),
-- [`src/Arithmetic/MontgomeryReduction/WordByWord/Abstract/Definition.v`](./src/Arithmetic/MontgomeryReduction/WordByWord/Abstract/Definition.v).
-
-### Demo of Synthesis
-
-The idea of the synthesis process is demoed in [`src/Demo.v`](./src/Demo.v).
-
-### Actual Synthesis
-
-- The curve-non-specific synthesis framework lives in [`src/Compilers/Z/Bounds/Pipeline`](./src/Compilers/Z/Bounds/Pipeline).
-- The API for the reflective simplification and bounds analysis pipeline is in [`src/Compilers/Z/Bounds/Pipeline.v`](./src/Compilers/Z/Bounds/Pipeline.v).
-- The definition of the pipeline and the correctness proof are in [`src/Compilers/Z/Bounds/Pipeline/Definition.v`](./src/Compilers/Z/Bounds/Pipeline/Definition.v).
-- Generic reflective syntax trees are defined in [`src/Compilers/Syntax.v`](./src/Compilers/Syntax.v); the specialization to the arithmetic operations on Z that we use occurs in [`src/Compilers/Z/Syntax.v`](./src/Compilers/Z/Syntax.v).
-- The trusted pretty-printer lives in [`src/Compilers/Z/CNotations.v`](./src/Compilers/Z/CNotations.v) and [`src/Compilers/Z/HexNotationConstants.v`](./src/Compilers/Z/HexNotationConstants.v), which were generated by the couple-hundred-line Python scripts in a comment near the top of each file.
-- The core of the bounds-analysis transformation lives in [`src/Compilers/Z/Bounds/Interpretation.v`](./src/Compilers/Z/Bounds/Interpretation.v); the corresponding proofs live in [`src/Compilers/Z/Bounds/InterpretationLemmas`](./src/Compilers/Z/Bounds/InterpretationLemmas), and the instantiation of the bounds-analysis transformation with the core lives in [`src/Compilers/Z/Bounds/MapCastByDeBruijn.v`](./src/Compilers/Z/Bounds/MapCastByDeBruijn.v).
-
-The curve-specific synthesis framework lives in [`src/Specific/Framework`](./src/Specific/Framework).
-- The input `.json` files live in [`src/Specific/CurveParameters`](./src/Specific/CurveParameters), and the processor for those files lives in [`src/Specific/Framework/make_curve.py`](./src/Specific/Framework/make_curve.py).
-- The record of curve-specific parameters is defined in [`src/Specific/Framework/RawCurveParameters.v`](./src/Specific/Framework/RawCurveParameters.v), and defaults are filled in [`src/Specific/Framework/CurveParameters.v`](src/Specific/Framework/CurveParameters.v).
-- The tactics that specialize the various arithmetic optimizations to specific curves live in [`src/Specific/Framework/ArithmeticSynthesis`](./src/Specific/Framework/ArithmeticSynthesis), e.g., [`src/Specific/Framework/ArithmeticSynthesis/Montgomery.v`](./src/Specific/Framework/ArithmeticSynthesis/Montgomery.v).
-- The script at [`src/Specific/Framework/ArithmeticSynthesis/remake_packages.py`](./src/Specific/Framework/ArithmeticSynthesis/remake_packages.py) processes these files to add relevant boilerplate, and gets run after significantly changing an optimization, or after adding a new optimization.
-- The synthesis tactics are assembled in [`src/Specific/Framework/SynthesisFramework.v`](./src/Specific/Framework/SynthesisFramework.v).
-Note that magic[¹](#magic) is used to create identifiers referring to the curve-specific parameters, from tactics, so that they can be used in other curve-specific files.
-
-### Example Output
-
-The C output lives in the various subfolders of [`src/Specific`](./src/Specific). For example, C-like code for X25519 multiplication on 64-bit machines lives in [`src/Specific/X25519/C64/femulDisplay.log`](./src/Specific/X25519/C64/femulDisplay.log), which is generated from [`src/Specific/X25519/C64/femulDisplay.v`](./src/Specific/X25519/C64/femulDisplay.v), which in turn uses synthesis run in [`src/Specific/X25519/C64/femul.v`](./src/Specific/X25519/C64/femul.v). The `c` target turns this into `src/Specific/X25519/C64/femul.c`.
-
-## Footnotes
-[¹](#magic-ref) This magic comes in the form of calls to `transparent_abstract`, packaged up in some tactics in [`src/Util/Tactics/CacheTerm.v`](./src/Util/Tactics/CacheTerm.v).