|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After some debugging with Andres, the issue seems to be that when we
have multiple TAPs, the first reduction will leave some of the partial
products past the end of the table, because it "folds over" the partial
products table at the location of each TAP. In the worst case, if we
have a TAP at 1 and a TAP at the second-highest limb, we will have to
reduce as many times as there are limbs. To prevent quadratic blowup,
we short-circuit the repeat_reduce calculation whenever the list of high
terms is nil. We deliberately do not support primes with a TAP in the
highest limb.
After | File Name | Before || Change | % Change
-----------------------------------------------------------------------------------------------------------
17m49.68s | Total | 17m34.83s || +0m14.84s | +1.40%
-----------------------------------------------------------------------------------------------------------
2m22.88s | PushButtonSynthesis/WordByWordMontgomeryReificationCache.vo | 2m21.60s || +0m01.28s | +0.90%
2m19.62s | Arithmetic.vo | 2m17.66s || +0m01.96s | +1.42%
0m34.02s | PushButtonSynthesis/UnsaturatedSolinasReificationCache.vo | 0m32.24s || +0m01.78s | +5.52%
0m32.06s | ExtractionHaskell/unsaturated_solinas | 0m30.99s || +0m01.07s | +3.45%
0m19.95s | SlowPrimeSynthesisExamples.vo | 0m18.40s || +0m01.55s | +8.42%
3m12.02s | p384_32.c | 3m12.03s || -0m00.00s | -0.00%
1m02.14s | Fancy/Montgomery256.vo | 1m02.09s || +0m00.04s | +0.08%
0m51.24s | Fancy/Barrett256.vo | 0m51.28s || -0m00.03s | -0.07%
0m45.72s | ExtractionHaskell/word_by_word_montgomery | 0m45.05s || +0m00.67s | +1.48%
0m38.67s | p521_32.c | 0m38.24s || +0m00.42s | +1.12%
0m31.94s | p521_64.c | 0m31.38s || +0m00.56s | +1.78%
0m24.85s | ExtractionHaskell/saturated_solinas | 0m24.45s || +0m00.40s | +1.63%
0m23.53s | PushButtonSynthesis/MontgomeryReductionReificationCache.vo | 0m22.79s || +0m00.74s | +3.24%
0m21.96s | PushButtonSynthesis/WordByWordMontgomery.vo | 0m21.88s || +0m00.08s | +0.36%
0m19.60s | PushButtonSynthesis/UnsaturatedSolinas.vo | 0m19.56s || +0m00.04s | +0.20%
0m17.76s | ExtractionOCaml/word_by_word_montgomery | 0m17.26s || +0m00.50s | +2.89%
0m14.90s | p448_solinas_64.c | 0m14.91s || -0m00.00s | -0.06%
0m13.55s | secp256k1_32.c | 0m13.39s || +0m00.16s | +1.19%
0m13.31s | p256_32.c | 0m13.06s || +0m00.25s | +1.91%
0m13.30s | PushButtonSynthesis/BarrettReductionReificationCache.vo | 0m12.44s || +0m00.86s | +6.91%
0m11.47s | p484_64.c | 0m11.31s || +0m00.16s | +1.41%
0m11.43s | ExtractionOCaml/unsaturated_solinas | 0m11.15s || +0m00.27s | +2.51%
0m10.55s | ExtractionOCaml/word_by_word_montgomery.ml | 0m10.66s || -0m00.10s | -1.03%
0m08.42s | ExtractionOCaml/saturated_solinas | 0m08.24s || +0m00.17s | +2.18%
0m07.07s | ExtractionOCaml/unsaturated_solinas.ml | 0m07.04s || +0m00.03s | +0.42%
0m06.66s | COperationSpecifications.vo | 0m06.31s || +0m00.35s | +5.54%
0m06.50s | ExtractionHaskell/word_by_word_montgomery.hs | 0m06.39s || +0m00.11s | +1.72%
0m06.19s | p224_32.c | 0m05.99s || +0m00.20s | +3.33%
0m05.35s | p384_64.c | 0m05.24s || +0m00.10s | +2.09%
0m05.26s | ExtractionOCaml/saturated_solinas.ml | 0m05.24s || +0m00.01s | +0.38%
0m05.24s | ExtractionHaskell/unsaturated_solinas.hs | 0m05.17s || +0m00.07s | +1.35%
0m04.23s | ExtractionHaskell/saturated_solinas.hs | 0m04.12s || +0m00.11s | +2.66%
0m04.14s | PushButtonSynthesis/SaturatedSolinasReificationCache.vo | 0m03.89s || +0m00.24s | +6.42%
0m03.68s | PushButtonSynthesis/Primitives.vo | 0m03.62s || +0m00.06s | +1.65%
0m03.58s | PushButtonSynthesis/SmallExamples.vo | 0m03.30s || +0m00.28s | +8.48%
0m03.49s | PushButtonSynthesis/SaturatedSolinas.vo | 0m03.25s || +0m00.24s | +7.38%
0m02.38s | curve25519_32.c | 0m02.28s || +0m00.10s | +4.38%
0m01.66s | PushButtonSynthesis/BarrettReduction.vo | 0m01.39s || +0m00.27s | +19.42%
0m01.47s | curve25519_64.c | 0m01.42s || +0m00.05s | +3.52%
0m01.34s | CLI.vo | 0m01.45s || -0m00.10s | -7.58%
0m01.32s | PushButtonSynthesis/MontgomeryReduction.vo | 0m01.30s || +0m00.02s | +1.53%
0m01.08s | secp256k1_64.c | 0m01.04s || +0m00.04s | +3.84%
0m01.07s | StandaloneOCamlMain.vo | 0m01.10s || -0m00.03s | -2.72%
0m01.04s | StandaloneHaskellMain.vo | 0m01.12s || -0m00.08s | -7.14%
0m01.04s | p224_64.c | 0m01.14s || -0m00.09s | -8.77%
0m01.00s | p256_64.c | 0m00.98s || +0m00.02s | +2.04%
|
|
We port the computation of the carry chain from generate_parameters.py
to Coq, for unsaturated solinas.
Note that while we now bounds-check p448, we do not yet support
goldilocks nor karatsuba.
However, there is still an issue with the synthesized p448 code, which
is that on 64-bit, it tries to use 256-bit and 512-bit integers. I'm
not sure what's up with that.
Partial progress towards #507
After | File Name | Before || Change | % Change
------------------------------------------------------------------------------------------
8m51.64s | Total | 8m34.60s || +0m17.04s | +3.31%
------------------------------------------------------------------------------------------
0m15.16s | p448_solinas_64.c | N/A || +0m15.16s | ∞
3m09.12s | p384_32.c | 3m09.36s || -0m00.24s | -0.12%
0m44.99s | ExtractionHaskell/word_by_word_montgomery | 0m44.91s || +0m00.08s | +0.17%
0m39.58s | p521_32.c | 0m39.22s || +0m00.35s | +0.91%
0m32.54s | p521_64.c | 0m32.49s || +0m00.04s | +0.15%
0m30.87s | ExtractionHaskell/unsaturated_solinas | 0m31.04s || -0m00.16s | -0.54%
0m24.31s | ExtractionHaskell/saturated_solinas | 0m24.32s || -0m00.01s | -0.04%
0m18.62s | PushButtonSynthesis/UnsaturatedSolinas.vo | 0m17.90s || +0m00.72s | +4.02%
0m17.53s | ExtractionOCaml/word_by_word_montgomery | 0m17.44s || +0m00.08s | +0.51%
0m13.36s | secp256k1_32.c | 0m13.58s || -0m00.22s | -1.62%
0m13.21s | p256_32.c | 0m13.15s || +0m00.06s | +0.45%
0m11.47s | p484_64.c | 0m11.39s || +0m00.08s | +0.70%
0m11.27s | ExtractionOCaml/unsaturated_solinas | 0m10.71s || +0m00.55s | +5.22%
0m10.48s | ExtractionOCaml/word_by_word_montgomery.ml | 0m10.34s || +0m00.14s | +1.35%
0m07.97s | ExtractionOCaml/saturated_solinas | 0m07.98s || -0m00.01s | -0.12%
0m07.05s | ExtractionOCaml/unsaturated_solinas.ml | 0m06.98s || +0m00.06s | +1.00%
0m06.58s | ExtractionHaskell/word_by_word_montgomery.hs | 0m06.48s || +0m00.09s | +1.54%
0m06.09s | p224_32.c | 0m06.04s || +0m00.04s | +0.82%
0m05.24s | p384_64.c | 0m05.34s || -0m00.09s | -1.87%
0m05.13s | ExtractionOCaml/saturated_solinas.ml | 0m05.19s || -0m00.06s | -1.15%
0m05.00s | ExtractionHaskell/unsaturated_solinas.hs | 0m04.98s || +0m00.01s | +0.40%
0m04.14s | ExtractionHaskell/saturated_solinas.hs | 0m04.04s || +0m00.09s | +2.47%
0m02.22s | curve25519_32.c | 0m02.22s || +0m00.00s | +0.00%
0m01.49s | curve25519_64.c | 0m01.53s || -0m00.04s | -2.61%
0m01.46s | CLI.vo | 0m01.44s || +0m00.02s | +1.38%
0m01.29s | SlowPrimeSynthesisExamples.vo | 0m01.24s || +0m00.05s | +4.03%
0m01.08s | p256_64.c | 0m01.00s || +0m00.08s | +8.00%
0m01.06s | StandaloneOCamlMain.vo | 0m00.96s || +0m00.10s | +10.41%
0m01.06s | secp256k1_64.c | 0m01.17s || -0m00.10s | -9.40%
0m01.02s | p224_64.c | 0m01.08s || -0m00.06s | -5.55%
0m00.99s | StandaloneHaskellMain.vo | 0m01.08s || -0m00.09s | -8.33%
0m00.27s | TAPSort.vo | N/A || +0m00.27s | ∞
|