| Commit message (Collapse) | Author | Age |
The major change is porting everything to Z and using Z.div_mod_to_quot_rem,
which is a handy sledgehammer. Z is also a nice simplification: dealing
with subtraction is tidier, though I now have 0 <= x goals everywhere as
a result.
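To illustrate the idea behind a tactic like Z.div_mod_to_quot_rem, here is a minimal Python sketch (not Coq, and not the tactic's actual implementation). It assumes the tactic replaces each `a / b` and `a mod b` with fresh integers q and r plus the facts `a = b*q + r` and a bound on r, so that what remains is a linear-arithmetic goal. The function names are hypothetical. The second function shows why subtraction is tidier over Z than over natural numbers, where it truncates at zero.

```python
def quot_rem_facts(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) for floor division of a by a nonzero b.

    The asserts spell out the two facts that, by assumption, stand in
    for `a / b` and `a mod b` once the division is abstracted away.
    """
    if b == 0:
        raise ValueError("this sketch only covers a nonzero divisor")
    # Python's floor division; assumed here to agree with Z's / and mod for b != 0.
    q, r = divmod(a, b)
    assert a == b * q + r
    # The remainder takes the sign of the divisor.
    assert (0 <= r < b) if b > 0 else (b < r <= 0)
    return q, r


def nat_sub(a: int, b: int) -> int:
    """Truncated subtraction, as on natural numbers: never below zero."""
    return max(a - b, 0)


if __name__ == "__main__":
    print(quot_rem_facts(-7, 2))   # (-4, 1)
    print(nat_sub(3, 5), 3 - 5)    # 0 -2
```

Over the integers, `a - b + b == a` always holds; with truncated subtraction it holds only when `b <= a`, which is the kind of case split the port to Z avoids, at the cost of carrying explicit `0 <= x` side conditions.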
|
This variant comes from
http://www.ridiculousfish.com/blog/posts/labor-of-division-episode-i.html.
It was useful for
https://boringssl-review.googlesource.com/#/c/boringssl/+/25887.
TODO - Talk to Andres to figure out all the ways this could be done more
cleanly. It was originally a standalone file.
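The commit message does not say which variant was proven. As a rough sketch only, assuming it is the round-up "magic number" method for unsigned division by a fixed divisor, the technique can be written in Python as follows. The names `magic` and `div_by_mul` and the default word size are hypothetical and do not come from the repository.

```python
def magic(d: int, bits: int = 32) -> tuple[int, int]:
    """Precompute (m, shift) so that n // d == (n * m) >> shift
    for every n with 0 <= n < 2**bits."""
    if not 0 < d < (1 << bits):
        raise ValueError("divisor must satisfy 0 < d < 2**bits")
    k = (d - 1).bit_length()      # ceil(log2(d)); 0 when d == 1
    shift = bits + k
    m = -(-(1 << shift) // d)     # ceil(2**shift / d), may need bits + 1 bits
    return m, shift


def div_by_mul(n: int, m: int, shift: int) -> int:
    """Divide by the fixed divisor using one multiply and one shift."""
    return (n * m) >> shift


if __name__ == "__main__":
    m, shift = magic(7)
    n = 2**32 - 1
    print(div_by_mul(n, m, shift) == n // 7)
```

Python integers are unbounded, so the sketch ignores the detail that m can be one bit wider than the machine word; handling that extra bit is where fixed-width implementations need care.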
|
One of them is Admitted.
|
It now lives in Arithmetic.Core.B.Positional, where it belongs, rather
than in Specific/.../HelperTactics.
Andres notes that we probably don't need this at all, and could instead
make chained_carries reduce after every index (and the spurious
reductions should be no-ops). I didn't want to bother verifying this at
the moment, so I left it as-is.
|
After | File Name | Before || Change | % Change
--------------------------------------------------------------------------------------------------------
10m04.41s | Total | 10m03.68s || +0m00.73s | +0.12%
--------------------------------------------------------------------------------------------------------
3m04.06s | Specific/X25519/C64/ladderstep | 3m04.00s || +0m00.06s | +0.03%
1m48.23s | Specific/NISTP256/AMD64/femul | 1m48.30s || -0m00.07s | -0.06%
0m38.59s | Arithmetic/Karatsuba | 0m38.51s || +0m00.08s | +0.20%
0m22.50s | Specific/X25519/C64/femul | 0m23.15s || -0m00.64s | -2.80%
0m21.79s | Specific/NISTP256/AMD64/fesub | 0m21.76s || +0m00.02s | +0.13%
0m19.78s | Specific/NISTP256/AMD64/feadd | 0m19.94s || -0m00.16s | -0.80%
0m18.57s | Specific/X25519/C64/freeze | 0m18.57s || +0m00.00s | +0.00%
0m17.96s | Specific/X25519/C64/fesquare | 0m18.36s || -0m00.39s | -2.17%
0m16.54s | Specific/NISTP256/AMD64/feopp | 0m16.18s || +0m00.35s | +2.22%
0m14.58s | Specific/X25519/C64/fecarry | 0m14.76s || -0m00.17s | -1.21%
0m13.80s | Specific/NISTP256/AMD64/fenz | 0m13.84s || -0m00.03s | -0.28%
0m13.34s | Specific/X25519/C64/fesub | 0m13.57s || -0m00.23s | -1.69%
0m12.72s | Arithmetic/Saturated/AddSub | 0m12.74s || -0m00.01s | -0.15%
0m12.19s | Specific/X25519/C64/feadd | 0m12.48s || -0m00.29s | -2.32%
0m10.57s | Arithmetic/Saturated/MontgomeryAPI | 0m10.36s || +0m00.21s | +2.02%
0m10.17s | Arithmetic/Saturated/Core | 0m09.92s || +0m00.25s | +2.52%
0m06.47s | Specific/NISTP256/AMD64/Synthesis | 0m06.65s || -0m00.18s | -2.70%
0m06.02s | Arithmetic/Saturated/MulSplit | 0m05.87s || +0m00.14s | +2.55%
0m05.27s | Specific/X25519/C64/Synthesis | 0m05.45s || -0m00.18s | -3.30%
0m03.70s | Arithmetic/MontgomeryReduction/WordByWord/Proofs | 0m03.65s || +0m00.05s | +1.36%
0m03.66s | Specific/Framework/ArithmeticSynthesis/Montgomery | 0m03.70s || -0m00.04s | -1.08%
0m02.93s | Specific/Framework/ArithmeticSynthesis/Defaults | 0m02.60s || +0m00.33s | +12.69%
0m02.42s | Arithmetic/Saturated/Freeze | 0m02.45s || -0m00.03s | -1.22%
0m01.57s | Arithmetic/CoreUnfolder | 0m01.52s || +0m00.05s | +3.28%
0m01.44s | Specific/Framework/ArithmeticSynthesis/HelperTactics | 0m01.11s || +0m00.32s | +29.72%
0m01.29s | Specific/Framework/ArithmeticSynthesis/Karatsuba | 0m01.21s || +0m00.08s | +6.61%
0m01.19s | Specific/Framework/ArithmeticSynthesis/Base | 0m01.00s || +0m00.18s | +18.99%
0m01.04s | Arithmetic/Saturated/CoreUnfolder | 0m01.13s || -0m00.08s | -7.96%
0m01.00s | Arithmetic/Saturated/UniformWeight | 0m00.95s || +0m00.05s | +5.26%
0m00.92s | Arithmetic/Saturated/WrappersUnfolder | 0m00.96s || -0m00.03s | -4.16%
0m00.89s | Specific/Framework/SynthesisFramework | 0m01.00s || -0m00.10s | -10.99%
0m00.87s | Specific/Framework/ReificationTypes | 0m01.00s || -0m00.13s | -13.00%
0m00.84s | Specific/Framework/MontgomeryReificationTypesPackage | 0m00.82s || +0m00.02s | +2.43%
0m00.81s | Specific/Framework/ArithmeticSynthesis/MontgomeryPackage | 0m00.72s || +0m00.09s | +12.50%
0m00.80s | Specific/Framework/ArithmeticSynthesis/Freeze | 0m00.86s || -0m00.05s | -6.97%
0m00.78s | Arithmetic/MontgomeryReduction/WordByWord/Definition | 0m00.76s || +0m00.02s | +2.63%
0m00.77s | Specific/Framework/MontgomeryReificationTypes | 0m00.80s || -0m00.03s | -3.75%
0m00.76s | Arithmetic/Saturated/MulSplitUnfolder | 0m00.80s || -0m00.04s | -5.00%
0m00.76s | Specific/Framework/ArithmeticSynthesis/DefaultsPackage | 0m00.70s || +0m00.06s | +8.57%
0m00.75s | Specific/Framework/ArithmeticSynthesis/KaratsubaPackage | 0m00.64s || +0m00.10s | +17.18%
0m00.75s | Specific/Framework/ArithmeticSynthesis/LadderstepPackage | 0m00.71s || +0m00.04s | +5.63%
0m00.74s | Arithmetic/Saturated/Wrappers | 0m00.78s || -0m00.04s | -5.12%
0m00.73s | Arithmetic/Saturated/FreezeUnfolder | 0m00.77s || -0m00.04s | -5.19%
0m00.72s | Specific/Framework/ArithmeticSynthesis/SquareFromMul | 0m00.72s || +0m00.00s | +0.00%
0m00.72s | Specific/Framework/ReificationTypesPackage | 0m00.84s || -0m00.12s | -14.28%
0m00.71s | Specific/Framework/ArithmeticSynthesis/FreezePackage | 0m00.69s || +0m00.02s | +2.89%
0m00.70s | Specific/Framework/ArithmeticSynthesis/Ladderstep | 0m00.67s || +0m00.02s | +4.47%
0m00.69s | Specific/Framework/ArithmeticSynthesis/BasePackage | 0m01.03s || -0m00.34s | -33.00%
0m00.66s | Arithmetic/Saturated/UniformWeightInstances | 0m00.67s || -0m00.01s | -1.49%
0m15.65s | Arithmetic/Core | 0m14.01s || +0m01.64s | +11.70%
|
This is in preparation for writing a ~compiler for the arithmetic things
to expression trees.
I'm not sure what's up with femul in the table below; I ran it again and
got:
After:
src/Specific/NISTP256/AMD64/femul (real: 115.70, user: 115.25, sys: 0.44, mem: 3571448 ko)
Before:
src/Specific/NISTP256/AMD64/femul (real: 118.49, user: 117.99, sys: 0.43, mem: 3581612 ko)
After | File Name | Before || Change
---------------------------------------------------------------------------------------------
17m02.82s | Total | 16m36.20s || +0m26.61s
---------------------------------------------------------------------------------------------
2m27.04s | Specific/NISTP256/AMD64/femul | 2m04.60s || +0m22.43s
1m38.55s | Specific/X2448/Karatsuba/C64/femul | 1m41.44s || -0m02.89s
0m12.46s | Arithmetic/Saturated/AddSub | 0m09.77s || +0m02.69s
3m22.38s | Specific/X25519/C64/ladderstep | 3m23.49s || -0m01.11s
0m54.40s | Specific/X25519/C32/fesquare | 0m52.68s || +0m01.71s
0m28.70s | Arithmetic/Karatsuba | 0m27.59s || +0m01.10s
0m10.00s | Arithmetic/Saturated/MontgomeryAPI | 0m08.95s || +0m01.05s
0m08.15s | Specific/X2448/Karatsuba/C64/Synthesis | 0m09.47s || -0m01.32s
0m05.62s | Arithmetic/Saturated/MulSplit | 0m04.28s || +0m01.33s
1m29.44s | Specific/X25519/C32/femul | 1m28.55s || +0m00.89s
0m39.38s | Specific/X25519/C32/freeze | 0m38.62s || +0m00.76s
0m31.54s | Specific/NISTP256/AMD128/femul | 0m31.60s || -0m00.06s
0m24.80s | Specific/X25519/C64/femul | 0m24.10s || +0m00.69s
0m23.82s | Specific/NISTP256/AMD64/fesub | 0m23.52s || +0m00.30s
0m21.81s | Specific/NISTP256/AMD64/feadd | 0m21.90s || -0m00.08s
0m20.30s | Specific/X25519/C64/freeze | 0m20.26s || +0m00.03s
0m20.12s | Specific/X25519/C32/Synthesis | 0m20.77s || -0m00.64s
0m19.12s | Specific/X25519/C64/fesquare | 0m19.02s || +0m00.10s
0m17.28s | Specific/NISTP256/AMD64/feopp | 0m17.68s || -0m00.39s
0m15.99s | Specific/NISTP256/AMD128/fesub | 0m16.03s || -0m00.04s
0m15.88s | Specific/NISTP256/AMD128/feadd | 0m16.56s || -0m00.67s
0m15.03s | Specific/NISTP256/AMD64/fenz | 0m15.00s || +0m00.02s
0m14.18s | Specific/NISTP256/AMD128/fenz | 0m14.12s || +0m00.06s
0m13.46s | Specific/NISTP256/AMD128/feopp | 0m12.88s || +0m00.58s
0m12.15s | Arithmetic/Core | 0m12.03s || +0m00.12s
0m07.82s | Arithmetic/Saturated/Core | 0m07.05s || +0m00.77s
0m07.13s | Specific/NISTP256/AMD64/Synthesis | 0m08.05s || -0m00.92s
0m05.48s | Specific/X25519/C64/Synthesis | 0m05.68s || -0m00.19s
0m04.02s | Specific/Framework/ArithmeticSynthesis/Montgomery | 0m03.89s || +0m00.12s
0m03.52s | Arithmetic/MontgomeryReduction/WordByWord/Proofs | 0m03.34s || +0m00.18s
0m03.32s | Specific/NISTP256/AMD128/Synthesis | 0m03.46s || -0m00.14s
0m02.30s | Specific/Framework/ArithmeticSynthesis/Defaults | 0m02.31s || -0m00.01s
0m02.08s | Arithmetic/Saturated/Freeze | 0m01.94s || +0m00.14s
0m01.66s | Specific/Framework/OutputType | 0m01.66s || +0m00.00s
0m01.54s | Arithmetic/CoreUnfolder | 0m01.43s || +0m00.11s
0m01.35s | Specific/Framework/ArithmeticSynthesis/Karatsuba | 0m01.28s || +0m00.07s
0m01.13s | Arithmetic/Saturated/CoreUnfolder | 0m01.16s || -0m00.03s
0m01.06s | Arithmetic/Saturated/WrappersUnfolder | 0m01.04s || +0m00.02s
0m01.04s | Arithmetic/Saturated/UniformWeight | 0m00.95s || +0m00.09s
0m01.03s | Specific/Framework/ArithmeticSynthesis/Base | 0m01.14s || -0m00.10s
0m01.02s | Specific/Framework/SynthesisFramework | 0m01.04s || -0m00.02s
0m00.97s | Specific/Framework/ArithmeticSynthesis/HelperTactics | 0m01.01s || -0m00.04s
0m00.92s | Specific/Framework/ReificationTypes | 0m00.90s || +0m00.02s
0m00.92s | Specific/Framework/ArithmeticSynthesis/Freeze | 0m00.93s || -0m00.01s
0m00.90s | Arithmetic/Saturated/MulSplitUnfolder | 0m00.83s || +0m00.07s
0m00.83s | Specific/Framework/ReificationTypesPackage | 0m00.79s || +0m00.03s
0m00.83s | Arithmetic/Saturated/FreezeUnfolder | 0m00.86s || -0m00.03s
0m00.82s | Specific/Framework/ArithmeticSynthesis/BasePackage | 0m00.77s || +0m00.04s
0m00.81s | Specific/Framework/ArithmeticSynthesis/SquareFromMul | 0m00.72s || +0m00.09s
0m00.81s | Specific/Framework/ArithmeticSynthesis/LadderstepPackage | 0m00.82s || -0m00.00s
0m00.80s | Specific/Framework/MontgomeryReificationTypesPackage | 0m00.82s || -0m00.01s
0m00.78s | Specific/Framework/ArithmeticSynthesis/MontgomeryPackage | 0m00.79s || -0m00.01s
0m00.78s | Arithmetic/Saturated/Wrappers | 0m00.78s || +0m00.00s
0m00.76s | Specific/Framework/ArithmeticSynthesis/FreezePackage | 0m00.80s || -0m00.04s
0m00.76s | Specific/Framework/ArithmeticSynthesis/DefaultsPackage | 0m00.75s || +0m00.01s
0m00.75s | Specific/Framework/MontgomeryReificationTypes | 0m00.78s || -0m00.03s
0m00.73s | Specific/Framework/ArithmeticSynthesis/Ladderstep | 0m00.77s || -0m00.04s
0m00.73s | Arithmetic/MontgomeryReduction/WordByWord/Definition | 0m00.80s || -0m00.07s
0m00.72s | Arithmetic/Saturated/UniformWeightInstances | 0m00.78s || -0m00.06s
0m00.68s | Specific/Framework/ArithmeticSynthesis/KaratsubaPackage | 0m00.76s || -0m00.07s
0m00.43s | Util/ZUtil/CPS | 0m00.42s || +0m00.01s
|
Also unfold some cps things
|
Now there's a version that handles things in Saturated.Core, and in
Wrappers.
|
This way, we can reuse it even when we can't fully compute the values
|
This way, files importing Core don't have to keep track of the list of runtime operations for unfolding.
|