Refactor FIPS/SHA2 APIs #40

Open
wants to merge 8 commits into
base: master

thomwiggers
Contributor

  • Use abstract types for SHA2/SHAKE state
    This makes it easier to replace SHA2/SHAKE implementations
  • Define 'free'-style functions for hash state
    This allows potential heap-based SHA2/SHAKE implementations to instantiate those functions in SPHINCS+

Updates fips202.c/sha2.c implementations based on those found in PQClean (which are based on a common ancestor anyway).

@thomwiggers thomwiggers marked this pull request as draft August 29, 2022 11:42
@thomwiggers
Contributor Author

This is still work in progress, the accelerated implementations need to be updated as well.

@bwesterb
Contributor

bwesterb commented Aug 29, 2022

This is still work in progress, the accelerated implementations need to be updated as well.

We can merge something like this for ref, but I don’t see this working for the optimized implementations. Abstraction doesn’t come for free. (Of course I’m happy to be proven wrong.)

@thomwiggers
Contributor Author

I don’t see this working for the optimized implementations. Abstraction doesn’t come for free. (Of course I’m happy to be proven wrong.)

PQClean doesn't include optimized versions of the hash functions (yet?) --- down the line we may consider seeing if we can extract e.g. keccakx4 as it's used by both Dilithium and SPHINCS+ --- so the changes I'm making there are mostly just making it compatible with the abstract x1 API but leaving the xN APIs in place.

Review comment on sha2-avx2/context_sha2.c (outdated, resolved)
@thomwiggers thomwiggers marked this pull request as ready for review August 29, 2022 14:50
@thomwiggers
Contributor Author

thomwiggers commented Aug 29, 2022

Everything builds so it seems mostly ready for merging on my end.

The diff is slightly on the large side, but the main things I'm doing are moving non-hashscheme-specific code out of e.g. sha2.c and using the opaque types for the hash functions in the code (because the state is a struct, this means inserting a bunch of &s). The sha2-avx2 implementation required some careful massaging because the sha2 x4/x8 uses in thash open up the internals of the sha2 x1 hash. I've made a note about the main concern there.

@bwesterb
Contributor

bwesterb commented Sep 1, 2022

mostly ready

You want a review?

@thomwiggers
Contributor Author

Yep, please let me know what you think, or if there are any things that would be better refactored into a different file or something like that.

@thomwiggers
Contributor Author

Merging this set of patches would make it a lot easier to go forward with the downstream updates in PQClean and liboqs. I would appreciate it if it could get some eyes.

If it helps, I can try to clarify what kinds of changes I made; just let me know.

Review comment on ref/context_sha2.c (outdated, resolved)
@bwesterb
Contributor

bwesterb commented Oct 3, 2022

@thomwiggers Can you do before/after benchmarks? I’ll take a closer look this week. (I have some more time now :). )

@thomwiggers
Contributor Author

I've run the benchmarks, and comparing them side-by-side using diff -y -W300 benchmark-* suggests any changes are within the expected margin of error.

benchmark-master.log
benchmark-refactor.log

@thomwiggers
Contributor Author

Hi Bas, gentle nudge 🙂

@bwesterb
Contributor

You add a lot of dead code, e.g. sha3_384.

I'm not so sure that the benchmarks show it doesn't have any impact. Let me double check.

@bwesterb
Contributor

bwesterb commented Oct 19, 2022

There seems to be a small but significant difference for signing and keygen with shake.

sphincs-shake-128f simple using ref Verifying
 old: [12577888, 12668944, 12147936, 12112128, 12000800, 12747280, 14791232, 11850624, 17213168, 12543744, 12465600, 12406160, 12115920, 12898560, 12199104, 12385904, 12315552, 11966912, 12241024, 12143456]
 new: [12307552, 12582416, 12450480, 12408416, 12303968, 12623264, 12840528, 12129152, 12597920, 12838000, 12151808, 12747392, 12539392, 12601984, 12633200, 12390176, 12344272, 12636096, 12608944, 12952272]
MannwhitneyuResult(statistic=138.0, pvalue=0.09619629624919565)

sphincs-shake-128f simple using ref Generating keypair
 old: [8971552, 8847728, 8843552, 8847616, 8844800, 8844336, 10836288, 8874784, 12378704, 8883888, 8845584, 8845744, 8847760, 8845008, 8853264, 9010496, 9007040, 8846048, 8919184, 8845200]
 new: [8982768, 9010416, 9011648, 8993232, 9014112, 9024832, 9015984, 9025792, 9071072, 9014416, 9001344, 9005040, 9011456, 8981200, 9048320, 9038080, 9036768, 8997712, 9011744, 9009120]
MannwhitneyuResult(statistic=54.0, pvalue=8.292416774489331e-05)

sphincs-shake-128f simple using ref Signing
 old: [207647008, 208268784, 207233744, 207437888, 207113808, 207405936, 253702368, 207998672, 207223376, 207132784, 207258832, 207104928, 207268144, 207546304, 207112176, 207123552, 207161440, 207574352, 207365344, 211052848]
 new: [210912880, 211561600, 211226128, 210940992, 210904768, 210958304, 212347840, 211005088, 211137712, 211298528, 210949360, 210905984, 210982624, 211360928, 212491568, 211020560, 210874032, 216942864, 210265360, 211109584]
MannwhitneyuResult(statistic=31.0, pvalue=5.1657788244971414e-06)
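
The `MannwhitneyuResult` lines above are the result type printed by SciPy's `scipy.stats.mannwhitneyu`. As a rough, dependency-free sketch of what such a test computes, the following uses the two-sided normal approximation with a continuity correction and no tie correction. The function name `mann_whitney_u` is hypothetical, and the exact options used by the script in this thread are an assumption.

```python
from math import erfc, sqrt


def mann_whitney_u(old, new):
    """Return (U, p) for a two-sided Mann-Whitney U test.

    U counts the pairs in which an `old` sample exceeds a `new` sample
    (ties count as one half). The p-value uses the normal approximation
    with a continuity correction and no tie correction, so it is only a
    rough guide for small or heavily tied samples.
    """
    n1, n2 = len(old), len(new)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for a in old:
        for b in new:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    mean_u = n1 * n2 / 2.0
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    # Continuity correction: shrink the distance from the mean by 0.5.
    z = max(abs(u - mean_u) - 0.5, 0.0) / sigma
    p = erfc(z / sqrt(2.0))  # two-sided tail of a standard normal
    return u, p


# Toy data only: every "old" value lies below every "new" value.
u, p = mann_whitney_u(list(range(10)), list(range(10, 20)))
print(f"U={u}, p={p:.5f}")
```

For two fully separated samples of ten values each, this approximation gives a p-value of roughly 0.00018, which is the smallest value a two-sided test of this kind can report at that sample size.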

@bwesterb
Contributor

I'm not quite sure where the difference comes from. For the ref this isn't too bad though. Let me check some of the optimized implementations.

@bwesterb
Contributor

Yup, also some small but significant slowdowns:

sphincs-sha2-128s simple using sha2-avx2 Signing
1.130% difference with p=0.00018
 old: [1067314720, 1066514816, 1066383520, 1066477008, 1066873056, 1066228848, 1076929952, 1067130848, 1069597024, 1066460864]
 new: [1079695824, 1078724240, 1078808432, 1078364912, 1079615680, 1078522368, 1078753168, 1083423696, 1084852528, 1079814320]

sphincs-sha2-128s simple using sha2-avx2 Generating keypair
1.110% difference with p=0.00077
 old: [140582752, 140479616, 140460704, 140497584, 140583744, 140567744, 140562880, 142128736, 140990848, 140542080]
 new: [142161616, 142063712, 142046704, 142448528, 142103952, 142081040, 142070496, 142752704, 142984448, 142307136]

sphincs-sha2-128s robust using sha2-avx2 Generating keypair
0.494% difference with p=0.00018
 old: [275520288, 275458128, 275506576, 275728592, 275210352, 276160192, 275787024, 275554944, 276436240, 275492448]
 new: [276996240, 276752832, 276820944, 277197856, 277023648, 277008304, 276832480, 276953280, 277129264, 277758608]

sphincs-sha2-128s robust using sha2-avx2 Signing
0.478% difference with p=0.00018
 old: [2094632048, 2094203632, 2094167872, 2094465776, 2093855536, 2101201968, 2095646544, 2097507664, 2100464048, 2094692000]
 new: [2105579184, 2103967760, 2104281296, 2109052464, 2105628480, 2105386384, 2104183104, 2105478704, 2106668816, 2110747344]

sphincs-sha2-128f robust using sha2-avx2 Generating keypair
0.407% difference with p=0.00018
 old: [4338288, 4336752, 4324496, 4325920, 4334912, 4333760, 4335184, 4338208, 4330304, 4323888]
 new: [4342096, 4353264, 4342272, 4349232, 4350336, 4361248, 4359216, 4348320, 4342528, 4349568]

sphincs-sha2-128f simple using sha2-avx2 Generating keypair
0.561% difference with p=0.00025
 old: [2207968, 2207488, 2208640, 2206816, 2208144, 2208048, 2206688, 2204720, 2215296, 2206848]
 new: [2213088, 2222752, 2218768, 2225264, 2219104, 2218640, 2224816, 2217856, 2224736, 2219504]

sphincs-sha2-128f simple using sha2-avx2 Signing
0.581% difference with p=0.00025
 old: [51860048, 51856928, 51879776, 51836432, 51794624, 51863568, 51851152, 51815360, 52109760, 51853008]
 new: [51990080, 52266048, 52133440, 52248496, 52142368, 52144048, 52260752, 52150576, 52252208, 52146256]

sphincs-sha2-128f robust using sha2-avx2 Signing
0.398% difference with p=0.00018
 old: [102369776, 102285424, 102190512, 102179632, 102187344, 102137056, 102175376, 102325424, 102169648, 102089952]
 new: [102454624, 102648896, 102468496, 102619280, 102581360, 102860544, 102854736, 102590128, 102507200, 102588736]
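
The percentage reported for each benchmark above appears consistent with the relative change in the mean cycle count between the old and new runs. A minimal sketch, assuming that definition (the helper name `percent_difference` is hypothetical), applied to the sphincs-sha2-128s simple sha2-avx2 signing measurements listed above:

```python
from statistics import fmean


def percent_difference(old, new):
    """Relative change of the mean of `new` versus the mean of `old`, in percent."""
    old_mean = fmean(old)
    return 100.0 * (fmean(new) - old_mean) / old_mean


# Cycle counts copied from the sphincs-sha2-128s simple / sha2-avx2 / Signing entry.
old = [1067314720, 1066514816, 1066383520, 1066477008, 1066873056,
       1066228848, 1076929952, 1067130848, 1069597024, 1066460864]
new = [1079695824, 1078724240, 1078808432, 1078364912, 1079615680,
       1078522368, 1078753168, 1083423696, 1084852528, 1079814320]

pct = percent_difference(old, new)
print(f"{pct:.3f}% difference")
```

A positive value means the new code needs more cycles. Because a mean is sensitive to outliers such as the 1076929952 sample in the old run, a median-based variant would give a slightly different figure.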

@thomwiggers
Contributor Author

Cleaning up the dead code is obviously something that we can easily do (and makes sense to do). We should also probably clear up the thing Douglas pointed out; I was just waiting on a bit of feedback so I could batch the polishing work.

Those differences in the hashing performance are interesting though. It's funny that it only seems to be happening (or was only statistically significant?) in ref's shake-128f-simple. That suggests to me that it's not being caused by the added indirection, as that should affect all schemes equally. If it's strictly in shake, then I would also naively expect more things to be affected.

Meanwhile in SHA2-AVX2, the most significant code changes seem to be the addition of the x8 version of the seeded state and the new memcpy of the x8/x4 state, rather than setting up a new state by transforming and copying the x1 state using _mm256_set_epi64x. The other changes don't really seem like they should have much effect. Then again, maybe something dumb like marking transpose as static leads to less efficient compiler choices.

Can you share the script you wrote to generate those statistical comparisons? I should probably measure this again with a few different compilers rather than just GCC 12, and in a cleaner environment than a development VM.

@bwesterb
Contributor

Can you share the script you wrote to generate those statistical comparisons?

https://gist.github.com/bwesterb/db8083608aeb4161021a60eeeb84fe71

Pipe the output of benchmark.py to old and new after applying this patch:

diff --git a/benchmark.py b/benchmark.py
index 4ae6003..26c6659 100755
--- a/benchmark.py
+++ b/benchmark.py
@@ -32,8 +31,9 @@ for impl, fns in implementations:
                 stdout=DEVNULL, stderr=sys.stderr)
             run(["make", "-C", impl, "benchmarks", thash, params],
                 stdout=DEVNULL, stderr=sys.stderr)
-            run(["make", "-C", impl, "benchmark", thash, params],
-                stdout=sys.stdout, stderr=sys.stderr)
+            for i in range(10):
+                run(["make", "-C", impl, "benchmark", thash, params],
+                    stdout=sys.stdout, stderr=sys.stderr)
 
             print(flush=True)

I pinned CPU frequencies to something low and disabled turbo boost for accurate results.

@thomwiggers
Contributor Author

thomwiggers commented Nov 23, 2022

I've run a lot more benchmarks, with 50 measurements instead of 10, for Clang and GCC, each with and without LTO. I'm finally getting around to looking at the results that I generated ages ago, and I'm finding the following interesting results:

master vs master comparisons

GCC vs GCC with LTO

Up to 17% faster, though a handful of ref implementations become up to about 1.4% slower.

sphincs-haraka-192f robust using haraka-aesni Verifying     -16.156% difference with p=0.00000
sphincs-haraka-128f robust using haraka-aesni Verifying     -16.432% difference with p=0.00000
sphincs-haraka-128s robust using haraka-aesni Verifying     -16.570% difference with p=0.00000
sphincs-haraka-192s robust using haraka-aesni Verifying     -16.371% difference with p=0.00000
sphincs-sha2-192f robust using sha2-avx2 Signing            -15.832% difference with p=0.00000
sphincs-sha2-192s robust using sha2-avx2 Signing            -15.192% difference with p=0.00000
sphincs-haraka-192s robust using haraka-aesni Signing       -14.736% difference with p=0.00000
sphincs-haraka-192f robust using haraka-aesni Signing       -14.748% difference with p=0.00000
sphincs-haraka-128f robust using haraka-aesni Signing       -14.747% difference with p=0.00000
sphincs-sha2-192f robust using sha2-avx2 Verifying          -13.249% difference with p=0.00000
sphincs-haraka-128f simple using haraka-aesni Verifying     -10.897% difference with p=0.00000
sphincs-haraka-128s robust using haraka-aesni Signing       -13.593% difference with p=0.00000
sphincs-haraka-256s robust using haraka-aesni Verifying     -12.824% difference with p=0.00000
sphincs-sha2-192s robust using sha2-avx2 Verifying          -11.275% difference with p=0.00000
sphincs-sha2-256s robust using sha2-avx2 Signing            -11.302% difference with p=0.00000
sphincs-sha2-256f robust using sha2-avx2 Signing            -11.218% difference with p=0.00000
sphincs-sha2-128s robust using sha2-avx2 Signing            -10.720% difference with p=0.00000
sphincs-sha2-128f robust using sha2-avx2 Signing            -10.670% difference with p=0.00000
sphincs-haraka-256s robust using haraka-aesni Signing       -10.294% difference with p=0.00000
sphincs-haraka-128f simple using haraka-aesni Signing       -10.245% difference with p=0.00000
sphincs-haraka-128s simple using haraka-aesni Verifying     -10.272% difference with p=0.00000
sphincs-haraka-256f robust using haraka-aesni Verifying     -9.673% difference with p=0.00000
sphincs-sha2-256f robust using sha2-avx2 Verifying          -9.546% difference with p=0.00000
sphincs-sha2-128s robust using sha2-avx2 Verifying          -9.447% difference with p=0.00000
sphincs-sha2-128f robust using sha2-avx2 Verifying          -8.728% difference with p=0.00000
sphincs-sha2-256f simple using sha2-avx2 Verifying          -7.192% difference with p=0.00000
sphincs-sha2-256f simple using sha2-avx2 Signing            -8.458% difference with p=0.00000
sphincs-sha2-256s robust using sha2-avx2 Verifying          -9.177% difference with p=0.00000
sphincs-sha2-256s simple using sha2-avx2 Signing            -8.155% difference with p=0.00000
sphincs-sha2-192f simple using sha2-avx2 Signing            -7.693% difference with p=0.00000
sphincs-sha2-192s simple using sha2-avx2 Signing            -7.515% difference with p=0.00000
sphincs-haraka-128s simple using haraka-aesni Signing       -6.980% difference with p=0.00000
sphincs-haraka-192s simple using haraka-aesni Verifying     -6.185% difference with p=0.00000
sphincs-sha2-192f simple using sha2-avx2 Verifying          -6.900% difference with p=0.00000
sphincs-haraka-192s simple using haraka-aesni Signing       -6.462% difference with p=0.00000
sphincs-sha2-128f simple using sha2-avx2 Signing            -6.341% difference with p=0.00000
sphincs-shake-256f robust using ref Verifying               -5.789% difference with p=0.00000
sphincs-sha2-128s simple using sha2-avx2 Signing            -6.023% difference with p=0.00000
sphincs-haraka-256s simple using haraka-aesni Signing       -6.158% difference with p=0.00000
sphincs-haraka-256f robust using haraka-aesni Signing       -5.821% difference with p=0.00000
sphincs-sha2-256s simple using sha2-avx2 Verifying          -6.058% difference with p=0.00000
sphincs-shake-256f robust using ref Signing                 -5.754% difference with p=0.00000
sphincs-sha2-128f simple using sha2-avx2 Verifying          -5.865% difference with p=0.00000
sphincs-haraka-192f simple using haraka-aesni Signing       -5.477% difference with p=0.00000
sphincs-shake-128s robust using ref Verifying               -4.646% difference with p=0.00000
sphincs-haraka-256f simple using haraka-aesni Signing       -5.380% difference with p=0.00000
sphincs-haraka-192f simple using haraka-aesni Verifying     -5.704% difference with p=0.00000
sphincs-sha2-256s robust using ref Verifying                -3.204% difference with p=0.00000
sphincs-shake-256s robust using ref Signing                 -5.165% difference with p=0.00000
sphincs-sha2-192f robust using ref Verifying                -4.803% difference with p=0.00000
sphincs-haraka-256s simple using haraka-aesni Verifying     -3.249% difference with p=0.00000
sphincs-sha2-192s robust using ref Signing                  -4.821% difference with p=0.00000
sphincs-sha2-192f robust using ref Signing                  -4.815% difference with p=0.00000
sphincs-shake-192f robust using ref Verifying               -4.490% difference with p=0.00000
sphincs-shake-128f robust using ref Signing                 -4.571% difference with p=0.00000
sphincs-sha2-192s simple using sha2-avx2 Verifying          -5.361% difference with p=0.00000
sphincs-shake-128s simple using shake-avx2 Verifying        -2.132% difference with p=0.00000
sphincs-shake-192f robust using ref Signing                 -4.446% difference with p=0.00000
sphincs-sha2-128s robust using ref Signing                  -4.402% difference with p=0.00000
sphincs-sha2-192s robust using ref Verifying                -4.740% difference with p=0.00000
sphincs-shake-128s robust using ref Signing                 -4.363% difference with p=0.00000
sphincs-shake-128f robust using ref Verifying               -4.547% difference with p=0.00000
sphincs-shake-256s robust using ref Verifying               -4.969% difference with p=0.00000
sphincs-shake-192s robust using ref Signing                 -4.180% difference with p=0.00000
sphincs-sha2-256f robust using ref Verifying                -2.124% difference with p=0.00000
sphincs-sha2-256s robust using ref Signing                  -3.965% difference with p=0.00000
sphincs-sha2-128s simple using sha2-avx2 Verifying          -5.233% difference with p=0.00000
sphincs-shake-256s simple using ref Signing                 -3.780% difference with p=0.00000
sphincs-shake-128s simple using ref Signing                 -3.695% difference with p=0.00000
sphincs-shake-192s simple using ref Signing                 -3.622% difference with p=0.00000
sphincs-shake-256s simple using shake-avx2 Signing          -3.528% difference with p=0.00000
sphincs-haraka-128f simple using ref Verifying              -1.709% difference with p=0.00002
sphincs-shake-192s simple using shake-avx2 Signing          -3.515% difference with p=0.00000
sphincs-shake-192f simple using ref Signing                 -3.410% difference with p=0.00000
sphincs-shake-256f simple using ref Signing                 -3.393% difference with p=0.00000
sphincs-shake-192f robust using shake-avx2 Verifying        -1.806% difference with p=0.00000
sphincs-shake-128f simple using ref Signing                 -3.270% difference with p=0.00000
sphincs-shake-192f simple using ref Verifying               -3.278% difference with p=0.00000
sphincs-shake-256f simple using shake-avx2 Signing          -3.189% difference with p=0.00000
sphincs-shake-192s simple using ref Verifying               -2.674% difference with p=0.00000
sphincs-shake-192f simple using shake-avx2 Signing          -2.936% difference with p=0.00000
sphincs-shake-192s robust using shake-avx2 Verifying        -2.816% difference with p=0.00000
sphincs-haraka-256f simple using haraka-aesni Verifying     -1.314% difference with p=0.00003
sphincs-shake-256s robust using shake-avx2 Verifying        -3.659% difference with p=0.00000
sphincs-shake-192f simple using shake-avx2 Verifying        -2.537% difference with p=0.00000
sphincs-sha2-128f robust using ref Verifying                -1.846% difference with p=0.00004
sphincs-shake-256f simple using ref Verifying               -3.245% difference with p=0.00000
sphincs-shake-192s robust using ref Verifying               -4.607% difference with p=0.00000
sphincs-sha2-128s robust using ref Verifying                -5.086% difference with p=0.00000
sphincs-shake-256f robust using shake-avx2 Verifying        -2.957% difference with p=0.00000
sphincs-shake-256s simple using shake-avx2 Verifying        -2.684% difference with p=0.00000
sphincs-shake-256s simple using ref Verifying               -3.219% difference with p=0.00000
sphincs-sha2-256f robust using ref Signing                  -2.385% difference with p=0.00000
sphincs-shake-128s simple using ref Verifying               -2.918% difference with p=0.00007
sphincs-haraka-128s simple using ref Verifying              -2.053% difference with p=0.00060
sphincs-sha2-256s simple using ref Signing                  -2.060% difference with p=0.00000
sphincs-sha2-192f simple using ref Signing                  -1.969% difference with p=0.00000
sphincs-shake-128s simple using shake-avx2 Signing          -1.988% difference with p=0.00000
sphincs-shake-192s simple using shake-avx2 Verifying        -3.085% difference with p=0.00000
sphincs-sha2-128f robust using ref Signing                  -1.962% difference with p=0.00000
sphincs-haraka-128f simple using ref Signing                -2.018% difference with p=0.00000
sphincs-shake-256f simple using shake-avx2 Verifying        -3.194% difference with p=0.00000
sphincs-shake-128f simple using shake-avx2 Verifying        -1.378% difference with p=0.00003
sphincs-haraka-256f simple using ref Signing                -1.872% difference with p=0.00000
sphincs-haraka-128s simple using ref Signing                -1.711% difference with p=0.00000
sphincs-shake-128f robust using shake-avx2 Verifying        -1.959% difference with p=0.00000
sphincs-haraka-256s simple using ref Verifying              -1.607% difference with p=0.00001
sphincs-shake-256f robust using shake-avx2 Signing          -1.623% difference with p=0.00000
sphincs-shake-256s robust using shake-avx2 Signing          -1.606% difference with p=0.00000
sphincs-shake-128f simple using shake-avx2 Signing          -1.603% difference with p=0.00000
sphincs-shake-128f robust using shake-avx2 Signing          -1.346% difference with p=0.00000
sphincs-shake-128s robust using shake-avx2 Verifying        -2.654% difference with p=0.00000
sphincs-haraka-256f simple using ref Verifying              -1.981% difference with p=0.00000
sphincs-sha2-128f simple using ref Signing                  +1.316% difference with p=0.00000
sphincs-haraka-256s simple using ref Signing                -1.205% difference with p=0.00000
sphincs-haraka-192f simple using ref Verifying              -1.416% difference with p=0.00000
sphincs-sha2-128s simple using ref Signing                  +0.961% difference with p=0.00000
sphincs-haraka-192s robust using ref Signing                +0.879% difference with p=0.00000
sphincs-sha2-192f simple using ref Verifying                -2.340% difference with p=0.00000
sphincs-shake-192s robust using shake-avx2 Signing          -0.805% difference with p=0.00000
sphincs-haraka-192s simple using ref Signing                -0.810% difference with p=0.00000
sphincs-haraka-192f robust using ref Signing                +0.662% difference with p=0.00000
sphincs-sha2-256s simple using ref Verifying                -1.329% difference with p=0.00576
sphincs-haraka-192f simple using ref Signing                -0.706% difference with p=0.00000
sphincs-shake-192f robust using shake-avx2 Signing          -0.667% difference with p=0.00000
sphincs-sha2-192s simple using ref Signing                  +0.566% difference with p=0.00000
sphincs-sha2-256f simple using ref Signing                  +0.514% difference with p=0.00000
sphincs-haraka-256s robust using ref Signing                -0.476% difference with p=0.00000
sphincs-shake-128f simple using ref Verifying               -2.882% difference with p=0.00000
sphincs-haraka-128s robust using ref Signing                -0.404% difference with p=0.00000
sphincs-shake-128s robust using shake-avx2 Signing          -0.336% difference with p=0.00000
sphincs-sha2-256f simple using ref Verifying                +1.378% difference with p=0.00001
sphincs-haraka-128f robust using ref Signing                -0.289% difference with p=0.00000
sphincs-haraka-256f robust using ref Signing                -0.107% difference with p=0.00000
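
A listing like the one above is easier to summarize mechanically than by eye. The following is a minimal sketch that parses lines in the format shown (name, signed percentage, p-value) and reports the largest speedup and slowdown. The helper names `parse_results` and `summarize` are hypothetical, and the regular expression assumes exactly the line layout used in this comment.

```python
import re

# Matches e.g. "sphincs-sha2-128f simple using ref Signing   +1.316% difference with p=0.00000"
LINE_RE = re.compile(
    r"^(?P<name>.+?)\s+(?P<pct>[+-]?\d+(?:\.\d+)?)% difference with p=(?P<p>\d+(?:\.\d+)?)$"
)


def parse_results(text):
    """Return a list of (name, percent, p_value) tuples for every matching line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((match["name"], float(match["pct"]), float(match["p"])))
    return rows


def summarize(rows):
    """Negative percentages are speedups, positive ones are slowdowns."""
    speedups = [row for row in rows if row[1] < 0]
    slowdowns = [row for row in rows if row[1] > 0]
    return {
        "largest_speedup": min(speedups, key=lambda row: row[1], default=None),
        "largest_slowdown": max(slowdowns, key=lambda row: row[1], default=None),
        "n_slowdowns": len(slowdowns),
    }


# A few lines copied from the listing above.
sample = """\
sphincs-haraka-128s robust using haraka-aesni Verifying     -16.570% difference with p=0.00000
sphincs-sha2-128f simple using ref Signing                  +1.316% difference with p=0.00000
sphincs-sha2-256f simple using ref Verifying                +1.378% difference with p=0.00001
sphincs-haraka-256f robust using ref Signing                -0.107% difference with p=0.00000
"""

summary = summarize(parse_results(sample))
print(summary)
```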

Clang vs GCC

Up to 29% performance difference either way.

sphincs-sha2-256f robust using sha2-avx2 Signing            -29.155% difference with p=0.00000
sphincs-sha2-256s robust using sha2-avx2 Signing            -28.820% difference with p=0.00000
sphincs-sha2-256f simple using sha2-avx2 Signing            -23.448% difference with p=0.00000
sphincs-sha2-256s simple using sha2-avx2 Signing            -22.882% difference with p=0.00000
sphincs-sha2-192f simple using sha2-avx2 Signing            -22.409% difference with p=0.00000
sphincs-sha2-192s simple using sha2-avx2 Signing            -22.247% difference with p=0.00000
sphincs-sha2-192f robust using sha2-avx2 Signing            -20.820% difference with p=0.00000
sphincs-sha2-256f robust using sha2-avx2 Verifying          -21.445% difference with p=0.00000
sphincs-sha2-192s robust using sha2-avx2 Signing            -20.176% difference with p=0.00000
sphincs-sha2-128f simple using sha2-avx2 Signing            -19.654% difference with p=0.00000
sphincs-sha2-128s simple using sha2-avx2 Signing            -19.292% difference with p=0.00000
sphincs-sha2-128s robust using sha2-avx2 Signing            -19.289% difference with p=0.00000
sphincs-sha2-256s robust using sha2-avx2 Verifying          -18.230% difference with p=0.00000
sphincs-sha2-128f robust using sha2-avx2 Signing            -19.059% difference with p=0.00000
sphincs-sha2-128f simple using ref Verifying                +17.471% difference with p=0.00000
sphincs-haraka-128f simple using haraka-aesni Verifying     -16.486% difference with p=0.00000
sphincs-shake-128f robust using shake-avx2 Signing          -17.504% difference with p=0.00000
sphincs-shake-128s robust using shake-avx2 Signing          -17.202% difference with p=0.00000
sphincs-sha2-128f simple using ref Signing                  +17.064% difference with p=0.00000
sphincs-sha2-128s simple using ref Signing                  +16.646% difference with p=0.00000
sphincs-haraka-128s simple using haraka-aesni Verifying     -15.886% difference with p=0.00000
sphincs-shake-128s simple using shake-avx2 Signing          -16.430% difference with p=0.00000
sphincs-sha2-256f simple using ref Verifying                +16.674% difference with p=0.00000
sphincs-shake-192f simple using shake-avx2 Signing          -16.203% difference with p=0.00000
sphincs-sha2-256f simple using ref Signing                  +16.471% difference with p=0.00000
sphincs-shake-192s simple using shake-avx2 Signing          -15.859% difference with p=0.00000
sphincs-shake-128f simple using shake-avx2 Signing          -16.146% difference with p=0.00000
sphincs-shake-256f robust using shake-avx2 Signing          -16.252% difference with p=0.00000
sphincs-sha2-256s simple using ref Verifying                +13.360% difference with p=0.00000
sphincs-shake-192s robust using shake-avx2 Signing          -16.046% difference with p=0.00000
sphincs-shake-192f robust using shake-avx2 Verifying        -15.425% difference with p=0.00000
sphincs-sha2-192f robust using ref Verifying                +14.754% difference with p=0.00000
sphincs-sha2-128s robust using ref Verifying                +14.730% difference with p=0.00000
sphincs-shake-256s robust using shake-avx2 Signing          -15.758% difference with p=0.00000
sphincs-shake-256s simple using shake-avx2 Signing          -15.706% difference with p=0.00000
sphincs-shake-192f robust using shake-avx2 Signing          -15.653% difference with p=0.00000
sphincs-sha2-192s robust using ref Signing                  +15.554% difference with p=0.00000
sphincs-shake-256f simple using shake-avx2 Signing          -15.524% difference with p=0.00000
sphincs-sha2-128f robust using ref Signing                  +15.423% difference with p=0.00000
sphincs-shake-192f simple using shake-avx2 Verifying        -14.784% difference with p=0.00000
sphincs-sha2-192s simple using ref Signing                  +15.452% difference with p=0.00000
sphincs-shake-256f robust using shake-avx2 Verifying        -15.386% difference with p=0.00000
sphincs-sha2-192f robust using ref Signing                  +15.203% difference with p=0.00000
sphincs-shake-128s simple using shake-avx2 Verifying        -13.542% difference with p=0.00000
sphincs-shake-256s simple using shake-avx2 Verifying        -15.516% difference with p=0.00000
sphincs-shake-256s robust using shake-avx2 Verifying        -15.211% difference with p=0.00000
sphincs-sha2-256f robust using ref Verifying                +16.174% difference with p=0.00000
sphincs-sha2-256f robust using ref Signing                  +14.786% difference with p=0.00000
sphincs-sha2-128s simple using ref Verifying                +15.174% difference with p=0.00000
sphincs-sha2-192f simple using ref Signing                  +14.650% difference with p=0.00000
sphincs-sha2-192s simple using ref Verifying                +15.789% difference with p=0.00000
sphincs-haraka-256s robust using haraka-aesni Signing       -13.618% difference with p=0.00000
sphincs-shake-128f robust using shake-avx2 Verifying        -15.678% difference with p=0.00000
sphincs-shake-128s robust using shake-avx2 Verifying        -15.639% difference with p=0.00000
sphincs-sha2-192f simple using ref Verifying                +14.080% difference with p=0.00000
sphincs-sha2-128s robust using ref Signing                  +14.259% difference with p=0.00000
sphincs-shake-128f simple using shake-avx2 Verifying        -14.093% difference with p=0.00000
sphincs-shake-256f simple using shake-avx2 Verifying        -14.722% difference with p=0.00000
sphincs-shake-256f robust using ref Signing                 -13.849% difference with p=0.00000
sphincs-sha2-192f simple using sha2-avx2 Verifying          -13.363% difference with p=0.00000
sphincs-sha2-192s robust using ref Verifying                +15.358% difference with p=0.00000
sphincs-sha2-128f robust using ref Verifying                +15.112% difference with p=0.00000
sphincs-shake-256f robust using ref Verifying               -13.405% difference with p=0.00000
sphincs-sha2-256f simple using sha2-avx2 Verifying          -12.671% difference with p=0.00000
sphincs-haraka-128s simple using haraka-aesni Signing       -13.165% difference with p=0.00000
sphincs-haraka-192f simple using haraka-aesni Verifying     -12.705% difference with p=0.00000
sphincs-sha2-256s simple using ref Signing                  +13.192% difference with p=0.00000
sphincs-shake-128s robust using ref Verifying               -11.580% difference with p=0.00000
sphincs-shake-192s simple using shake-avx2 Verifying        -14.455% difference with p=0.00000
sphincs-shake-192s robust using shake-avx2 Verifying        -14.636% difference with p=0.00000
sphincs-shake-256s robust using ref Verifying               -12.357% difference with p=0.00000
sphincs-shake-256s robust using ref Signing                 -12.756% difference with p=0.00000
sphincs-shake-192s robust using ref Verifying               -10.830% difference with p=0.00000
sphincs-shake-256f simple using ref Verifying               -12.062% difference with p=0.00000
sphincs-shake-256f simple using ref Signing                 -12.285% difference with p=0.00000
sphincs-haraka-192f robust using haraka-aesni Signing       +12.524% difference with p=0.00000
sphincs-shake-128s simple using ref Signing                 -12.112% difference with p=0.00000
sphincs-shake-192f robust using ref Verifying               -11.395% difference with p=0.00000
sphincs-shake-256s simple using ref Verifying               -12.618% difference with p=0.00000
sphincs-shake-256s simple using ref Signing                 -11.897% difference with p=0.00000
sphincs-sha2-256s robust using ref Signing                  +11.816% difference with p=0.00000
sphincs-haraka-128f simple using haraka-aesni Signing       -11.463% difference with p=0.00000
sphincs-shake-128f simple using ref Signing                 -11.545% difference with p=0.00000
sphincs-sha2-192f robust using sha2-avx2 Verifying          -11.082% difference with p=0.00000
sphincs-shake-128f robust using ref Signing                 -11.396% difference with p=0.00000
sphincs-shake-192s simple using ref Signing                 -11.297% difference with p=0.00000
sphincs-shake-192f simple using ref Signing                 -11.277% difference with p=0.00000
sphincs-shake-128s robust using ref Signing                 -11.263% difference with p=0.00000
sphincs-shake-192f robust using ref Signing                 -11.083% difference with p=0.00000
sphincs-sha2-256s robust using ref Verifying                +12.360% difference with p=0.00000
sphincs-haraka-256s robust using haraka-aesni Verifying     -9.435% difference with p=0.00000
sphincs-shake-192s robust using ref Signing                 -10.900% difference with p=0.00000
sphincs-haraka-192s robust using haraka-aesni Signing       +11.393% difference with p=0.00000
sphincs-haraka-256f robust using haraka-aesni Signing       -9.327% difference with p=0.00000
sphincs-shake-192f simple using ref Verifying               -11.238% difference with p=0.00000
sphincs-shake-192s simple using ref Verifying               -10.762% difference with p=0.00000
sphincs-haraka-192s simple using haraka-aesni Verifying     -10.811% difference with p=0.00000
sphincs-shake-128f robust using ref Verifying               -10.680% difference with p=0.00000
sphincs-sha2-128f simple using sha2-avx2 Verifying          -10.377% difference with p=0.00000
sphincs-sha2-128f robust using sha2-avx2 Verifying          -10.411% difference with p=0.00000
sphincs-haraka-192f simple using haraka-aesni Signing       -9.739% difference with p=0.00000
sphincs-haraka-128s robust using haraka-aesni Verifying     -8.903% difference with p=0.00000
sphincs-haraka-128f robust using haraka-aesni Verifying     -8.200% difference with p=0.00000
sphincs-haraka-256f robust using haraka-aesni Verifying     -9.505% difference with p=0.00000
sphincs-shake-128s simple using ref Verifying               -10.687% difference with p=0.00000
sphincs-haraka-256s simple using haraka-aesni Verifying     -8.803% difference with p=0.00000
sphincs-haraka-192s simple using haraka-aesni Signing       -8.511% difference with p=0.00000
sphincs-haraka-128f robust using haraka-aesni Signing       -7.956% difference with p=0.00000
sphincs-shake-128f simple using ref Verifying               -11.547% difference with p=0.00000
sphincs-haraka-192f robust using haraka-aesni Verifying     +10.278% difference with p=0.00000
sphincs-haraka-256f simple using haraka-aesni Verifying     -8.515% difference with p=0.00000
sphincs-haraka-256f simple using ref Verifying              +7.185% difference with p=0.00000
sphincs-haraka-128s robust using haraka-aesni Signing       -7.340% difference with p=0.00000
sphincs-haraka-128s simple using ref Signing                +7.818% difference with p=0.00000
sphincs-haraka-256s simple using ref Signing                +7.781% difference with p=0.00000
sphincs-sha2-256s simple using sha2-avx2 Verifying          -7.872% difference with p=0.00000
sphincs-haraka-256s simple using haraka-aesni Signing       -6.736% difference with p=0.00000
sphincs-haraka-192s simple using ref Verifying              +6.738% difference with p=0.00000
sphincs-sha2-128s robust using sha2-avx2 Verifying          -7.861% difference with p=0.00000
sphincs-haraka-256f simple using haraka-aesni Signing       -6.695% difference with p=0.00000
sphincs-haraka-128f simple using ref Signing                +6.985% difference with p=0.00000
sphincs-haraka-256f simple using ref Signing                +6.833% difference with p=0.00000
sphincs-haraka-192s simple using ref Signing                +6.398% difference with p=0.00000
sphincs-haraka-128s simple using ref Verifying              +6.122% difference with p=0.00000
sphincs-haraka-192f simple using ref Verifying              +5.877% difference with p=0.00000
sphincs-haraka-192s robust using haraka-aesni Verifying     +6.817% difference with p=0.00000
sphincs-haraka-192f simple using ref Signing                +6.152% difference with p=0.00000
sphincs-haraka-256s simple using ref Verifying              +6.868% difference with p=0.00000
sphincs-haraka-128f simple using ref Verifying              +6.776% difference with p=0.00000
sphincs-sha2-192s simple using sha2-avx2 Verifying          -6.281% difference with p=0.00000
sphincs-sha2-192s robust using sha2-avx2 Verifying          -4.722% difference with p=0.00000
sphincs-sha2-128s simple using sha2-avx2 Verifying          -5.587% difference with p=0.00000
sphincs-haraka-128f robust using ref Verifying              -3.037% difference with p=0.00000
sphincs-haraka-128f robust using ref Signing                -2.764% difference with p=0.00000
sphincs-haraka-192f robust using ref Verifying              -2.471% difference with p=0.00000
sphincs-haraka-128s robust using ref Signing                -2.035% difference with p=0.00000
sphincs-haraka-256f robust using ref Signing                -1.808% difference with p=0.00000
sphincs-haraka-256s robust using ref Verifying              -1.293% difference with p=0.00055
sphincs-haraka-256f robust using ref Verifying              -2.213% difference with p=0.00000
sphincs-haraka-192f robust using ref Signing                -1.305% difference with p=0.00000
sphincs-haraka-128s robust using ref Verifying              -0.921% difference with p=0.01561
sphincs-haraka-192s robust using ref Signing                -0.659% difference with p=0.00000
sphincs-haraka-256s robust using ref Signing                +0.148% difference with p=0.00000

master vs hash-api-refactor

GCC vs GCC

Huh, -16% on something that didn't really change?

sphincs-haraka-128s robust using haraka-aesni Verifying     -16.530% difference with p=0.00000
sphincs-haraka-192s robust using haraka-aesni Verifying     -16.365% difference with p=0.00000
sphincs-haraka-128f robust using haraka-aesni Verifying     -16.429% difference with p=0.00000
sphincs-haraka-192f robust using haraka-aesni Verifying     -16.151% difference with p=0.00000
sphincs-haraka-192f robust using haraka-aesni Signing       -15.273% difference with p=0.00000
sphincs-haraka-128f robust using haraka-aesni Signing       -14.761% difference with p=0.00000
sphincs-haraka-192s robust using haraka-aesni Signing       -14.751% difference with p=0.00000
sphincs-haraka-128f simple using haraka-aesni Verifying     -12.239% difference with p=0.00000
sphincs-haraka-128s robust using haraka-aesni Signing       -13.600% difference with p=0.00000
sphincs-haraka-256s robust using haraka-aesni Verifying     -12.973% difference with p=0.00000
sphincs-haraka-128f simple using haraka-aesni Signing       -11.386% difference with p=0.00000
sphincs-haraka-256s robust using haraka-aesni Signing       -10.302% difference with p=0.00000
sphincs-haraka-128s simple using haraka-aesni Verifying     -10.089% difference with p=0.00000
sphincs-haraka-256f robust using haraka-aesni Verifying     -9.802% difference with p=0.00000
sphincs-haraka-128s simple using haraka-aesni Signing       -6.964% difference with p=0.00000
sphincs-haraka-192s simple using haraka-aesni Signing       -6.500% difference with p=0.00000
sphincs-haraka-256s simple using haraka-aesni Signing       -6.205% difference with p=0.00000
sphincs-haraka-256f robust using haraka-aesni Signing       -5.784% difference with p=0.00000
sphincs-haraka-192f simple using haraka-aesni Signing       -5.459% difference with p=0.00000
sphincs-haraka-256f simple using haraka-aesni Signing       -5.399% difference with p=0.00000
sphincs-haraka-192f simple using haraka-aesni Verifying     -5.869% difference with p=0.00000
sphincs-haraka-192s simple using haraka-aesni Verifying     -6.028% difference with p=0.00000
sphincs-haraka-256s simple using haraka-aesni Verifying     -3.197% difference with p=0.00000
sphincs-shake-256s robust using shake-avx2 Verifying        +2.237% difference with p=0.00000
sphincs-haraka-128s simple using ref Verifying              -1.001% difference with p=0.03626
sphincs-sha2-256f simple using sha2-avx2 Signing            +2.727% difference with p=0.00000
sphincs-sha2-192s simple using ref Signing                  +2.540% difference with p=0.00000
sphincs-shake-256s robust using ref Verifying               +1.772% difference with p=0.00038
sphincs-sha2-128s simple using ref Signing                  +2.451% difference with p=0.00000
sphincs-haraka-256f simple using haraka-aesni Verifying     -1.217% difference with p=0.00036
sphincs-sha2-192s robust using ref Verifying                +1.528% difference with p=0.00157
sphincs-sha2-128s simple using ref Verifying                +1.459% difference with p=0.03967
sphincs-sha2-128s simple using sha2-avx2 Signing            +2.192% difference with p=0.00000
sphincs-sha2-128f robust using sha2-avx2 Verifying          +1.024% difference with p=0.00134
sphincs-sha2-192s simple using sha2-avx2 Verifying          +0.837% difference with p=0.00294
sphincs-sha2-128f simple using ref Signing                  +2.056% difference with p=0.00000
sphincs-sha2-256s simple using sha2-avx2 Verifying          +1.143% difference with p=0.00010
sphincs-shake-192s robust using ref Verifying               +1.033% difference with p=0.03572
sphincs-sha2-192f robust using ref Verifying                -0.807% difference with p=0.00522
sphincs-sha2-192s robust using ref Signing                  +1.339% difference with p=0.00000
sphincs-sha2-128s simple using sha2-avx2 Verifying          +1.823% difference with p=0.00021
sphincs-shake-128f simple using shake-avx2 Signing          +1.290% difference with p=0.00000
sphincs-sha2-128f simple using sha2-avx2 Verifying          +1.590% difference with p=0.00002
sphincs-sha2-128f simple using sha2-avx2 Signing            +1.361% difference with p=0.00000
sphincs-sha2-256f simple using sha2-avx2 Verifying          +1.727% difference with p=0.00000
sphincs-shake-256s robust using ref Signing                 +1.180% difference with p=0.00000
sphincs-sha2-192s simple using ref Verifying                +3.110% difference with p=0.00000
sphincs-sha2-192s simple using sha2-avx2 Signing            +1.097% difference with p=0.00000
sphincs-sha2-256f simple using ref Verifying                +1.185% difference with p=0.00015
sphincs-sha2-256f robust using ref Verifying                +1.445% difference with p=0.00001
sphincs-sha2-192s robust using sha2-avx2 Signing            +0.984% difference with p=0.00000
sphincs-sha2-192f simple using sha2-avx2 Verifying          +0.944% difference with p=0.00016
sphincs-sha2-192f simple using sha2-avx2 Signing            +0.965% difference with p=0.00000
sphincs-sha2-128s robust using sha2-avx2 Signing            +1.007% difference with p=0.00000
sphincs-sha2-256s simple using sha2-avx2 Signing            +0.923% difference with p=0.00000
sphincs-sha2-128f robust using sha2-avx2 Signing            +0.716% difference with p=0.00000
sphincs-haraka-128f robust using ref Verifying              -0.839% difference with p=0.02188
sphincs-shake-256s simple using ref Verifying               -1.084% difference with p=0.01310
sphincs-sha2-192f robust using sha2-avx2 Signing            +0.722% difference with p=0.00000
sphincs-sha2-256f simple using ref Signing                  +0.782% difference with p=0.00000
sphincs-shake-128f robust using shake-avx2 Signing          -0.661% difference with p=0.00000
sphincs-sha2-192f robust using ref Signing                  -0.710% difference with p=0.00000
sphincs-sha2-256f robust using ref Signing                  +0.684% difference with p=0.00000
sphincs-shake-192s robust using ref Signing                 +0.582% difference with p=0.00000
sphincs-shake-128s robust using shake-avx2 Signing          -0.455% difference with p=0.00000
sphincs-sha2-192f simple using ref Signing                  -0.448% difference with p=0.00000
sphincs-sha2-128f robust using ref Signing                  -0.432% difference with p=0.00000
sphincs-sha2-128s robust using ref Signing                  +0.441% difference with p=0.00000
sphincs-shake-128f simple using shake-avx2 Verifying        +0.835% difference with p=0.00463
sphincs-shake-128s robust using ref Signing                 +0.367% difference with p=0.00000
sphincs-sha2-256s simple using ref Signing                  +0.339% difference with p=0.00000
sphincs-shake-256f simple using ref Signing                 +0.329% difference with p=0.00000
sphincs-shake-128f robust using ref Signing                 +0.309% difference with p=0.00000
sphincs-shake-128s simple using ref Signing                 -0.319% difference with p=0.00000
sphincs-shake-256s simple using shake-avx2 Signing          -0.305% difference with p=0.00000
sphincs-sha2-256f robust using sha2-avx2 Verifying          +0.479% difference with p=0.04087
sphincs-shake-256f simple using shake-avx2 Signing          -0.300% difference with p=0.00000
sphincs-shake-256f robust using shake-avx2 Signing          -0.424% difference with p=0.00000
sphincs-shake-192f robust using shake-avx2 Signing          +0.304% difference with p=0.00000
sphincs-shake-256f robust using ref Signing                 -0.205% difference with p=0.00000
sphincs-shake-256s robust using shake-avx2 Signing          -0.119% difference with p=0.00000
sphincs-haraka-128s simple using ref Signing                -0.226% difference with p=0.00000
sphincs-haraka-192f robust using ref Signing                +0.141% difference with p=0.00000
sphincs-haraka-192s robust using ref Signing                +0.184% difference with p=0.00000
sphincs-shake-192f simple using ref Signing                 +0.176% difference with p=0.00000
sphincs-haraka-192s simple using ref Signing                -0.191% difference with p=0.00000
sphincs-sha2-192s robust using sha2-avx2 Verifying          +0.749% difference with p=0.00755
sphincs-shake-128s simple using shake-avx2 Signing          -0.108% difference with p=0.00000
sphincs-shake-128f simple using ref Signing                 +0.101% difference with p=0.00000
sphincs-shake-192f simple using shake-avx2 Signing          +0.156% difference with p=0.00000
sphincs-shake-192s simple using ref Signing                 +0.065% difference with p=0.00000
sphincs-shake-192f robust using ref Signing                 +0.122% difference with p=0.00000
sphincs-haraka-256f robust using ref Signing                +0.014% difference with p=0.02533
sphincs-sha2-256s robust using sha2-avx2 Signing            +0.010% difference with p=0.00006
sphincs-shake-192s robust using shake-avx2 Signing          +0.039% difference with p=0.00001
sphincs-sha2-192f robust using sha2-avx2 Verifying          +0.349% difference with p=0.04027
sphincs-haraka-256s robust using ref Signing                -0.080% difference with p=0.00000
sphincs-haraka-192f simple using ref Signing                -0.115% difference with p=0.00002
sphincs-shake-256s simple using ref Signing                 -0.081% difference with p=0.00000
sphincs-haraka-128f robust using ref Signing                -0.099% difference with p=0.00000
sphincs-sha2-256s robust using ref Signing                  +0.035% difference with p=0.00000
sphincs-shake-192s simple using shake-avx2 Signing          -0.056% difference with p=0.00037
sphincs-haraka-256f simple using ref Signing                -0.084% difference with p=0.00088
sphincs-haraka-128s robust using ref Signing                -0.074% difference with p=0.00000

Unfortunately, I don't have further new-vs-old results for Clang: my crappy script appears to have failed halfway through those measurements.

Though I'm not ready to come to any conclusions, I have the slight suspicion that code layout in memory may be a non-trivial influence here. The Haraka results are especially puzzling, though. I am also seeing quite a few more results with differences like the ones you reported; I wonder if this is an artifact of the CPU in ygritte, on which it is trickier to disable all frequency scaling.

@bwesterb
Contributor

Though I'm not ready to come to any conclusions, I have the slight suspicion that code layout in memory may be a non-trivial influence here.

Yes, this could very well be true.

I'm now leaning towards merging something like this PR. However, the diff is too big and contains a lot of noise: for instance, I don't really care that you renamed sha512ctx4x to sha512x4ctx, but it adds one more thing to keep in mind. Could you separate it into easily reviewable chunks?

@thomwiggers
Contributor Author

I'm trying to sort out the changes into separate commits as much as possible. Hopefully that will help.

* Use abstract types for SHA2 state
  This makes it easier to replace SHA2 implementations
* Define 'free'-style functions for hash state
  This allows SPHINCS+ to be instantiated with potential heap-based
  SHA2 implementations
* Uses opaque incremental hashing context
  Allows easier replacement of hashing primitives by different backing
  implementations.
* Adds context release functions
  Allows heap-backed FIPS202 implementations.

fips202.[ch] from PQClean
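
For readers following along, here is a minimal sketch of the pattern these commit messages describe: an opaque incremental context with matching init, absorb, finalize and release functions, so that a heap-backed implementation can be swapped in without touching callers. Every name below (`toy_hash_ctx`, `toy_hash_init`, and so on) is hypothetical and is not the API in this PR, and the mixing step is FNV-1a used purely as a placeholder, not SHA2 or SHAKE.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical opaque context: callers only ever hold this handle. */
typedef struct {
    uint64_t *state; /* heap-backed here; a stack-backed variant could embed an array */
} toy_hash_ctx;

/* Allocate and initialise the state. Returns 0 on success, -1 if malloc fails. */
int toy_hash_init(toy_hash_ctx *c) {
    c->state = malloc(sizeof *c->state);
    if (c->state == NULL) {
        return -1;
    }
    *c->state = 0xcbf29ce484222325ULL; /* FNV-1a offset basis (placeholder, not SHA2) */
    return 0;
}

/* Incrementally absorb input; may be called any number of times. */
void toy_hash_absorb(toy_hash_ctx *c, const uint8_t *in, size_t inlen) {
    for (size_t i = 0; i < inlen; i++) {
        *c->state ^= in[i];
        *c->state *= 0x100000001b3ULL; /* FNV-1a prime */
    }
}

/* Read out the digest; the context stays valid until it is released. */
uint64_t toy_hash_finalize(const toy_hash_ctx *c) {
    return *c->state;
}

/* The 'free'-style function: the only place that knows the state lives on the heap. */
void toy_hash_release(toy_hash_ctx *c) {
    free(c->state);
    c->state = NULL;
}
```

The point of the release function is that calling code never needs to know whether the state is on the stack or the heap; a stack-backed implementation would simply make it a no-op.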
@thomwiggers
Contributor Author

thomwiggers commented Nov 24, 2022

I think I did as much as I could do without re-applying changes by hand. You should now be able to review the individual commits and they should mostly make sense on their own. Let me know if you really would like some stuff split out.

I can also continue working on the integration into PQClean before we merge this PR; that will mean lots and lots more testing across a bunch of different platforms, linters and compilers. While that will lead to more changes (notably, MSVC is REALLY picky about implicit downcasting), it will also confirm whether everything is working.
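
As an illustration of the kind of change the downcasting remark refers to, here is a small sketch. The helper `store_u32_be` is hypothetical and not taken from this PR; the assumption is that MSVC reports conversion warnings (C4244 is the usual one) on the uncast form, and that the fix is an explicit cast at each narrowing point.

```c
#include <stdint.h>

/* Hypothetical helper: write the low 32 bits of v as big-endian bytes. */
void store_u32_be(uint8_t out[4], uint64_t v) {
    /* Writing `out[0] = v >> 24;` would narrow implicitly from 64 to 8 bits.
       The cast states that the truncation is intended. */
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}
```

The generated code should be the same either way; the cast only documents the intent and keeps stricter compilers and linters quiet.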

@bwesterb
Contributor

Could you rebase? Let's see if the builds succeed then.

@bwesterb
Contributor

Also, it seems you didn't restructure the patch set yet for easy review. Right?

@thomwiggers
Contributor Author

I did, but I found a few things in the process of preparing the PQClean versions.
