Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts on choice of algorithm #19

Open
bakkot opened this issue Dec 4, 2022 · 3 comments
Open

Thoughts on choice of algorithm #19

bakkot opened this issue Dec 4, 2022 · 3 comments

Comments

@bakkot
Copy link

bakkot commented Dec 4, 2022

There's a variety of modern PRNG algorithms we could choose from, with various tradeoffs around speed, memory, predictability, complexity, miscellaneous statistical properties, and also amenability to other things you might want to do like jumping ahead.

I think the main reasonable candidates are ChaCha, PCG or a variant, and Xoshiro/Xoroshiro. Honorable mention to sfc64 and SplitMix64. Wyrand looks interesting but I think it's too new to be the default choice. Dishonorable mention to Mersenne twisters, which require an excessive amount of state with basically no compensating advantages.

Note that Xoshiro/Xoroshiro is an entire family of PRNGs, so we'd have to choose among them; some of the older ones have (what I regard as) notable weaknesses.

Of these I'd lean towards ChaCha, specifically ChaCha12, which is fast enough for almost all practical purposes, has good statistical properties, is extremely difficult to predict (just as much as ChaCha20, but faster), and as a bonus supports jump-ahead (which is only relevant if we expose that capability, which we probably won't). Detailed thoughts below.


I regard being able to generate ~hundreds of megabytes per second on old hardware is probably sufficient, so among algorithms which can achieve that I'm inclined to choose based on other attributes rather than continuing to worry about performance. All mentioned algorithms are capable of that. ChaCha is the slowest, being at least an order of magnitude slower than the others.

Then the most important attribute I'd look for is being statistically well-distributed.

  • ChaCha is.
  • PCG almost certainly is, though the author of Xorshiro has expressed concerns about some details (if I'm reading it correctly, choosing similar seeds gets you similar outputs)
  • Xoshiro (at least some members of the family) have some issues written about e.g. here, here, and here.
  • SFC I couldn't find much analysis of, though it passes basic tests.
  • SplitMix64 people say is low quality, e.g. here, but I'm not sure why offhand - I do note it has a relatively low period and doesn't repeat outputs for its entire cycle, though when generating 64-bit numbers you need to generate billions of values before the absence of repeats starts to be surprising (more here).

Beyond that I'd optimize for being at least somewhat difficult to predict. People probably shouldn't be using seeded RNGs for cryptography, but it's still a useful property in non-cryptographic applications (e.g. games).

  • ChaCha is the only option we can attribute actual unpredictability to.
  • PCG has seen a relatively recent attack which recovers the internal state using ~20k hours of compute, though it isn't clear that this technique can be applied to the variant linked above.
  • Xoshiro is apparently trivially predictable (though this may not apply to all members of the family).

Other bits of context you may find helpful:

@tabatkins
Copy link
Collaborator

I haven't walked thru your references yet, but this seems like incredibly useful research. Thank you so much!

@Artoria2e5
Copy link

Artoria2e5 commented Mar 4, 2024

Go math/rand/v2 (https://pkg.go.dev/math/rand/v2) ended up settling with ChaCha8 as the default; it still has PCG as an option. golang/go#61716 is the discussion, I believe. The argument is that it makes misuse-as-cryptographic-randomness less of a big arrow in the knee; I doubt that applies here given we are outputting a number.

(Although getting not just [0,1) numbers out of the bag seems useful too: maybe someone wants a well-implemented dice roll. Tangentials, tangentials...)

@bakkot
Copy link
Author

bakkot commented May 7, 2024

More overview of Go's implementation here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants