New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT: remove xoshiro* BitGenerators #13793
Conversation
There are still a few references in the performance tables that I will remove if the direction is to accept this PR. |
xerf #13675 and the @rkern's comment there |
I have moved the code to numpy/bitgenerators. As of today it is not yet building, I hope to fix that over the next few days but that enables merging this PR as the code will not get lost. |
5440bd9
to
791b87a
Compare
I'm fine with dropping Xoshiro512. The extra state doesn't add anything important and it is slower. I am on the other side for Xoshiro256. It is a fast (fastest among the remaining ones) generator that has good performance across many platforms, and in particular, doesn't require uint128 or a custom implementation. It is able to pass tests under a wide range of specifications. |
Needs rebase. I'm not greatly opposed to keeping one of these, but only because it seems to be popular. The main considerations for dropping both would be: one, is it needed, and two, @rkern seems to distrust the series in general. It's history does look like a series of hacks to improve its statistical qualities. |
I would still prefer
|
JSF64, at least the one in the 2009 article, only has a 64 bit seed space
which may be too small to consider for sequenced generators.
…On Sat, Jun 22, 2019, 04:32 Robert Kern ***@***.***> wrote:
It is a fast (fastest among the remaining ones) generator that has good
performance across many platforms, and in particular, doesn't require
uint128 or a custom implementation.
I would still prefer JSF64 to satisfy those requirements. In my tests
(64-bit Linux only, I'm afraid), it had the same performance as Xoshiro256
.
❯ python benchmark.py
--------------------------------------------------------------------------------
Time to produce 1,000,000 Uniforms
************************************************************
JSF64 3.05 ms
MT19937 8.51 ms
PCG32 4.63 ms
PCG64 4.52 ms
Philox 8.65 ms
ThreeFry 10.37 ms
Xoshiro256 3.07 ms
numpy 8.34 ms
dtype: object
Uniforms per second
************************************************************
JSF64 327.89 million
MT19937 117.45 million
PCG32 216.09 million
PCG64 221.37 million
Philox 115.57 million
ThreeFry 96.41 million
Xoshiro256 326.01 million
numpy 119.90 million
dtype: object
Speed-up relative to NumPy
************************************************************
JSF64 173.5%
MT19937 -2.0%
PCG32 80.2%
PCG64 84.6%
Philox -3.6%
ThreeFry -19.6%
Xoshiro256 171.9%
dtype: object
--------------------------------------------------------------------------------
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#13793?email_source=notifications&email_token=ABKTSRKICB74SV7PFODRRH3P3WMOHA5CNFSM4HYQ5TMKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYJ7DOI#issuecomment-504623545>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABKTSRJRCS7VGNGP5PTVF33P3WMOHANCNFSM4HYQ5TMA>
.
|
You don't have to seed it exactly as in the article; C is very limiting. I implemented a 192-bit seed. |
The challenge of seeding using a larger space is that less is known about
the short cycle properties of it. The 32 bit has been tried a good bit and
doesn’t seem to have any when seeded using 32 bits. There are appears to
be some very short cycles possible, and so it seems risky to come up with a
second ad hoc seeding scheme.
…On Sat, 22 Jun 2019 at 18:16 Robert Kern ***@***.***> wrote:
You don't have to seed it exactly as in the article; C is very limiting. I
implemented a 196-bit seed. <mattip#41>
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#13793?email_source=notifications&email_token=ABKTSRLO5CHFKFCAGQPKI7LP3ZM73A5CNFSM4HYQ5TMKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYKNVSQ#issuecomment-504683210>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABKTSRLHGN7SGBTS2V5QQEDP3ZM73ANCNFSM4HYQ5TMA>
.
|
The single-number seeding But if it still worries you, then I also provide the related |
SFC is certainly attractive since it provides guarantees, which are always desirable, and 2**64 is enough If it had a jump/advance then I think it would dominate. I do believe that first-class citizens should all provide this alternative (and widely used) method for generating distinct sequences. |
I added timings in the other thread. Based on these, a jumpable SFC64 might be the best choice for a default :-). |
4e34710
to
aa6a476
Compare
Just to be clear, I don't think we have any evidence (except for a copy-paste-o comment from me) that SFC is jumpable. But jumping isn't a feature most people need. |
I will rebase this after #13833 is merged |
#13833 is merged. |
rebased, tests passing |
Thanks Matti. |
I was just pointed at this discussion by a user. Three points that might be of interest:
rust-random/rand#905 If you're not using the "streams" features, of course, you're free from that type of correlation. But, still, the orbit of a PCG generator has tons of pairs of correlated nonoverlapping subsequences (see the examples pointed above). This does not happen with a good linear generator (even better, if scrambled), or in fact, with any good generator. The problem is that LCG with power-of-2 modulus are terrible generators, and the superficial scrambling performed by most PCG generators does not hide this fact. |
In order to ground the discussion about which BitGenerator to include/remove, this PR suggests removing Xoshori512 and Xoshiro256 and change the default to PCG64. Quoting @rkern (hopefully not taken too much out of context):
Any other thoughts?