Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug #704 fix - randomu out of range with large poisson seed #1041

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

eloirozier
Copy link
Contributor

I naively replaced some int variables with long ones, and modified two functions from the gsl library. Maybe this is too naive ?
At least it seemingly doesn't affect the computation speed

@alaingdl
Copy link
Contributor

alaingdl commented Aug 3, 2021

We do need to make very serious tests before accepting such patch,
because frequently RAND codes do use tricky ways to work

And we should not introduce bias in the initial range (where the code was tested)

@eloirozier
Copy link
Contributor Author

Ok thanks. Then we throw a warning to the user ?

@alaingdl
Copy link
Contributor

alaingdl commented Aug 3, 2021

Thanks
Such change will be very welcome but we need few time to check carefully such important changes which may give unexpected side effects. Did you run the test cases already available
Did you notice you can switch the generator type at startup ? (gdl --no-dSFMT)
(then the tests should be run twice)
And maybe you can add the cases I mentioned as new test case ?!

@eloirozier
Copy link
Contributor Author

I added a new test case in test_random_poisson (in file testsuite/interactive_tests/test_random.pro). Everything seems to work well, with and without dSFMT

@GillesDuvert
Copy link
Contributor

for the record, dSFMT and gsl use the same algorithm, that give the same sequence of values (integers, converted to floats or doubles using the same trick as IDL does) for any same 'seed'. And this series of values are the same as IDL's. IDL previously used a different algorithm, less "random", still available with the option /RAN1 .

The only drawback of dSFMT is that the (SIMD accelerated) random numbers do not arrive in the same sequence as in gsl . So the comparisons are not possible. Only the gsl-produced random numbers are strictly identical as IDL's. But dSFMT is many times faster than gsl (gsl algorithms are no-nonsense and not made for speed)

I (probably unfortuately) used this same /RAN1 option not to get the old iDL values but to switch from dSFMT to gsl directly in the command line, not with the --no-dsFMT option. This to enable result comparison in a test procedure without the need to start gdl with the option (think about the testsuite).

@alaingdl
Copy link
Contributor

alaingdl commented Aug 3, 2021

I need few time to reread/rerun the code of test_random.pro but, if I remember well, most of those procedures are not interactive, they do make some real tests inside. I think they should come back to the testsuite/ dir.

It is of higher importance for GDL not to have errors or bias in RANDOM related codes because it would affect our credibility. We did have errors two times in the past 😞 and the reason they survive is we did not anticipated wide enough tests. If we are not able to compute in a given range, we must put a Throw() !

@GillesDuvert
Copy link
Contributor

I agree totally.

@eloirozier
Copy link
Contributor Author

Then if my understanding is right, to get the same sequence of numbers from both GDL and IDL with the same command, I should add the /RAN1 option in GDL ?

Because i checked that:
GDL> randomu(1234, 20, /DOUBLE, /RAN1)
gives the same result as
IDL> randomu(1234, 20, /DOUBLE)

But with the POISSON keyword set, this is not true at all:

GDL> print, randomu(1234, 5, POISSON=18, /DOUBLE, /RAN1)
       15.000000       23.000000       18.000000       20.000000       18.000000

IDL> print, randomu(1234, 5, POISSON=18, /DOUBLE)       
       22.000000       13.000000       17.000000       18.000000       15.000000

I should point out that the modifications I added do not affect the results below the upper limit that was found in #704 by @alaingdl, they just extend it to be a lot higher (I tested, it is around 1.5e+19 which corresponds roughly to the Max value of an unsigned long long), so in any case we should Throw() above a certain value, a thing which IDL does not do:

IDL> print, randomu(1234, 10, POISSON=1.9e19, /DOUBLE)
   1.9000001e+19   1.9000001e+19   1.9000001e+19   1.9000001e+19   1.9000001e+19
   1.9000001e+19   1.9000001e+19   1.9000001e+19   1.9000000e+19   1.9000001e+19
% Program caused arithmetic error: Floating overflow _--- IDL indicates an error, GDL simply crashes at this point_
IDL> print, randomu(1234, 10, POISSON=2e35, /DOUBLE)
   2.0000001e+35   2.0000001e+35   2.0000001e+35   2.0000001e+35   2.0000001e+35
   2.0000001e+35   2.0000001e+35   2.0000001e+35   2.0000001e+35   2.0000001e+35
IDL> print, randomu(1234, 10, POISSON=2e40, /DOUBLE)
 _--- Here IDL crashes !!! ---_

@GillesDuvert
Copy link
Contributor

yes the sameness of sequence is only true for the basic blocks of 'ran()', integers

IDL> randomu(33,10,/long)        
   533671434  1481082051   966314860  1437270305   882488649  1977169508   558989340   236098657  1869160499  1612596254
GDL> randomu(33,10,/long)
  1816241703   560431044    42090558   536937638  1641545619  1041214398  1309936456  1937139015  1586408490   157581916
GDL> randomu(33,10,/long,/ran1)
% When using the RAN1 mode, be sure to keep the RAN1 and dSFMT seed arrays in separate variables.
   533671434  1481082051   966314860  1437270304   882488649  1977169507   558989340   236098658  1869160498  1612596253

And for the derived floats and doubles, as the converter integer->float and double insures it.
So randomu and randomn should give the same results.

However all the derivative 'flavors' of randomness (Poisson etc) have not got the same kind of close attention and results may differ.

@GillesDuvert
Copy link
Contributor

I remind: tests confirmation needed, see comment #1041 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants