Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RandomNumericSequenceGenerator fails with OutOfMemory exception #789

Open
roend83 opened this issue Aug 2, 2017 · 8 comments · May be fixed by #928
Open

RandomNumericSequenceGenerator fails with OutOfMemory exception #789

roend83 opened this issue Aug 2, 2017 · 8 comments · May be fixed by #928
Labels

Comments

@roend83
Copy link

roend83 commented Aug 2, 2017

I'm getting an out of memory exception when creating a lot of large objects with AutoFixture. I've tracked the error back to the numbers HashSet in RandomNumericSequenceGenerator. It appears that this HashSet is used to prevent the same number from being used multiple times. My tests appear to be generating hundreds of millions of random numbers which caused this HashSet to go over the 2GB limit in .NET. Has anyone run into this before? I'd be happy to submit a PR to limit the size of the HashSet if that is desirable.

@zvirja zvirja added the bug label Aug 3, 2017
@zvirja
Copy link
Member

zvirja commented Aug 3, 2017

Thanks for your investigation and reporting the issue.

For me the root cause of the issue is that we track the already produced numbers and implement an exhaustive random. I don't know the reason of such decision, as I don't see a big issue with repetitions - it's usually OK for the random.

Also we could investigate whether it makes sense to import some third-party random that suits better for our needs.

I've noticed that random generator was created by @moodmosaic, so Nikos what do you think about the reason and a way to work around the issue?

I'd be happy to submit a PR to limit the size of the HashSet if that is desirable.

It's unlikely that it could be fixed via PR on AutoFixture side with current approach. Rather, you should add the gcAllowVeryLargeObjects node to your test's App.config file to allow creation of arrays larger than 2GB. This is, obviously, a temp workaround till we fix the issue.

@moodmosaic
Copy link
Member

It appears that this HashSet is used to prevent the same number from being used multiple times.

That's correct.

I'd be happy to submit a PR to limit the size of the HashSet if that is desirable.

To limit the size of the HashSet in which way? What would be a 'good' upper limit?

@zvirja
Copy link
Member

zvirja commented Aug 3, 2017

To limit the size of the HashSet in which way? What would be a 'good' upper limit?

Internally, MS uses prime numbers to expand HashSet when it's overflowed. Therefore, this threshold should be found using the empirical approach. I see issues with this approach:

  • this solution is bound to HashSet implementation that could change;
  • we will not be able to guarantee uniqueness for the very large sequences (greater than threshold).

I'd offer to review an alternative way to fix the issue. We could potentially use Linear congruential generator which produces non-repeating pseudo-random sequence. It's pretty simple and it offers much better performance. See a sample of the implementation here. It produces non-repeating sequence in int.MinValue-int.MaxValue range. However, it should be investigate how to make it work in range.

@moodmosaic
Copy link
Member

  • this solution is bound to HashSet implementation that could change

I don't see this as a problem. In fact, the whole implementation of that generator is an implementation detail.

@moodmosaic
Copy link
Member

  • we will not be able to guarantee uniqueness for the very large sequences (greater than threshold)

I don't see this as a problem either, unless we measure the quality of that generator with die hard, big crush, or other empirical randomness testing―something that ought to be done also in Hedgehog at some point.

I'd offer to review an alternative way to fix the issue. We could potentially use Linear congruential generator which produces non-repeating pseudo-random sequence.

This is a sound idea. FWIW, something similar is already done in Hedgehog, both Haskell and F# versions.

@zvirja
Copy link
Member

zvirja commented Aug 3, 2017

In fact, the whole implementation of that generator is an implementation detail.

Well, for me the current implementation doesn't seem to rely on internal detail of some third-party component, or I'm missing something.. After the fix we will depend on the prime numbers MS uses, which could silently change (e.g. in .NET Core) 😞

FWIW, something similar is already done in Hedgehog, both Haskell and F# versions.

It seems that you know this area a lot given that you worked with that 😄

I'd actually vote to follow this way as it looks more quicker and optimized than keeping a HashSet and having that do...while cycle. The only question will be how to adapt that so that we work on custom ranges..

@roend83
Copy link
Author

roend83 commented Aug 3, 2017

Rather, you should add the gcAllowVeryLargeObjects node to your test's App.config file to allow creation of arrays larger than 2GB.

I actually already tried this. It does get rid of the error, but it also makes the test unusably slow. I'm currently trying to batch the object creations and create a new Fixture instance for each batch. I'm still having trouble getting the garbage collector to clean up the old Fixture and it's references though.

@zvirja
Copy link
Member

zvirja commented Aug 3, 2017

@roend83 In this case please try the following approach:

var fixture = new Fixture();
fixture.Customizations.Add(new RandomNumericSequenceGenerator(1, byte.MaxValue, short.MaxValue));

That will limit random generator to max numeric values of Int16, which should not lead to the real issues.

@zvirja zvirja changed the title Out of Memory Error RandomNumericSequenceGenerator fails with OutOfMemory exception Aug 7, 2017
@zvirja zvirja added this to the v4 RTM milestone Oct 25, 2017
@zvirja zvirja linked a pull request Nov 19, 2017 that will close this issue
@zvirja zvirja removed this from the 4.0 RTM milestone Dec 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants