Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-uniform distribution of randNumber generation #362

Open
Malien opened this issue Aug 10, 2023 · 2 comments
Open

Non-uniform distribution of randNumber generation #362

Malien opened this issue Aug 10, 2023 · 2 comments

Comments

@Malien
Copy link

Malien commented Aug 10, 2023

Is this a regression?

No

Description

Given the following snippet

import { randNumber } from "@ngneat/falso"

const freq = Array(10).fill(0)
for (const n of randNumber({ min: 0, max: 9, length: 10_000 }) {
  freq[n]++
}

we can see that the distribution is almost uniform (sample output; run-to-run variance is negligible):

[563, 1041, 1123, 1062, 1069, 1171, 1171, 1113, 1109, 578]

First and last entries are always encountered half as much. No matter, if the number is an integer, a float (via setting fraction parameter), and does not depend on the range size (min and max parameters), probabilities converge to
$$\left[\frac 1 {2 \cdot (S - 1) }, \frac 1 {S - 1}, \cdots, \frac 1 {S-1}, \frac 1 {2\cdot(S-1)} \right]$$
for a given amount of discrete values (S) with greater number of tests.

It is non-intuitive. This looks like a bug, in otherwise uniform-ish distribution.

Please provide a link to a minimal reproduction of the bug

No response

Please provide the exception or error you saw

No response

Please provide the environment you discovered this bug in

No response

Anything else?

Changing the algo will impact current "hard-seeded" outcomes, as such this might be a breaking change. I'm not saying that this is an important issue, just something I've noted. There are workarounds, such as using built-in (non-seedable) Math.random, using another lib, or using randNumber adding one more discrete value, and collapsing first and last outcomes into one.

Do you want to create a pull request?

No

@soanvig
Copy link

soanvig commented Aug 23, 2023

@Malien telling from this line:
https://github.com/ngneat/falso/blob/main/packages/falso/src/lib/core/core.ts#L64C10-L64C66

This is caused by the algorithm itself.
If you run the same test but for the number coming from:

Number((Math.random() * (9 - 0) + 0).toFixed(0))

you get identical result. If we go further and use round instead of toFixed:

Number(Math.round((Math.random() * (3 - 1) + 1)))

We again get identical result.

So I think to solve that issue this particular line needs to be changed.

@soanvig
Copy link

soanvig commented Aug 23, 2023

@Malien
Number(Math.floor((Math.random() * (10 - 0) + 0)) (floor and max + 1) produces proper uniform distribution.

This is compliant with MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random#getting_a_random_integer_between_two_values that tells us, that using Math.round() won't give you uniform distribution but Math.floor does.

This will fix generating for integer, however I'm not sure - yet - how to properly handle precision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants