Semi-deterministic output even though random_state is set #1108

Open
SleepyMorpheus opened this issue Apr 1, 2024 · 0 comments
SleepyMorpheus commented Apr 1, 2024

Hello everybody,
While adding some tests to a project of mine, I noticed some really weird behaviour. Two separate UMAP instances initialised with the same parameters (including random_state) return different results from fit_transform within a single execution. When I run the program again, however, each instance reproduces the same output as in the previous run.

Am I missing something obvious? Or does anybody have an idea why this is happening?
Thanks for looking into it.

Reproduction Steps

import umap

din = [[39.715797424316406, 5.328598499298096],
       [40.119140625, 6.10653018951416],
       [39.6290283203125, 6.134637832641602],
       [39.19687271118164, 5.85951566696167],
       [9.60939884185791, 9.586419105529785],
       [-6.015710353851318, -11.25406265258789],
       [9.012431144714355, 8.989534378051758],
       [9.283456802368164, 9.261088371276855],
       [-5.681527614593506, -10.919998168945312],
       [-5.479494571685791, -10.71765422821045]]


a = umap.UMAP(random_state=42, n_neighbors=2, n_components=2).fit_transform(din).tolist()
b = umap.UMAP(random_state=42, n_neighbors=2, n_components=2).fit_transform(din).tolist()

print(a)
print(b)

assert a == b

with the output being:

/Users/op/.pyenv/versions/3.9.18/lib/python3.9/site-packages/umap/umap_.py:1945: UserWarning: n_jobs value 1 overridden to 1 by setting random_state. Use no seed for parallelism.
  warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")
/Users/op/.pyenv/versions/3.9.18/lib/python3.9/site-packages/umap/umap_.py:1945: UserWarning: n_jobs value 1 overridden to 1 by setting random_state. Use no seed for parallelism.
  warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")
[[20.164154052734375, 1.3494281768798828], [21.097431182861328, 0.2964009642601013], [20.777090072631836, 0.6684482097625732], [20.44692611694336, 1.0685573816299438], [11.67434310913086, 17.12160301208496], [-3.4501354694366455, 15.270648002624512], [11.07783031463623, 17.718942642211914], [11.350298881530762, 17.448848724365234], [-3.1171295642852783, 15.60583209991455], [-2.912529230117798, 15.80562973022461]]
[[4.337563514709473, 8.263677597045898], [3.3291709423065186, 7.28176212310791], [3.681276798248291, 7.623903751373291], [4.056679725646973, 7.981446743011475], [-2.7793023586273193, 16.567930221557617], [8.226690292358398, -1.6328837871551514], [-3.3761720657348633, 17.164905548095703], [-3.1042020320892334, 16.89451026916504], [7.8915557861328125, -1.2999401092529297], [7.691712379455566, -1.0954537391662598]]
Traceback (most recent call last):
  File "/Users/op/Documents/ETHZ/IVIA/umap-test/umap-lol.py", line 21, in <module>
    assert a == b
AssertionError
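
For anyone triaging this: exact list equality also fails on tiny floating-point differences, so a tolerance-based comparison helps confirm that the two embeddings really differ. The snippet below is only a minimal sketch and not part of the original script. The helper names `embeddings_close` and `max_abs_diff` and the tolerance `1e-5` are hypothetical choices, and the sample rows are the first two points of each embedding printed above.

```python
import math


def max_abs_diff(a, b):
    """Return the largest absolute coordinate difference between two embeddings."""
    return max(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def embeddings_close(a, b, abs_tol=1e-5):
    """Return True if both embeddings have the same shape and all coordinates
    agree within abs_tol (hypothetical tolerance, adjust as needed)."""
    if len(a) != len(b):
        return False
    return all(
        len(row_a) == len(row_b)
        and all(math.isclose(x, y, abs_tol=abs_tol) for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(a, b)
    )


# First two rows of each embedding, copied from the printed output above.
a_head = [[20.164154052734375, 1.3494281768798828],
          [21.097431182861328, 0.2964009642601013]]
b_head = [[4.337563514709473, 8.263677597045898],
          [3.3291709423065186, 7.28176212310791]]

print(embeddings_close(a_head, b_head))  # False: the rows differ by far more than the tolerance
print(max_abs_diff(a_head, b_head))
```

With the values from the output above, the coordinates differ by several units, so this is not rounding noise: the two instances produce genuinely different embeddings.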

Versions

joblib==1.3.2
llvmlite==0.42.0
numba==0.59.1
numpy==1.26.4
pynndescent==0.5.12
scikit-learn==1.4.1.post1
scipy==1.12.0
threadpoolctl==3.4.0
tqdm==4.66.2
umap-learn==0.5.6

Note that umap is installed directly from GitHub, but the behaviour is the same when it is installed from PyPI.
