Semi-deterministic output even though random_state is set #1108

SleepyMorpheus · 2024-04-01T22:23:27Z

Hello everybody,
While adding some tests to a project of mine, I noticed some really weird behaviour. Two different instances initialised with the same parameters (including random_state) return different results from fit_transform within a single execution. However, when I run the program again, each instance reproduces the same output it gave in the previous run.

Am I missing something obvious? Or does anybody have an idea why this is happening?
Thanks for looking into it.

Reproduction Steps

import umap

din = [[39.715797424316406, 5.328598499298096],
       [40.119140625, 6.10653018951416],
       [39.6290283203125, 6.134637832641602],
       [39.19687271118164, 5.85951566696167],
       [9.60939884185791, 9.586419105529785],
       [-6.015710353851318, -11.25406265258789],
       [9.012431144714355, 8.989534378051758],
       [9.283456802368164, 9.261088371276855],
       [-5.681527614593506, -10.919998168945312],
       [-5.479494571685791, -10.71765422821045]]


a = umap.UMAP(random_state=42, n_neighbors=2, n_components=2).fit_transform(din).tolist()
b = umap.UMAP(random_state=42, n_neighbors=2, n_components=2).fit_transform(din).tolist()

print(a)
print(b)

assert a == b

with the output being:

/Users/op/.pyenv/versions/3.9.18/lib/python3.9/site-packages/umap/umap_.py:1945: UserWarning: n_jobs value 1 overridden to 1 by setting random_state. Use no seed for parallelism.
  warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")
/Users/op/.pyenv/versions/3.9.18/lib/python3.9/site-packages/umap/umap_.py:1945: UserWarning: n_jobs value 1 overridden to 1 by setting random_state. Use no seed for parallelism.
  warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")
[[20.164154052734375, 1.3494281768798828], [21.097431182861328, 0.2964009642601013], [20.777090072631836, 0.6684482097625732], [20.44692611694336, 1.0685573816299438], [11.67434310913086, 17.12160301208496], [-3.4501354694366455, 15.270648002624512], [11.07783031463623, 17.718942642211914], [11.350298881530762, 17.448848724365234], [-3.1171295642852783, 15.60583209991455], [-2.912529230117798, 15.80562973022461]]
[[4.337563514709473, 8.263677597045898], [3.3291709423065186, 7.28176212310791], [3.681276798248291, 7.623903751373291], [4.056679725646973, 7.981446743011475], [-2.7793023586273193, 16.567930221557617], [8.226690292358398, -1.6328837871551514], [-3.3761720657348633, 17.164905548095703], [-3.1042020320892334, 16.89451026916504], [7.8915557861328125, -1.2999401092529297], [7.691712379455566, -1.0954537391662598]]
Traceback (most recent call last):
  File "/Users/op/Documents/ETHZ/IVIA/umap-test/umap-lol.py", line 21, in <module>
    assert a == b
AssertionError
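
To tell whether the mismatch is just floating-point noise or a genuinely different embedding, the two printed results can be compared with a tolerance instead of exact list equality. The following is only a rough sketch in plain Python: `max_abs_diff`, `embeddings_close` and the tolerance `1e-5` are hypothetical names and values chosen for illustration, not part of umap. The data is copied from the output above.

```python
import math

# Embeddings copied verbatim from the printed output of the two instances.
a = [[20.164154052734375, 1.3494281768798828], [21.097431182861328, 0.2964009642601013],
     [20.777090072631836, 0.6684482097625732], [20.44692611694336, 1.0685573816299438],
     [11.67434310913086, 17.12160301208496], [-3.4501354694366455, 15.270648002624512],
     [11.07783031463623, 17.718942642211914], [11.350298881530762, 17.448848724365234],
     [-3.1171295642852783, 15.60583209991455], [-2.912529230117798, 15.80562973022461]]
b = [[4.337563514709473, 8.263677597045898], [3.3291709423065186, 7.28176212310791],
     [3.681276798248291, 7.623903751373291], [4.056679725646973, 7.981446743011475],
     [-2.7793023586273193, 16.567930221557617], [8.226690292358398, -1.6328837871551514],
     [-3.3761720657348633, 17.164905548095703], [-3.1042020320892334, 16.89451026916504],
     [7.8915557861328125, -1.2999401092529297], [7.691712379455566, -1.0954537391662598]]


def max_abs_diff(x, y):
    """Largest coordinate-wise absolute difference between two nested lists."""
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))


def embeddings_close(x, y, abs_tol=1e-5):
    """True if every coordinate pair agrees within abs_tol (hypothetical tolerance)."""
    return all(
        math.isclose(p, q, rel_tol=0.0, abs_tol=abs_tol)
        for rx, ry in zip(x, y)
        for p, q in zip(rx, ry)
    )


print(max_abs_diff(a, b))      # size of the largest coordinate difference
print(embeddings_close(a, b))  # whether the embeddings agree within the tolerance
```

For the values above, the largest coordinate difference is about 17.8, so the two embeddings differ in layout and not merely in the last few digits; a tolerance-based assert would fail here as well.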

Versions

joblib==1.3.2
llvmlite==0.42.0
numba==0.59.1
numpy==1.26.4
pynndescent==0.5.12
scikit-learn==1.4.1.post1
scipy==1.12.0
threadpoolctl==3.4.0
tqdm==4.66.2
umap-learn==0.5.6

Note that umap was installed directly from GitHub, but the behaviour is the same when it is installed from PyPI.
