t-SNE fails with array must not contain infs or NaNs (OSX specific) #6665

Closed
joelkuiper opened this issue Apr 15, 2016 · 108 comments

@joelkuiper

joelkuiper commented Apr 15, 2016

Darwin-15.0.0-x86_64-i386-64bit
('Python', '2.7.11 |Anaconda custom (x86_64)| (default, Dec  6 2015, 18:57:58) \n[GCC 4.2.1 (Apple Inc. build 5577)]')
('NumPy', '1.11.0')
('SciPy', '0.17.0')
('Scikit-Learn', '0.17.1')

When trying to run a t-SNE

proj = TSNE().fit_transform(X)
ValueError: array must not contain infs or NaNs

However

np.isfinite(X).all() # True
np.isnan(X).any() # False
np.isinf(X).any() # False
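
Note on the check above: a minimal sketch of an input sanity check, assuming NumPy is available. `describe_non_finite` is a hypothetical helper name, not part of NumPy or scikit-learn. The reduction that answers "does X contain a NaN or inf" is `.any()`; `np.isnan(X).all()` is only True when every entry is NaN.

```python
import numpy as np


def describe_non_finite(X):
    """Count NaN and infinite entries in X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    return {
        "nan": int(np.isnan(X).sum()),       # number of NaN entries
        "inf": int(np.isinf(X).sum()),       # number of +/-inf entries
        "all_finite": bool(np.isfinite(X).all()),
    }


# Small made-up array with one NaN and one inf, for illustration only.
X_bad = np.array([[1.0, np.nan], [np.inf, 4.0]])
report = describe_non_finite(X_bad)
print(report)
```

If the report shows no NaN or inf in the input, as in this issue, the non-finite values must be produced later, inside the optimization.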

Full Stack Trace:


ValueError                                Traceback (most recent call last)
<ipython-input-16-c25f35fd042c> in <module>()
----> 1 plot(X, y)

<ipython-input-1-72bdb7124d13> in plot(X, y)
     74 
     75 def plot(X, y):
---> 76     proj = TSNE().fit_transform(X)
     77     scatter(proj, y)

/Users/joelkuiper/anaconda/lib/python2.7/site-packages/sklearn/manifold/t_sne.pyc in fit_transform(self, X, y)
    864             Embedding of the training data in low-dimensional space.
    865         """
--> 866         embedding = self._fit(X)
    867         self.embedding_ = embedding
    868         return self.embedding_

/Users/joelkuiper/anaconda/lib/python2.7/site-packages/sklearn/manifold/t_sne.pyc in _fit(self, X, skip_num_points)
    775                           X_embedded=X_embedded,
    776                           neighbors=neighbors_nn,
--> 777                           skip_num_points=skip_num_points)
    778 
    779     def _tsne(self, P, degrees_of_freedom, n_samples, random_state,

/Users/joelkuiper/anaconda/lib/python2.7/site-packages/sklearn/manifold/t_sne.pyc in _tsne(self, P, degrees_of_freedom, n_samples, random_state, X_embedded, neighbors, skip_num_points)
    830         opt_args['momentum'] = 0.8
    831         opt_args['it'] = it + 1
--> 832         params, error, it = _gradient_descent(obj_func, params, **opt_args)
    833         if self.verbose:
    834             print("[t-SNE] Error after %d iterations with early "

/Users/joelkuiper/anaconda/lib/python2.7/site-packages/sklearn/manifold/t_sne.pyc in _gradient_descent(objective, p0, it, n_iter, objective_error, n_iter_check, n_iter_without_progress, momentum, learning_rate, min_gain, min_grad_norm, min_error_diff, verbose, args, kwargs)
    385     for i in range(it, n_iter):
    386         new_error, grad = objective(p, *args, **kwargs)
--> 387         grad_norm = linalg.norm(grad)
    388 
    389         inc = update * grad >= 0.0

/Users/joelkuiper/anaconda/lib/python2.7/site-packages/scipy/linalg/misc.pyc in norm(a, ord, axis, keepdims)
    127     """
    128     # Differs from numpy only in non-finite handling and the use of blas.
--> 129     a = np.asarray_chkfinite(a)
    130 
    131     # Only use optimized norms if axis and keepdims are not specified.

/Users/joelkuiper/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in asarray_chkfinite(a, dtype, order)
   1020     if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
   1021         raise ValueError(
-> 1022             "array must not contain infs or NaNs")
   1023     return a
   1024 

ValueError: array must not contain infs or NaNs
@joelkuiper joelkuiper changed the title T-SNE fails with array must not contain infs or NaNs t-SNE fails with array must not contain infs or NaNs Apr 15, 2016
@joelkuiper
Author

Same with ('Scikit-Learn', '0.18.dev0')

@KeyKy

KeyKy commented Apr 17, 2016

Do you mind sharing your data X with me?

@joelkuiper
Author

Sure, where and in what format would you like it?


@KeyKy

KeyKy commented Apr 17, 2016

My email is 370846270@qq.com
As far as I know, numpy.save can write an array to a binary file in .npy format.
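
For reference, a minimal sketch of that round trip, assuming NumPy is installed. The array contents and the temporary file location are made up for illustration; only `numpy.save` and `numpy.load` come from the comment above.

```python
import os
import tempfile

import numpy as np

# Stand-in for the real data matrix X (random values, illustration only).
X = np.random.RandomState(0).uniform(size=(5, 3))

# Write the array to a .npy file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "X.npy")
np.save(path, X)

# Anyone receiving the file can load it back with the same dtype and shape.
X_loaded = np.load(path)
print(X_loaded.shape, X_loaded.dtype)
```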

@KeyKy

KeyKy commented Apr 18, 2016

I tested your data on Ubuntu 14.04 LTS with
Python==2.7.6
scikit-learn==0.17.1
numpy==1.8.2
scipy==0.13.3
It works fine and doesn't raise the ValueError. The test code is:

# Python 2 syntax, matching the environment above
import numpy
from sklearn.manifold import TSNE

a = numpy.load('/root/test.npy')
print a.shape
print numpy.isnan(a).any()  # False
print numpy.isfinite(a).all()  # True
print numpy.isinf(a).any()  # False

proj = TSNE().fit_transform(a)  # [[ 2.35503527e+00 1.15976751e+01] .... [ 3.29832591e+00 8.98212513e+00]]
print proj

I then upgraded numpy and scipy to 1.11.0 and 0.17.0 and ran the same code; it also doesn't raise any error.

@ivan-krukov

ivan-krukov commented May 11, 2016

Reproduced on Python 3.5 with Anaconda under OS X El Capitan.

Darwin 15.4.0
Python 3.5.1 :: Anaconda custom (x86_64)
numpy 1.10.4
scipy 0.17.0
scikit-learn 0.17.1

Example run:

import random
import numpy as np
from sklearn.manifold import TSNE
random.seed(1)
a = np.random.uniform(size=(100,20))
TSNE(n_components=2).fit_transform(a)

@jnothman
Member

Thanks @ivan-krukov, but I'm failing to replicate in Python 3.3. Will try 3.5

@ivan-krukov

This does not reproduce on Linux (4.4.0-21, Ubuntu 16.04) with the same packages under Python 3.5.

@jnothman
Member

I'm on El-Capitan, but I'm failing to get a Python 3.5 installation up and running.

@dcbb

dcbb commented Jun 1, 2016

Is there any update on this?

I have the issue on a dataset of mine, on Anaconda, Py 3.5, sklearn 0.17.1, OSX El Capitan.
I can reproduce the error with the example provided by @ivan-krukov .

@youyanggu

Same issue with Python 2.7.6 and scikit-learn 0.17 on OS X El Capitan. I tried the same code on Linux with Python 2.7.6 and scikit-learn 0.17, and it works.

@edevil

edevil commented Jun 8, 2016

Same issue.
OSX El Capitan Python 3.5.1
scikit-learn==0.17.1
scipy==0.17.1

@Ekliptor

Ekliptor commented Jun 13, 2016

I have the same problem and would really appreciate a fix (or workaround?)
System Version: OS X 10.11.5
Python 3.5.1 :: Anaconda 4.0.0 (x86_64)
numpy.version.version 1.11.0
scipy.version 0.17.1
sklearn.version 0.17.1

I can also reproduce the bug with the code sample from ivan-krukov

@lucienevans

Same issue on OS X El Capitan using Python 3.5

@jnothman jnothman added the Bug label Jun 16, 2016
@Concomitant

System Version: OS X 10.11.5
Python 3.5.1 :: Continuum Analytics, Inc.
numpy.version 1.11.1
scipy.version 0.16.0
sklearn.version 0.17.1

Same problem, though I have noticed that it only occurs for a subset of my dataset and not with the whole thing. That is, if I run TSNE on the whole data set it works; if I run it on a reduced set it does not.

@Concomitant

O_o;; This just in: if I repeat the same 'broken' subset (by means of list*10), then it works. Multiplying each individual vector by 10 doesn't work, but duplicating the data does. Just doubling the length of the list is insufficient. Maybe this is some kind of degrees of freedom check run amok?

@lesteve
Member

lesteve commented Jun 30, 2016

@ivan-krukov I bit the bullet today and installed an El Capitan VM. Unfortunately I cannot reproduce your problem.

@Concomitant can you reproduce the error on the stand-alone example given in #6665 (comment)?

@lesteve
Member

lesteve commented Jun 30, 2016

I'm on El-Capitan, but I'm failing to get a Python 3.5 installation up and running.

@jnothman it doesn't seem to be happening only on Python 3.5 so if you could try to reproduce with Python 2.7 (snippet: #6665 (comment)) that would be great.

@Concomitant

@lesteve I can reproduce the issue.

import numpy as np
import random
from sklearn.manifold import TSNE
random.seed(1)
a = np.random.uniform(size=(100,20))
TSNE(n_components=2).fit_transform(a)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dshank/miniconda3/envs/python3/lib/python3.5/site-packages/sklearn/manifold/t_sne.py", line 866, in fit_transform
    embedding = self._fit(X)
  File "/Users/dshank/miniconda3/envs/python3/lib/python3.5/site-packages/sklearn/manifold/t_sne.py", line 777, in _fit
    skip_num_points=skip_num_points)
  File "/Users/dshank/miniconda3/envs/python3/lib/python3.5/site-packages/sklearn/manifold/t_sne.py", line 832, in _tsne
    params, error, it = _gradient_descent(obj_func, params, **opt_args)
  File "/Users/dshank/miniconda3/envs/python3/lib/python3.5/site-packages/sklearn/manifold/t_sne.py", line 387, in _gradient_descent
    grad_norm = linalg.norm(grad)
  File "/Users/dshank/miniconda3/envs/python3/lib/python3.5/site-packages/scipy/linalg/misc.py", line 115, in norm
    a = np.asarray_chkfinite(a)
  File "/Users/dshank/miniconda3/envs/python3/lib/python3.5/site-packages/numpy/lib/function_base.py", line 1033, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

Following the same code, however:

>>> a = np.random.uniform(size=(10000,20))
>>> TSNE(n_components=2).fit_transform(a)
array([[  3.25766047e+11,  -2.74708004e+11],
       [  2.43498802e+11,  -7.68189047e+10],
       [ -6.00107639e+09,  -1.13548763e+11],
       ..., 
       [  3.02794039e+10,   6.64402020e+11],
       [  2.55855781e+10,   5.67932400e+10],
       [  1.42040378e+11,  -7.55188994e+10]])

Bizarre.

@ogrisel
Member

ogrisel commented Jul 5, 2016

I cannot reproduce either with python 3.5.1, numpy 1.11.1, scipy 0.17.1 and scikit-learn 0.17.1 from miniconda (with MKL) on a virtualbox with OSX El Capitan. I will try on a real mac hardware later.

@ogrisel
Member

ogrisel commented Jul 5, 2016

Also @joelkuiper and @Concomitant can you please check that you can reproduce the problem on the current state of the scikit-learn master branch?

@nelson-liu
Contributor

nelson-liu commented Jul 5, 2016

@lesteve and others I cannot reproduce the error with the snippet posted earlier on the latest master with python 2.7.

System info:

Darwin-15.0.0-x86_64-i386-64bit
('Python', '2.7.10 (v2.7.10:15c95b7d81dc, May 23 2015, 09:33:12) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]')
('NumPy', '1.11.0')
('SciPy', '0.17.0')
('Scikit-Learn', '0.18.dev0')

@ogrisel
Member

ogrisel commented Jul 6, 2016

I tried again on a real mac running OSX El Capitan 10.11.3 (with anaconda's latest numpy scipy and scikit-learn, same setting as reported by @Concomitant in #6665 (comment)) but could not reproduce the problem either (tried running the snippet several times).

What is weird, though, is that despite the np.random.seed(1) line I get different results for the output of fit_transform. This might be a bug in itself.

@ogrisel
Member

ogrisel commented Jul 6, 2016

What is weird, though, is that despite the np.random.seed(1) line I get different results for the output of fit_transform. This might be a bug in itself.

Actually I read @Concomitant's code snippet too quickly: instead of random.seed(1) it should be np.random.seed(1) otherwise the numpy RNG is not reseeded appropriately and one cannot get deterministic results.
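
To illustrate the point, a minimal sketch (assuming NumPy is installed) showing that the standard-library `random` module and `numpy.random` keep separate global state. The variable names are for illustration only.

```python
import random

import numpy as np

# Seeding the stdlib RNG leaves NumPy's global RNG untouched,
# so two "reseeded" NumPy draws still differ.
random.seed(1)
draw_a = np.random.uniform(size=3)
random.seed(1)
draw_b = np.random.uniform(size=3)

# Seeding NumPy's global RNG makes the draws repeatable.
np.random.seed(1)
draw_c = np.random.uniform(size=3)
np.random.seed(1)
draw_d = np.random.uniform(size=3)

print("stdlib seed reproducible:", np.array_equal(draw_a, draw_b))
print("numpy seed reproducible: ", np.array_equal(draw_c, draw_d))
```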

@ogrisel
Member

ogrisel commented Jul 6, 2016

Also I now realized that I read the whole discussion too quickly and that the bug only happens with python 2.7. Will try again.

@ogrisel
Member

ogrisel commented Jul 6, 2016

I cannot reproduce with python 2.7.12 from conda on OSX 10.11.3 either.

Actually @Ekliptor can reproduce the issue with python 3.5.1 from conda so it's probably not related to the version of Python either. Maybe it depends on the minor version of OSX. Will upgrade and retry.

@ogrisel
Member

ogrisel commented Jul 6, 2016

I cannot replicate either with OSX 10.11.5. I tried both with Python 2.7.12 and 3.5.2 installed with conda along with numpy 1.11.1, scipy 0.17.1 and scikit-learn 0.17.1.

I don't know what to do. If one of you can reproduce the problem, please try to find a numpy random seed that triggers the issue (using np.random.seed(my_seed) instead of random.seed(1) in the above snippet) and report the value here (along with your version of OSX and your Python packages).
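
One way to run such a seed search, as a rough sketch. `find_failing_seeds` is a hypothetical helper, and the default shape simply mirrors the snippet in this thread. The fitting step is passed in as a callable so the loop itself does not depend on scikit-learn.

```python
import numpy as np


def find_failing_seeds(fit, seeds, shape=(100, 20)):
    """Return the seeds for which fit(data) raises the inf/NaN ValueError.

    `fit` is any callable taking the data array (hypothetical interface).
    """
    failing = []
    for seed in seeds:
        np.random.seed(seed)  # reseed NumPy's global RNG, not the stdlib one
        data = np.random.uniform(size=shape)
        try:
            fit(data)
        except ValueError as exc:
            if "infs or NaNs" in str(exc):
                failing.append(seed)
            else:
                raise  # a different ValueError is not the bug discussed here
    return failing


# Intended use on an affected machine (not executed here):
#   from sklearn.manifold import TSNE
#   print(find_failing_seeds(lambda d: TSNE(n_components=2).fit_transform(d), range(50)))
```

Reseeding before generating the data means that TSNE's default initialization, which draws from the same global RNG when no random_state is given, is also repeatable for a given seed.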

@Ekliptor

Ekliptor commented Jul 11, 2016

I can confirm the issue is fixed with the latest version. I can no longer reproduce it.
I only updated numpy:
numpy.version.version 1.11.1

To all people working with Tensorflow I can add:
When I try to plot a very small sample (< 200 points) I sometimes still run into this error. After increasing the sample size I pass into tsne.fit_transform() it always works.

@lesteve
Member

lesteve commented Dec 1, 2016

For anyone affected by this, this should fix it:

conda remove numpy --force -y
pip uninstall numpy -y
conda install numpy

Let me know if that doesn't work for you.

@amueller
Member

amueller commented Dec 1, 2016

Thanks for the deep dive (again!) @lesteve

@lesteve
Member

lesteve commented Dec 2, 2016

I thought we would never get to the bottom of this one to be honest :) ! OK it's not quite the bottom but it's low enough as far as I am concerned.

I have to admit I would still like to understand what's happening within the numpy installed with both pip and conda ...

@BerenLuthien

BerenLuthien commented Feb 24, 2017

Hi
I tried two setups, where

  • TSNE works well with one setup (where Tensorflow is de-activated, Python-3.x), however,

  • TSNE does not work with the other setup (where Tensorflow is activated, Python 2.x).

The setup where TSNE works well:

Terminal:

Macbook:~ BG$ which jupyter
/Users/BG/anaconda/bin/jupyter

Jupyter notebook:

import sys
print (sys.version)
3.5.2 |Anaconda 4.2.0 (x86_64)| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]

Note: I tried

conda remove numpy --force -y
pip uninstall numpy -y
conda install numpy

to make TSNE work well with Tensorflow deactivated.
However, with the new setup below (where I have to use Tensorflow), this does not work any more.
——-———-———-———-———-———-———-———-

The setup where TSNE does not work:

Terminal:

Macbook:~$ source activate tensorflow
(tensorflow) Macbook:~$ which jupyter
/Users//anaconda/envs/tensorflow/bin/jupyter
(tensorflow) Macbook:~$ 

Jupyter notebook:

import sys
print (sys.version)
2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:05:08) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]

Error:
ValueError: array must not contain infs or NaNs

Any suggestions ? Thanks a lot

@rasbt
Contributor

rasbt commented Feb 24, 2017

Interesting. I think it has nothing to do with tensorflow; my guess is that

[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]

vs

[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]

is the culprit!?

@BerenLuthien

BerenLuthien commented Feb 24, 2017

Thanks for response :) Any suggested solutions/to_do_list ?

Need use both
Tensorflow and
TSNE
in Jupyter notebook ....

BTW: just tried "from __future__ import division" in Python 2.x and it did not solve the problem.

@rasbt
Contributor

rasbt commented Feb 24, 2017

Hm, not sure if that helps -- personally, I am not getting this mysterious issue anymore with

Python 3.5.3 |Continuum Analytics, Inc.| (default, Feb 22 2017, 20:51:01) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin

I am on Tf (now 1.0) as well, and I don't have this Error: ValueError: array must not contain infs or NaNs issue anymore when I execute

import numpy as np
from sklearn.manifold import TSNE

np.random.seed(1)

a = np.random.uniform(size=(100, 20))
TSNE(n_components=2, random_state=1).fit_transform(a)

which previously didn't work.

Maybe try to create a new python 3.5 env and try the above-mentioned snippet to see if it works without error:

conda create -n yourenv python=3.5 numpy scipy scikit-learn
source activate yourenv
pip install tensorflow  # or: pip install tensorflow-gpu

@BerenLuthien

Hi rasbt,
Yes I made TSNE work on Python 3.5.
However, for some other reason I'd better use Python 2.7, so I have to continue to explore ... cross fingers

Thanks for your help.

@rasbt
Contributor

rasbt commented Feb 24, 2017

Do you have an old(er) Miniconda/Anaconda 2.7 distro installed? If so, consider installing one of the more recent ones, or update your conda root or default python and give it another try (or create a new Python 2.7 env by substituting 2.7 for 3.5 in conda create -n yourenv python=3.5 numpy scipy scikit-learn). I'm not sure if this is really the reason, but I think LLVM 4.2 (clang-425.0.28) may be an issue, since the error doesn't seem to occur with [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)].

@BerenLuthien

Update: TSNE(perplexity=30, n_components=2, init='pca', n_iter=1000, method='exact') made it work ...
method='exact' was the trick.
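
A rough sketch of this workaround as a fallback: try the default Barnes-Hut method first and switch to method='exact' only when the inf/NaN error appears. `fit_transform_with_fallback` and the `make_estimator` factory are hypothetical names. The only scikit-learn facts assumed are that TSNE accepts method='barnes_hut' and method='exact'. This masks the failure rather than explaining it.

```python
def fit_transform_with_fallback(X, make_estimator, methods=("barnes_hut", "exact")):
    """Try each t-SNE gradient method in turn; return (method, embedding).

    `make_estimator(method)` must return an object with fit_transform(X)
    (hypothetical interface, so the logic can be exercised without scikit-learn).
    """
    last_error = None
    for method in methods:
        try:
            return method, make_estimator(method).fit_transform(X)
        except ValueError as exc:
            # Only swallow the failure reported in this thread.
            if "infs or NaNs" not in str(exc):
                raise
            last_error = exc
    raise last_error


# Intended use (not executed here):
#   from sklearn.manifold import TSNE
#   method, proj = fit_transform_with_fallback(
#       a, lambda m: TSNE(n_components=2, method=m))
```

The exact method is much slower on large inputs, so the fallback is best kept for small samples.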

@bglick13

Also been having this problem. Using method='exact' seems to work for me, but it is so painfully slow. Is there really no other solution that people have found?

@lesteve
Member

lesteve commented Feb 28, 2017

Have you read #6665 (comment) and #6665 (comment) ?

The only way I managed to reproduce this problem was to install numpy with both pip and conda in the same conda environment. If you create a conda environment from scratch you should not have this problem.

In case your problem does not seem to match this description, please post the exact commands you ran to create your conda environment, so we can try to reproduce.
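
A quick way to look for a doubled-up install, as a sketch only. It assumes Python 3.8+ for `importlib.metadata`, which is newer than the interpreters discussed in this thread, and `installed_versions` is a hypothetical helper. A pip install on top of a conda install may show up as more than one metadata entry, but that is not guaranteed, so an empty or single-entry result does not prove the environment is clean.

```python
from importlib import metadata


def installed_versions(name):
    """Return the version of every distribution named `name` found on sys.path."""
    wanted = name.lower().replace("_", "-")
    versions = []
    for dist in metadata.distributions():
        try:
            dist_name = dist.metadata.get("Name")
        except Exception:
            # Skip entries whose metadata cannot be read.
            continue
        if dist_name and dist_name.lower().replace("_", "-") == wanted:
            versions.append(dist.version)
    return versions


found = installed_versions("numpy")
if len(found) > 1:
    print("More than one numpy metadata entry found:", found)
else:
    print("numpy entries:", found)
```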

@jsevo

jsevo commented May 11, 2017

Hi,
I read the above comments and can reproduce this. I re-ran code from a few weeks ago and now this issue appears. Here's a minimal example that now reproduces this issue:

from sklearn.manifold import TSNE
a = [[1,2,3],[4,5,6], [7,8,9]]
TSNE(n_components=2,).fit_transform(a)

And the output of

import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)

is

Darwin-16.5.0-x86_64-i386-64bit
Python 3.6.0 |Anaconda 4.3.0 (x86_64)| (default, Dec 23 2016, 13:19:00) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
NumPy 1.12.1
SciPy 0.19.0
Scikit-Learn 0.18.1

Again, changing the method to exact (TSNE(method='exact')) gets rid of the error.

More generally, I have noticed wildly different results when using sklearn's TSNE (with identical perplexity and other parameters) from the bh implementation published by Laurens van der Maaten and the MATLAB version. I wonder if there may be a connection?

@glemaitre
Member

Did you refer to #6665 (comment)?

@jsevo

jsevo commented May 12, 2017

That fixed it. My apologies - I had separately uninstalled and reinstalled numpy, scikit-learn and scipy, but not in the way described in #6665.

@OptimusCrime

OptimusCrime commented May 21, 2017

I had the same problem as reported here, and I do not use conda.

My Python version is installed via brew on macOS Sierra 10.12.4

Python 3.6.1
scipy==0.19.0
scikit-learn==0.18.1
numpy==1.11.1

Adding method='exact' solved my problem.

@lesteve lesteve changed the title t-SNE fails with array must not contain infs or NaNs (MKL specific and likely CPU specific) t-SNE fails with array must not contain infs or NaNs (OSX specific) May 22, 2017
@bbartoldson

@lesteve: i had this error using the setup you describe (two versions of numpy installed). simply updating the conda install of numpy to the same version as the pip install (1.12.1) did the trick for me. i did remove the pip numpy install, though, as i didn't intend to have two versions :)

@walkon302

@lesteve: Thank you for the solution! I ran into this error and then found this discussion. Removing the duplicate version of numpy fixed it right away.

@wolfiex

wolfiex commented Aug 15, 2017

Replicated. I have removed the pip installs of numpy and updated conda.

Darwin-16.7.0-x86_64-i386-64bit
('Python', '2.7.13 |Anaconda custom (x86_64)| (default, Dec 20 2016, 23:05:08) \n[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]')
('NumPy', '1.13.1')
('SciPy', '0.19.0')
('Scikit-Learn', '0.18.1')

It seems fine on my Linux machine:
Linux-3.0.101-0.47.71-default-x86_64-with-SuSE-11-x86_64
('Python', '2.7.12 |Anaconda 2.3.0 (64-bit)| (default, Jul 2 2016, 17:42:40) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]')
('NumPy', '1.12.1')
('SciPy', '0.19.1')
('Scikit-Learn', '0.18.1')

@amueller
Member

@wolfiex so you did

conda remove numpy --force -y
pip uninstall numpy -y
conda install numpy

Somewhat related: I recommend you update to scikit-learn 0.19, which has some fixes in t-SNE.

@rahulsnair

getting the same error now

@cmarmo
Member

cmarmo commented Oct 1, 2020

Hi @rahulsnair , do you mind opening a new issue, with reproducible code, your traceback and the versions you are using? This issue is pretty old and the code has changed a lot. Thanks!
