Getting a Type Error while using np.unique with array of dtype=object #15199

Closed
Jerevia opened this issue Dec 30, 2019 · 4 comments

@Jerevia

Jerevia commented Dec 30, 2019

Getting a TypeError while using np.unique with array of dtype=object

Reproducing code example:

import numpy as np

a = np.array([1, 2, 3, 2, 'a', 'b', 'a'], dtype=object)
np.unique(a)

Error message:

----> 1 np.unique(a)

<__array_function__ internals> in unique(*args, **kwargs)

~/miniconda3/lib/python3.7/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
    260     ar = np.asanyarray(ar)
    261     if axis is None:
--> 262         ret = _unique1d(ar, return_index, return_inverse, return_counts)
    263         return _unpack_tuple(ret)
    264 

~/miniconda3/lib/python3.7/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
    308         aux = ar[perm]
    309     else:
--> 310         ar.sort()
    311         aux = ar
    312     mask = np.empty(aux.shape, dtype=np.bool_)

TypeError: '<' not supported between instances of 'str' and 'int'

Numpy/Python version information:

1.18.0 3.7.5 (default, Oct 25 2019, 10:52:18)
[Clang 4.0.1 (tags/RELEASE_401/final)]

@subhrm
Contributor

subhrm commented Dec 30, 2019

It worked fine for me!

>>> a = np.array([1, 2, 3, 2, 'a', 'b', 'a'])
>>> np.unique(a)
array(['1', '2', '3', 'a', 'b'], dtype='<U21')
>>> np.__version__
'1.18.0'
>>> import sys
>>> sys.version
'3.7.5 (default, Nov  7 2019, 10:50:52) \n[GCC 8.3.0]'

@Jerevia
Author

Jerevia commented Dec 31, 2019

Sorry for the carelessness. I forgot to add the dtype=object argument in the example; the issue description is updated now.
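For readers following along, here is a minimal sketch of why the two snippets behave differently. It assumes only the list from the report; the variable names are illustrative and not from the thread. Without dtype=object, NumPy coerces every element to a string, so sorting works; with dtype=object the elements stay as Python ints and strs, and the sort inside np.unique has to compare them directly.

```python
import numpy as np

values = [1, 2, 3, 2, 'a', 'b', 'a']

# No dtype given: NumPy picks a common type, here a unicode string dtype.
coerced = np.array(values)
# dtype=object: elements are kept as the original Python ints and strs.
as_objects = np.array(values, dtype=object)

print(coerced.dtype.kind)   # 'U' means unicode string
print(as_objects.dtype)     # object

# All elements are strings, so sorting (and therefore unique) succeeds.
unique_strings = np.unique(coerced)
print(unique_strings)

# With mixed ints and strs, the sort-based implementation shown in the
# traceback has to evaluate str < int, which Python 3 rejects.
try:
    print(np.unique(as_objects))
except TypeError as exc:
    print("TypeError:", exc)
```

Note that the coerced result contains the strings '1', '2', '3' rather than integers, so it is not a drop-in substitute for the object array.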

@mattip
Member

mattip commented Dec 31, 2019

Since the docstring of unique states "Returns the sorted unique elements of an array", I think this is a "can't fix": we cannot sort a mixture of strings and integers.

Object arrays are supported as long as the underlying objects are well behaved. In many cases object arrays work, but in this instance they do not. A set might be better suited to your use case: set([1, 1, 1, 'a', 'b']) -> {1, 'a', 'b'} (element order is arbitrary)
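The set suggestion can be sketched as follows. This is an illustration under stated assumptions, not a NumPy feature: it relies on the elements being hashable, and the names unique_set and unique_ordered are made up for the example. Hash-based deduplication never calls '<', so mixed ints and strs are fine, but the result is not sorted.

```python
import numpy as np

a = np.array([1, 2, 3, 2, 'a', 'b', 'a'], dtype=object)

# A set removes duplicates by hashing and equality, so no ordering
# comparison between str and int is needed. Iteration order is arbitrary.
unique_set = set(a.tolist())

# dict.fromkeys also deduplicates by hash, and dicts keep insertion order
# (Python 3.7+), so this keeps the first occurrence of each element.
unique_ordered = np.array(list(dict.fromkeys(a.tolist())), dtype=object)

print(unique_set)
print(unique_ordered)
```

One caveat of this approach: values that compare equal and hash equal, such as 1, 1.0 and True, collapse into a single entry.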

@WarrenWeckesser
Member

@Jerevia, thanks for reporting the issue. This issue is a duplicate of #641, so I'm closing it. Further discussion should be continued in the old issue, which is still open.
