Code Sample, a copy-pastable example if possible
This code fails for any K:
import pandas as pd

K = 100
pd.qcut([0] * K + [1] * (K + 1), 2)
Problem description
With pandas 0.19.2, I get:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-782385490865> in <module>()
----> 1 pd.qcut([0] * K + [1] * (K + 1), 2)

pandas/tools/tile.py in qcut(x, q, labels, retbins, precision)
    173     bins = algos.quantile(x, quantiles)
    174     return _bins_to_cuts(x, bins, labels=labels, retbins=retbins,
--> 175                          precision=precision, include_lowest=True)
    176
    177

pandas/tools/tile.py in _bins_to_cuts(x, bins, right, labels, retbins, precision, name, include_lowest)
    192
    193     if len(algos.unique(bins)) < len(bins):
--> 194         raise ValueError('Bin edges must be unique: %s' % repr(bins))
    195
    196     if include_lowest:

ValueError: Bin edges must be unique: array([0, 1, 1])
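
For reference, the duplicate edge can be reproduced without pandas. This is a minimal sketch, assuming NumPy's default linear-interpolation percentiles give the same edges that `algos.quantile` computes for this input:

```python
import numpy as np

K = 100
x = np.array([0] * K + [1] * (K + 1))

# For q=2 the bin edges are the 0%, 50% and 100% quantiles.
edges = np.percentile(x, [0, 50, 100])

# There are 2K + 1 values, so the median is the (K + 1)-th smallest,
# which is already a 1: the 50% and 100% edges coincide.
has_duplicates = len(np.unique(edges)) < len(edges)
print(edges, has_duplicates)
```

Because the 1s outnumber the 0s by one for every K, the median is always 1, which is why the failure does not depend on K.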
Expected Output
We need some kind of option to decide how values are assigned to a quantile bucket when two quantiles have the same value prior to the searchsorted call. In this case, the appropriate behavior may be to assign all 1 values to the 50% quantile bucket.
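
One way the suggested behavior could look, as a rough sketch rather than a proposed patch: resolve the tie by sending each value to the first bucket whose upper edge is at least that value. `assign_buckets` is a hypothetical helper, not a pandas function, and it uses NumPy percentiles in place of the pandas internals.

```python
import numpy as np


def assign_buckets(values, q):
    """Hypothetical helper: bucket index per value, tolerating duplicate edges."""
    values = np.asarray(values, dtype=float)
    edges = np.percentile(values, np.linspace(0, 100, q + 1))
    # Upper edge of each bucket. side="left" picks the first bucket whose
    # upper edge is >= the value, so tied edges resolve to the lower bucket.
    upper = edges[1:]
    codes = np.searchsorted(upper, values, side="left")
    return codes, edges


K = 100
codes, edges = assign_buckets([0] * K + [1] * (K + 1), 2)
print(np.unique(codes), edges)
```

With this rule every value in the example, including all the 1s, lands in bucket 0 (the bucket ending at the 50% quantile), and bucket 1 stays empty instead of raising.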