Not working set_index with drop #13649

VelizarVESSELINOV · 2016-07-14T01:30:40Z

Code Sample, a copy-pastable example if possible

from io import StringIO
from pandas import read_csv

dtf = read_csv(StringIO("DATE_TIME,A\n2/8/2015  6:00:30,1"))

print(dtf)

dtf.set_index(dtf.DATE_TIME, drop=True, inplace=True)
print(dtf.columns)
print(dtf)

Current output

           DATE_TIME  A
0  2/8/2015  6:00:30  1
Index(['DATE_TIME', 'A'], dtype='object')
                           DATE_TIME  A
DATE_TIME                              
2/8/2015  6:00:30  2/8/2015  6:00:30  1

Expected Output

           DATE_TIME  A
0  2/8/2015  6:00:30  1
Index(['A'], dtype='object')
                   A
DATE_TIME           
2/8/2015  6:00:30  1

output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: None
pip: 8.1.2
setuptools: 20.6.7
Cython: None
numpy: 1.11.1
scipy: 0.16.1
statsmodels: None
xarray: None
IPython: 4.0.1
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.0
openpyxl: 2.3.5
xlrd: 1.0.0
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: None
httplib2: 0.9.2
apiclient: 1.5.0
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None
None


sinhrks · 2016-07-14T01:35:59Z

Thanks, this looks like a bug. If the input is a Series taken from the original DataFrame, the corresponding column should be dropped.

It works fine if we pass the column name:

dtf.set_index('DATE_TIME', drop=True, inplace=True)
dtf.columns
# Index(['A'], dtype='object')
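For readers who already hold the column as a Series, here is a minimal sketch of two ways to end up with the expected frame. It reuses the sample data from the report; the names `by_label` and `by_series` are illustrative only, and the explicit `drop` call is a workaround, not something proposed in this thread.

```python
from io import StringIO

import pandas as pd

dtf = pd.read_csv(StringIO("DATE_TIME,A\n2/8/2015  6:00:30,1"))

# Option 1: pass the column label, so set_index knows which column to drop.
by_label = dtf.set_index("DATE_TIME", drop=True)

# Option 2: pass the Series, then remove the column explicitly.
# errors="ignore" keeps this safe if the column has already been dropped.
by_series = dtf.set_index(dtf["DATE_TIME"]).drop(
    columns="DATE_TIME", errors="ignore"
)

print(by_label.columns)
print(by_series.columns)
```

Both variants return a new DataFrame instead of using `inplace=True`, which leaves the original `dtf` untouched.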

jreback · 2016-07-14T02:26:35Z

not a bug - this violates the guarantees of set_index

It's not valid to pass an actual column here.

It's not the same as actually assigning the index.

jreback · 2016-07-14T02:28:30Z

There is a PR that tries to make this work, but it's inherently ambiguous.

I'm not even sure you could warn about this
(though it IS an error to use inplace and drop, I think).

michaelaye · 2016-10-12T22:31:12Z

> not a bug - this violates the guarantees of set_index

Could you elaborate on which guarantee of set_index that is? I find it confusing that I can explicitly pass drop=True and get no error when, for some reason, dropping is not allowed or possible.

jreback · 2016-10-12T23:32:49Z

@michaelaye

When you pass a list as the keys, it is by definition setting the index. However, one could possibly think that [58] is the actual result of [57].

In [55]: df = pd.DataFrame({'A':range(2),'B':range(2),'C':range(2)})

In [56]: df
Out[56]: 
   A  B  C
0  0  0  0
1  1  1  1

In [57]: df.set_index(['A','B'])
Out[57]: 
     C
A B   
0 0  0
1 1  1

In [58]: df.index=['A','B']

In [59]: df
Out[59]: 
   A  B  C
A  0  0  0
B  1  1  1

In [54]: DataFrame.set_index?
Signature: DataFrame.set_index(self, keys, drop=True, append=False, inplace=False, verify_integrity=False)
Docstring:
Set the DataFrame index (row labels) using one or more existing
columns. By default yields a new object.

Parameters
----------
keys : column label or list of column labels / arrays
drop : boolean, default True
    Delete columns to be used as the new index
append : boolean, default False
    Whether to append columns to existing index
inplace : boolean, default False
    Modify the DataFrame in place (do not create a new object)
verify_integrity : boolean, default False
    Check the new index for duplicates. Otherwise defer the check until
    necessary. Setting to False will improve the performance of this
    method

Examples
--------
>>> indexed_df = df.set_index(['A', 'B'])
>>> indexed_df2 = df.set_index(['A', [0, 1, 2, 0, 1, 2]])
>>> indexed_df3 = df.set_index([[0, 1, 2, 0, 1, 2]])

Returns
-------
dataframe : DataFrame
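The ambiguity between labels and values can be seen side by side in a short sketch. It follows the docstring examples above, where a bare list of strings is treated as column labels and a nested list is treated as an array of index values; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": range(2), "B": range(2), "C": range(2)})

# A flat list of strings is read as column labels:
# columns A and B are moved into a two-level index and dropped.
by_labels = df.set_index(["A", "B"])

# The same strings wrapped in one more list are read as an array of
# index values: the index becomes 'A', 'B' and no column is dropped.
by_values = df.set_index([["A", "B"]])

print(by_labels)
print(by_values)
```

In the second call there is no column to drop, since the keys are values, not labels. A Series passed as a key is a similar case: it carries values and a name, but nothing that ties it to a column of the frame.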

ron819 · 2018-11-27T10:04:46Z

any plans to fix this?

sinhrks added the Bug label Jul 14, 2016

simonjayhawkins added the Error Reporting Incorrect or improved errors from pandas label Apr 24, 2020

simonjayhawkins added this to the Contributions Welcome milestone Apr 24, 2020

simonjayhawkins added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Apr 24, 2020

mroeschke added Indexing Related to indexing on series/frames, not to indexes themselves and removed Error Reporting Incorrect or improved errors from pandas Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels May 1, 2021

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
