ENH: Add index_col parameter to the Dataframe constructor #58409

Open
1 of 3 tasks
tomhoq opened this issue Apr 24, 2024 · 1 comment
Labels
Closing Candidate (May be closeable, needs more eyeballs), Enhancement

Comments

@tomhoq
Contributor

tomhoq commented Apr 24, 2024

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Currently, the standard way to make an existing column the index of a DataFrame is to call the set_index method after constructing it.

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data)
print(df)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston

df = df.set_index("Name")
print(df)

         Age         City
Name
Alice     25     New York
Bob       30  Los Angeles
Charlie   35      Chicago
David     40      Houston

Feature Description

I propose adding a new parameter, index_col, to the DataFrame constructor.

data = {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data, index_col="Name")
print(df)

         Age         City
Name
Alice     25     New York
Bob       30  Los Angeles
Charlie   35      Chicago
David     40      Houston

With this addition, you could replace the default index without calling set_index, which would simplify code. It would also make pandas more consistent overall: read_csv and read_excel, for example, both return a DataFrame and already accept an index_col parameter.
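For reference, here is a minimal sketch of the existing index_col behavior in read_csv that the proposal would mirror. The CSV text is made-up sample data, read from an in-memory buffer so the snippet is self-contained.

```python
import io

import pandas as pd

# Made-up sample data, for illustration only.
csv_text = (
    "Name,Age,City\n"
    "Alice,25,New York\n"
    "Bob,30,Los Angeles\n"
)

# index_col promotes the "Name" column to the index while parsing.
df_csv = pd.read_csv(io.StringIO(csv_text), index_col="Name")

print(df_csv.index.name)     # Name
print(list(df_csv.columns))  # ['Age', 'City']
```

The proposed constructor parameter would presumably behave the same way, dropping the chosen column from the data and using it as the index.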

Building on this, I would also propose adding an index_col parameter to the read_parquet function, working in a similar way.

Alternative Solutions

n/a

Additional Context

No response

tomhoq added the Enhancement and Needs Triage labels on Apr 24, 2024
@mroeschke
Member

Thanks for the suggestion, but I would be -1 on this feature. index_col seems to be highly tailored to the case where data is a mapping and wouldn't make much sense for other list-like data. I believe using set_index is an adequate alternative.
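A rough sketch of the point above, using the data from the issue plus a hypothetical list-of-rows input (rows is made up for illustration). Chaining set_index onto the constructor already gives the requested result in one expression, and for unlabeled list-like data there is no column name an index_col could refer to, only a positional label.

```python
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
}

# Mapping input: set_index can be chained directly onto the constructor.
df_named = pd.DataFrame(data).set_index("Name")
print(df_named.index.name)  # Name

# List-like input without column labels: the columns are just 0, 1, ...
rows = [["Alice", 25], ["Bob", 30]]
df_rows = pd.DataFrame(rows)
print(list(df_rows.columns))  # [0, 1]

# set_index still works here, but only with the positional label.
df_rows = df_rows.set_index(0)
print(list(df_rows.index))  # ['Alice', 'Bob']
```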

mroeschke added the Closing Candidate label and removed the Needs Triage label on Apr 24, 2024