Feature Request: Keep only these columns (vs. dropping all the ones you don't want)

Apologies if this has been submitted or considered in the past; I searched through the GitHub issues and couldn't find anything pertaining to it.

The idea is that instead of specifying all of the columns that you wish to delete from a `DataFrame` via the `.drop` method, you specify the columns you wish to keep through a `.keep_cols` method, and all other columns are deleted. This would save typing when a `DataFrame` has many columns and we only want to keep a small subset. The prime use case here is method chaining, where using `[[` doesn't really work in the middle of many methods being chained together.

-------------------

#### A small, complete example of the issue

```python
import pandas as pd

# Create an example DataFrame
data = [
    [1, 'ABC', 4, 10, 6.3],
    [2, 'BCD', 10, 9, 11.6],
    [3, 'CDE', 7, 4, 10.0],
    [4, 'DEF', 7, 10, 5.4],
    [5, 'EFG', 2, 9, 5.3],
]
data = pd.DataFrame(data, 
    columns = ['Id', 'Name', 'Rating1', 'Rating2', 'ThisIsANumber'])

# Just want columns Id and Rating2
new_data = data.drop(['Name', 'Rating1', 'ThisIsANumber'], axis=1)
new_data.head()

# ** It would be nice to be able to only specify the columns we want 
# ** to keep to save typing - similar to dplyr in R             

def keep_cols(df, keep_these):
    """Keep only the columns listed in keep_these; drop all other
    columns from the DataFrame.
    """
    # Every column not named in keep_these gets dropped
    drop_these = [col for col in df.columns if col not in keep_these]
    return df.drop(drop_these, axis=1)

new_data = data.pipe(keep_cols, ['Id', 'Rating2'])
new_data.head()

# In this specific example there was not much more typing between
# `.drop` and the `keep_cols` function, but often when a `DataFrame`
# has many columns this is not the case!
```

In this contrived example I created a `keep_cols` function as a rough draft of a `.keep_cols` method on the `DataFrame` object, and used the `.pipe` method to apply that function to the DataFrame as if it were a method.
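To make the proposal concrete, here is a minimal sketch of how the call might read if such a method existed. It attaches a hypothetical `keep_cols` function to `pd.DataFrame` at runtime purely for illustration; this is not an existing pandas API, and the three-column frame is a cut-down stand-in for the example above.

```python
import pandas as pd


def keep_cols(self, keep_these):
    """Hypothetical method: return a copy with only the columns in keep_these."""
    keep = set(keep_these)
    drop_these = [col for col in self.columns if col not in keep]
    return self.drop(drop_these, axis=1)


# Illustration only: pandas does not ship this method, so attach it at runtime
pd.DataFrame.keep_cols = keep_cols

data = pd.DataFrame({
    'Id': [1, 2],
    'Name': ['ABC', 'BCD'],
    'Rating2': [10, 9],
})

# Reads like any other DataFrame method; the original frame is left untouched
new_data = data.keep_cols(['Id', 'Rating2'])
print(new_data)
```

Patching a library class like this is fine for trying out the ergonomics, but `.pipe` remains the safer option in real code.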

I don't think using `[[` cuts it here. Yes, doing `data[['Id', 'Rating2']]` would work, but when method chaining, people often want to drop columns somewhere in the middle of a chain of methods.
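As a rough illustration of that mid-chain use case, the sketch below selects columns between a row filter and later transformations by piping a `keep_cols` helper like the one defined above. The particular filter, the `Score` rename, and the sort are assumptions made up for this example.

```python
import pandas as pd


def keep_cols(df, keep_these):
    """Return df with only the columns named in keep_these."""
    drop_these = [col for col in df.columns if col not in keep_these]
    return df.drop(drop_these, axis=1)


data = pd.DataFrame(
    [
        [1, 'ABC', 4, 10, 6.3],
        [2, 'BCD', 10, 9, 11.6],
        [3, 'CDE', 7, 4, 10.0],
        [4, 'DEF', 7, 10, 5.4],
        [5, 'EFG', 2, 9, 5.3],
    ],
    columns=['Id', 'Name', 'Rating1', 'Rating2', 'ThisIsANumber'],
)

result = (
    data
    .loc[lambda d: d['Rating1'] >= 7]       # filter rows first
    .pipe(keep_cols, ['Id', 'Rating2'])     # keep two columns mid-chain
    .rename(columns={'Rating2': 'Score'})   # keep transforming afterwards
    .sort_values('Score')
    .reset_index(drop=True)
)
print(result)
```

Note that the column selection has to come after the row filter here, because the filter uses `Rating1`, which is dropped by the `keep_cols` step.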

Just in case it's helpful, here's a good article demonstrating the power/beauty of method chaining in Pandas: https://tomaugspurger.github.io/modern-1.html. 

Thanks! 


Feature Request: Keep only these columns (vs. dropping all the ones you don't want) #14616
