csvcut: Allow for numeric names of columns (--column-names, --column-indexes) #468

Open
fgregg opened this issue Dec 2, 2015 · 7 comments


@fgregg
Contributor

fgregg commented Dec 2, 2015

I have data with column names like

pre-k,k,01,02,03,04,05

where the numbers refer to grades. I can't select the 3rd grade column using csvcut

csvcut -c "k,03" returns the kindergarten column and the 2nd grade column (the fourth column)

@fgregg
Contributor Author

fgregg commented Dec 15, 2015

@onyxfish would you accept a PR that changed the behavior so that a quoted number is interpreted as a string?

@jpmckinney
Member

Can you give an example of what the command would look like?

@onyxfish
Collaborator

I'd like to have a solution for this. (I've seen plenty of spreadsheets with year columns.) However, I don't want to rely on quote characters on the command line, as they have a particular meaning that may vary from platform to platform.

@jpmckinney
Member

echo "pre-k,k,01,02,03,04,05" | csvcut -c "k,03"

Expected:

k,03

Actual:

k,01
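
The mismatch above is what an integer-first lookup would produce. A minimal sketch, assuming a resolver that treats any all-digit identifier as a 1-based position before it is ever considered as a name (`resolve_column` is a hypothetical helper for illustration, not csvkit's actual code):

```python
def resolve_column(header, identifier):
    """Return the 0-based position of `identifier` in `header`.

    Hypothetical helper: all-digit identifiers are assumed to be
    1-based positions, everything else a column name.
    """
    if identifier.isdigit():
        position = int(identifier) - 1  # "03" -> 3 -> position 2
        if not 0 <= position < len(header):
            raise ValueError(f"column index {identifier} is out of range")
        return position
    return header.index(identifier)  # raises ValueError if the name is absent


header = "pre-k,k,01,02,03,04,05".split(",")
selected = [header[resolve_column(header, c)] for c in "k,03".split(",")]
print(",".join(selected))  # k,01
```

Under that assumption, "03" never reaches the name lookup, so the column literally named 03 cannot be selected by name.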

@onyxfish changed the title from "csvcut should allow for numeric names of columns" to "csvcut: should allow for numeric names of columns" on Dec 29, 2016
@jpmckinney
Member

jpmckinney commented Jan 30, 2017

What if we added explicit --column-names and --column-indices options to all tools (cut, grep, join, sort, stat) that accept --columns (-c) to give users an opportunity to be unambiguous?
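
One way the proposal could look, as a rough sketch only: the two option names are taken from the comment above and do not exist in csvkit as far as this thread shows, and making them mutually exclusive is an assumption of the sketch. `build_parser` and `select_positions` are hypothetical names.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="cut-sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    # Both option names come from the proposal; they are hypothetical here.
    group.add_argument("--column-names", help="comma-separated names, matched literally")
    group.add_argument("--column-indices", help="comma-separated 1-based positions")
    return parser


def select_positions(header, args):
    """Return 0-based positions without guessing what the user meant."""
    if args.column_names is not None:
        # Names are never converted to integers, so "03" stays a name.
        return [header.index(name) for name in args.column_names.split(",")]
    return [int(i) - 1 for i in args.column_indices.split(",")]


header = "pre-k,k,01,02,03,04,05".split(",")
parser = build_parser()

by_name = select_positions(header, parser.parse_args(["--column-names", "k,03"]))
by_index = select_positions(header, parser.parse_args(["--column-indices", "2,3"]))

print([header[p] for p in by_name])   # ['k', '03']
print([header[p] for p in by_index])  # ['k', '01']
```

Because the user states up front whether the values are names or positions, no quoting convention on the command line is needed.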

@daaugusto

@jpmckinney, what is the status of the proposed --column-names/--column-indices options? This would solve another ambiguity: csvcut cannot handle column names that contain dashes, as in "age-mean", because it thinks the argument is a range (even for quoted column names). The error is the following:

Invalid range %s. Ranges must be two integers separated by a - or : character.

With --column-names there would be no ambiguity because csvcut would not even try to parse ranges.
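
To illustrate the dash ambiguity, here is a small sketch assuming a parser that treats any "-" or ":" in an identifier as a range separator unless told the values are names. `parse_spec` and its `names_only` flag are hypothetical, as are the `id` and `age-max` column names; this is not csvkit's implementation.

```python
def parse_spec(header, spec, names_only=False):
    """Resolve one identifier to a list of 0-based positions.

    Hypothetical parser: unless `names_only` is set, any "-" or ":" is
    assumed to mark a range of 1-based positions.
    """
    if not names_only:
        for separator in ("-", ":"):
            if separator in spec:
                start, _, end = spec.partition(separator)
                if not (start.isdigit() and end.isdigit()):
                    raise ValueError(f"Invalid range {spec}")
                return list(range(int(start) - 1, int(end)))
        if spec.isdigit():
            return [int(spec) - 1]
    return [header.index(spec)]


header = ["id", "age-mean", "age-max"]

print(parse_spec(header, "1-2"))  # [0, 1]

try:
    parse_spec(header, "age-mean")
except ValueError as error:
    print(error)  # Invalid range age-mean

print(parse_spec(header, "age-mean", names_only=True))  # [1]
```

With `names_only=True` the range branch is skipped entirely, which mirrors the point that a names-only option would not try to parse ranges.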

@jpmckinney
Member

jpmckinney commented Sep 8, 2018

I've increased the issue's priority, so that I notice it next time I do a round of maintenance – but I can't make any promises about how soon I'll implement it. I accept pull requests, though!

@jpmckinney added the feature label and removed the bug label on Apr 28, 2019
@jpmckinney modified the milestones: Next version, Priority on Oct 17, 2023
@jpmckinney changed the title from "csvcut: should allow for numeric names of columns" to "csvcut: Allow for numeric names of columns (--column-names, --column-indexes)" on Oct 18, 2023