in2csv: GeoJSON metadata discarded #870

Open
jayvdb opened this issue Jul 24, 2017 · 7 comments
@jayvdb
Contributor

jayvdb commented Jul 24, 2017

The following metadata fields from the "FeatureCollection" object are discarded:

-  "generator": "overpass-ide",
-  "copyright": "....",
-  "timestamp": "2017-07-17T03:42:02Z",

Probably others as well.

Perhaps they could be stored in a column header or in a special row, which would assist with round-tripping data (#868).
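To make the report concrete, here is a minimal sketch (plain Python, not csvkit code) of which top-level members count as "metadata" in this sense. The helper name `split_feature_collection` and the choice to treat `type`, `features` and `bbox` as the standard members are assumptions made for illustration; the sample document reuses the field names quoted above.

```python
import json

# Top-level members defined for a FeatureCollection; anything else is
# treated here as extra metadata. (Assumption for this sketch.)
STANDARD_KEYS = {"type", "features", "bbox"}


def split_feature_collection(doc):
    """Return (features, metadata) for a parsed FeatureCollection dict."""
    features = doc.get("features", [])
    metadata = {k: v for k, v in doc.items() if k not in STANDARD_KEYS}
    return features, metadata


sample = json.loads("""
{
  "type": "FeatureCollection",
  "generator": "overpass-ide",
  "timestamp": "2017-07-17T03:42:02Z",
  "features": [
    {"type": "Feature",
     "properties": {"name": "a"},
     "geometry": {"type": "Point", "coordinates": [1.0, 2.0]}}
  ]
}
""")

features, metadata = split_feature_collection(sample)
print(metadata)  # the members that have no place in a row-per-feature CSV
```

Only `features` maps naturally onto CSV rows; everything collected in `metadata` is what currently has nowhere to go.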

@jpmckinney
Member

It's not clear how to store these in a way that still produces a generic CSV (rather than a CSV that only csvkit knows how to read), but I'll leave the issue open for creative suggestions.

@jayvdb
Contributor Author

jayvdb commented Jul 25, 2017

I suggested embedding it in a column header.

@jpmckinney
Member

I suppose it could be an opt-in flag, as otherwise most users would find it surprising to have extra columns in their CSV output.

@jayvdb
Contributor Author

jayvdb commented Jul 26, 2017

I did not mean an extra column; instead, add it to an existing column header.

Anyway, a better idea would be to read the metadata from the original. Assuming csvkit has a sensible streaming JSON reader, it could read only as much of the original as is needed to capture the metadata at the top, and emit that in the output.
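A rough sketch of that "read only the top" idea, using only the standard library rather than any csvkit reader. The function name `read_leading_metadata` is hypothetical, and the approach rests on two assumptions that GeoJSON does not guarantee: the metadata members appear before `"features"`, and the text `"features"` does not occur inside them.

```python
import io
import json


def read_leading_metadata(stream, chunk_size=4096, marker='"features"'):
    """Collect top-level members that precede the "features" member.

    Reads the stream in chunks and stops as soon as the marker is seen,
    so the (possibly large) feature list is never loaded.
    """
    buffer = ""
    while marker not in buffer:
        chunk = stream.read(chunk_size)
        if not chunk:
            return {}  # reached end of input without finding "features"
        buffer += chunk

    # Keep everything before the marker, e.g. '{"type": "...", "generator": "...",'
    head = buffer[:buffer.index(marker)].rstrip()
    # Drop the trailing comma and close the object so it parses on its own.
    head = head.rstrip(",") + "}"

    doc = json.loads(head)
    doc.pop("type", None)  # "type" is structural, not metadata
    return doc


text = (
    '{"type": "FeatureCollection", "generator": "overpass-ide", '
    '"timestamp": "2017-07-17T03:42:02Z", "features": []}'
)
print(read_leading_metadata(io.StringIO(text)))
```

A real streaming parser would be more robust than this string search; the point is only that the metadata can be captured without reading the features.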

@jpmckinney
Member

Yeah, my challenge is determining where to output it. in2csv with GeoJSON just outputs a row for each feature (as you know). If we put metadata in a special row or in a special header, then that pollutes the data with non-data that other tools won't know how to parse, because storing metadata like that in CSVs is not standardized. And then we'd have to add support to all csvkit tools for parsing that metadata to avoid it interfering with other operations.

@jayvdb
Contributor Author

jayvdb commented Jul 26, 2017

Another, probably better, idea is to add an "append" / "merge" option, so that I can provide a very small .geojson file containing the metadata, and the extra records can then be merged into it (in-place rather than stdout would be my preference).

Then in2csv could also emit the metadata as a small GeoJSON file, containing any data it couldn't put into the CSV.

@jpmckinney
Member

That sounds reasonable. Like a --write-metadata option for in2csv, and a --read-metadata for csvjson.
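For illustration only, the round trip being proposed might look roughly like this. Neither option exists in csvkit as far as this thread shows; `extract_metadata` and `merge_features` are hypothetical helpers standing in for what `--write-metadata` and `--read-metadata` could do, with the sidecar being a FeatureCollection that has no features.

```python
import json


def extract_metadata(collection):
    """What a --write-metadata step might save: everything except the features."""
    return {k: v for k, v in collection.items() if k != "features"}


def merge_features(metadata, features):
    """What a --read-metadata step might do: put features back into the sidecar."""
    merged = dict(metadata)
    merged.setdefault("type", "FeatureCollection")
    merged["features"] = list(features)
    return merged


original = {
    "type": "FeatureCollection",
    "generator": "overpass-ide",
    "timestamp": "2017-07-17T03:42:02Z",
    "features": [
        {"type": "Feature",
         "properties": {"name": "a"},
         "geometry": {"type": "Point", "coordinates": [1.0, 2.0]}},
    ],
}

sidecar = extract_metadata(original)                      # small metadata-only document
rebuilt = merge_features(sidecar, original["features"])   # features would come from the CSV
print(json.dumps(sidecar, indent=2))
```

Keeping the metadata in a separate file leaves the CSV itself generic, which is the concern raised earlier in the thread.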

@jpmckinney jpmckinney modified the milestone: 1.0.3 Jan 28, 2018