BUG/ENH: modernize cosmic_gps methods #1

Open
1 of 3 tasks
jklenzing opened this issue Jun 9, 2020 · 13 comments · Fixed by #18
Labels
breaking change; bug (Something isn't working); enhancement (New feature or request)
Milestone

Comments

@jklenzing
Member

jklenzing commented Jun 9, 2020

The current structure of the cosmic_gps instrument needs to be updated for the pysat 3.0.0 release. While working on bugs related to pysat/pysat#337, a number of areas for improvement were noted, including:

  • Multiple spacecraft and antennas lead to possible "duplicate" timestamps. Enforcing unique times on the data requires adding fractional seconds to the data timestamps. While this does not affect some analysis methods (e.g., climatology), a better long-term solution is to move to xarray.
  • This also affects the file lists. The current system solves this by adding fractional seconds (<0.01 s in total) to the timestamps in the file lists as well as in the data. Because "duplicate" files are effectively allowed, a user could download multiple versions of a data file over the lifetime of the mission and wind up loading all of them, including old versions of the same file.
  • Downloading and storing files using the 'YYYY.DDD' format makes the parsing less clean. There's currently a workaround in pysat#439 ("BUG: recognize string variables in parse_delimited_files"), but this could be cleaned up with a breaking change.
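For illustration, the fractional-second workaround described in the first two bullets can be sketched as below. This is a minimal sketch using only the standard library, not the code in cosmic_gps; the function name `make_times_unique` and the one-microsecond step are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def make_times_unique(times, step=timedelta(microseconds=1)):
    """Return a copy of `times` where repeated values are nudged apart.

    The n-th repeat of a timestamp (n = 0, 1, 2, ...) is shifted by n * step.
    """
    seen = defaultdict(int)  # how often each timestamp has appeared so far
    unique = []
    for time in times:
        unique.append(time + seen[time] * step)
        seen[time] += 1
    return unique


# Two profiles share the same nominal time, e.g. from different antennas
times = [
    datetime(2014, 5, 20, 0, 5),
    datetime(2014, 5, 20, 0, 5),
    datetime(2014, 5, 20, 0, 7),
]
shifted = make_times_unique(times)
```

Note that an offset assigned by order of appearance, as in this sketch, changes whenever the set of files changes, which is one reason old and new versions of the same file can end up looking like distinct entries.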

Proposed updates:

  • Handle the data using xarray
  • Expand the fname string to include version numbers and potentially other keywords. May require updates in the core instrument object (breaking)
  • Improve the directory structure by letting pysat handle the day subdirectories (breaking)

ref: https://cdaac-www.cosmic.ucar.edu/cdaac/cgi_bin/fileFormats.cgi?type=ionPrf
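As a rough sketch of why a version keyword in the file name helps, the snippet below keeps only the newest version of each file. The file name layout in `PATTERN`, the group names, and the helper `latest_versions` are assumptions made for this example; they are not the pysat format string or the CDAAC specification.

```python
import re

# Assumed layout for illustration only:
# <product>_<sat>.<year>.<doy>.<hour>.<minute>.<gps>_<version>_nc
PATTERN = re.compile(
    r"(?P<product>\w+?)_(?P<sat>C\d{3})\.(?P<year>\d{4})\.(?P<doy>\d{3})"
    r"\.(?P<hour>\d{2})\.(?P<minute>\d{2})\.(?P<gps>G\d{2})"
    r"_(?P<version>\d{4}\.\d{4})_nc"
)


def latest_versions(fnames):
    """Keep one file name per observation, choosing the highest version."""
    newest = {}
    for fname in fnames:
        match = PATTERN.fullmatch(fname)
        if match is None:
            continue  # skip names that do not follow the assumed layout
        fields = match.groupdict()
        # Compare versions numerically, piece by piece
        version = tuple(int(part) for part in fields.pop("version").split("."))
        # Everything except the version identifies the observation
        key = tuple(sorted(fields.items()))
        if key not in newest or version > newest[key][0]:
            newest[key] = (version, fname)
    return sorted(fname for _, fname in newest.values())


files = [
    "ionPrf_C001.2014.140.00.05.G14_2010.2640_nc",
    "ionPrf_C001.2014.140.00.05.G14_2013.3520_nc",
    "ionPrf_C002.2014.140.00.05.G14_2013.3520_nc",
    "notes.txt",
]
selected = latest_versions(files)
```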

@jklenzing changed the title from "BUG: cosmic_gps updates" to "BUG/ENH: modernize cosmic_gps methods" on Jun 9, 2020
@jklenzing
Member Author

jklenzing commented Aug 6, 2020

A possible route forward for when pysatCDAAC is created:

  • Add another dataset (GRACE?) that does not require a username / password
  • Generalize download, list_files, etc. so that these routines can be tested

EDIT: created #3 for discussion of generalized method functions.

@aburrell
Member

GRACE is a useful data set; I think this is a good idea.

@rstoneback
Collaborator

rstoneback commented Aug 10, 2020 via email

@aburrell
Member

aburrell commented Aug 10, 2020

That's a bad idea. I'd hate for pysat to get a bad rep because CDAAC got swamped after someone stole our account details. Unless there's a way to do it silently.

@jklenzing transferred this issue from pysat/pysat on Aug 13, 2020
@jklenzing added the bug and enhancement labels on Aug 18, 2020
@rstoneback
Collaborator

Download method has been updated for the public option and the subdirectories were updated to year/day in #17.

@rstoneback
Collaborator

Version information added to the COSMIC file format string in #17.

@rstoneback linked a pull request on Apr 15, 2021 that will close this issue
@rstoneback
Collaborator

xarray support added in #18.

@rstoneback
Collaborator

Version information added to the COSMIC file format string in #17.

The code over in #17 has been improved since the first note. Now, time mangling depends entirely upon invariant parsed information from filenames, so newer versions of a file will get the same time offset and be recognized as a newer version.
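One way to picture an offset that depends only on invariant file name information is sketched below. The formula, the function name `file_time`, and the choice of satellite and GPS identifiers as inputs are assumptions for illustration; the actual logic lives in #17 and may differ.

```python
from datetime import datetime, timedelta


def file_time(year, doy, hour, minute, sat_id, gps_id):
    """Build a file timestamp with a deterministic sub-0.01 s offset.

    The offset uses only fields that are identical across versions of the
    same file, so every version maps to the same timestamp.
    """
    base = datetime(year, 1, 1) + timedelta(days=doy - 1, hours=hour,
                                            minutes=minute)
    # Hypothetical formula: stays below 10000 microseconds (0.01 s) for
    # two-digit identifiers
    offset = timedelta(microseconds=100 * sat_id + gps_id)
    return base + offset


stamp = file_time(2014, 140, 0, 5, sat_id=1, gps_id=14)
other_sat = file_time(2014, 140, 0, 5, sat_id=2, gps_id=14)
```

Because the version number never enters the calculation, two versions of one file collide on the same time and can then be compared by version alone.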

@rstoneback
Collaborator

Altitude_binning expanded across file types in #19.

@jklenzing
Member Author

Other options to consider:

  • Rename the platform to cosmic1 for consistency with the categories on the CDAAC website
  • Add inst_id options to separate postProc from reprocess2013 data. Currently, the code follows the website recommendation that the reprocess data is preferentially selected if available.

@rstoneback
Collaborator

There is a user-friendly aspect to defaulting to the latest data. It would be easier on the development side if the files were more consistent.

I don't think I've tried this, but we could potentially make an Inception-like Instrument, or a Russian nesting doll Instrument. The idea would be three Instruments: one each for the old and new datasets, and a third that internally uses functions from each to get the job done. This option gives everyone what they want: users can select either the old or new dataset, or they can get the Instrument that auto-selects whatever is most recent. Just a brainstorming thought at this point.

@jklenzing
Member Author

We could use three separate inst_id values, one for postProc, one for reprocess, and one that combines the two. In a practical sense, there should not be overlap between days (if memory serves).
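A minimal sketch of the three-way split, assuming hypothetical inst_id names and a per-day file selection helper (neither is taken from the code base):

```python
# Hypothetical inst_id names; the thread does not settle on actual values.
INST_IDS = ("postproc", "reprocess", "combined")


def files_for_day(inst_id, postproc_files, reprocess_files):
    """Pick the file list for one day according to the requested inst_id."""
    if inst_id == "postproc":
        return list(postproc_files)
    if inst_id == "reprocess":
        return list(reprocess_files)
    if inst_id == "combined":
        # Prefer the reprocessed data when it exists for this day
        return list(reprocess_files) if reprocess_files else list(postproc_files)
    raise ValueError(f"unknown inst_id: {inst_id!r}")
```

If the two datasets never overlap in days, the "combined" branch reduces to returning whichever list is non-empty.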

@rstoneback
Collaborator

Sure. It's all a question of how big a file we want. More organizational than technical.

@jklenzing added this to the 0.1.0 Release milestone on Jul 5, 2023
3 participants