combine output from scalar and time series to a single file #321

Open
Tracked by #327
kellijohnson-NOAA opened this issue Oct 7, 2021 · 3 comments
Labels
topic --- results (pertains to generating, reading, or visualizing the results); type --- feature (feature requests and enhancement)

Comments

@kellijohnson-NOAA
Contributor

Current problem

The results for ss3sim were originally split into two files because we thought the time series file was very large. I think R now has more resources for handling large files, so file size should no longer be the limiting factor.

Values in the operating model (OM) can be time varying, but the scalar csv, as currently written, would not capture this. I am not even sure whether the time series results file would show time-varying parameters for the OM. Many times I have needed to combine these two files and have written one-off code to do so.

Proposed solution

I propose that we combine the information in the ts and scalar files into a single results file where every output has a time associated with it. Scalars would either have the same value repeated for all points in time or be assigned a time period of "all" or something similar.
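The layout proposed above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not ss3sim code: the column names (`scenario`, `iteration`, `year`, `quantity`, `value`), the helper `to_long`, and the example values are all hypothetical. It shows the second option, where each scalar gets a time period of "all"; repeating the scalar for every year would be the alternative.

```python
import csv
import io

ID_COLS = ("scenario", "iteration")  # hypothetical identifier columns
TIME_COL = "year"                    # hypothetical time column


def to_long(rows, time_value=None):
    """Reshape wide rows into long rows: one (time, quantity, value) per row.

    If time_value is given, every row gets that label (used for scalars);
    otherwise the time is read from each row's TIME_COL.
    """
    long_rows = []
    for row in rows:
        ids = {col: row[col] for col in ID_COLS}
        time = time_value if time_value is not None else row[TIME_COL]
        for name, value in row.items():
            if name in ID_COLS or name == TIME_COL:
                continue
            long_rows.append(
                {**ids, TIME_COL: time, "quantity": name, "value": value}
            )
    return long_rows


# Made-up example data, not real ss3sim output.
scalar_rows = [{"scenario": "A", "iteration": 1, "param_x": 0.2}]
ts_rows = [
    {"scenario": "A", "iteration": 1, "year": 1, "biomass": 1000.0},
    {"scenario": "A", "iteration": 1, "year": 2, "biomass": 950.0},
]

# Scalars are labelled "all"; time series rows keep their own year.
combined = to_long(scalar_rows, time_value="all") + to_long(ts_rows)

# Write the single combined table as csv (to a string here, for illustration).
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=[*ID_COLS, TIME_COL, "quantity", "value"]
)
writer.writeheader()
writer.writerows(combined)
csv_text = buffer.getvalue()
```

One consequence of this long layout is that the time column mixes numeric years with the label "all", so readers of the file would need to treat it as text or filter the scalar rows first.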

Potential problems

  • large files
  • longer simulation times because one file might be larger than the same information split across two files
  • confusion amongst users regarding the change
@k-doering-NOAA
Contributor

Ah, this makes sense. There is also a third file, dq (for derived quantities); perhaps this should also be added?

Another idea would be to let users choose whether or not to output results as .csv; the results could instead be output as .Rdata, or the R objects could simply be created. I'm not sure whether the latter two options would save time.

@kellijohnson-NOAA
Contributor Author

Yes to needing the output from derived quantities as well, thanks for the reminder.

Regarding csv versus Rdata, I am not sure which is faster or which is the best of all the available options. I do like being able to open the csv and see the results without the extra work of saving an object to disk myself; I think I would be too lazy to save the file. I think we could save time by not exporting files for every folder, although that would make users re-read the results for all folders every time. Then we could put more emphasis on making SS_output faster. It would also decrease the number of files in the simulation and help make old results more compatible with new results: there would be no old results files to support because they would never have been saved.

@k-doering-NOAA
Contributor

I support the decision not to write out csv files for each scenario; it really is just duplicate info.

I also like that .csvs are created automatically (and that you can open them without reading them into R), although I'm not sure everyone would agree. I think .Rdata files are smaller than .csvs, but I have never tested this myself.

@kellijohnson-NOAA kellijohnson-NOAA added the topic --- results pertains to generating, reading, or visualizing the results label Oct 16, 2021
@kellijohnson-NOAA kellijohnson-NOAA added type --- feature feature requests and enhancement and removed type: enhancement labels Dec 3, 2021
@k-doering-NOAA k-doering-NOAA self-assigned this Feb 17, 2022
@k-doering-NOAA k-doering-NOAA removed their assignment Jul 29, 2022