Feat: Add integrated surface database #871

Open
marvingabler opened this issue Feb 14, 2023 · 13 comments

Comments

@marvingabler

I was wondering whether there are plans to integrate the ISD data into Wetterdienst. It is freely accessible and, like GHCN, provides data on a global scale, but at hourly resolution. The data is also hosted on S3, in both raw format and CSV. If the CSVs were converted correctly (I haven't checked yet), this could be a quick integration. For the raw format, there are some parsers (e.g. ish_parser) available.
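
To give a feel for what checking the CSV variant could involve, here is a minimal sketch using only the Python standard library. It assumes the CSV files carry `STATION`, `DATE` and `TMP` columns, that `TMP` is encoded as "scaled value, quality code" with a scaling factor of 10, and that `+9999` marks a missing value. These are assumptions to verify against the official ISD format documentation. The sample rows and the station identifier are made up for illustration, and `decode_tmp` and `read_temperatures` are hypothetical helper names.

```python
import csv
import io
from typing import List, Optional, Tuple

# Synthetic sample rows (not real observations). The column names and the
# "value,quality" encoding of TMP are assumptions about the ISD CSV layout.
SAMPLE_CSV = '''"STATION","DATE","NAME","TMP"
"00000099999","2023-02-14T00:00:00","EXAMPLE STATION, XX","-0042,1"
"00000099999","2023-02-14T01:00:00","EXAMPLE STATION, XX","+9999,9"
'''


def decode_tmp(raw: str) -> Optional[float]:
    """Decode a TMP field such as '-0042,1' into degrees Celsius."""
    value, _quality = raw.split(",")
    if value == "+9999":  # assumed missing-value sentinel
        return None
    return int(value) / 10.0  # assumed scaling factor of 10


def read_temperatures(text: str) -> List[Tuple[str, Optional[float]]]:
    """Return (timestamp, temperature) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["DATE"], decode_tmp(row["TMP"])) for row in reader]


records = read_temperatures(SAMPLE_CSV)
```

If the real files follow this layout, the same function could be pointed at a downloaded station file instead of the inline sample.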

Dataset:
https://www.ncei.noaa.gov/products/land-based-station/integrated-surface-database

Extent:
Global

Temporal resolution:
Hourly

Publish delay:
1-2 days

Checking in after some time, it is great to see how fast this package is evolving!

@gutzbenj
Member

Dear @marvingabler ,

thanks for the hint (and also the kind words)! I'll check out the data on the weekend and see if we can get it into wetterdienst quickly.

@gutzbenj
Member

gutzbenj commented Feb 19, 2023

I just looked at the ish_parser and I think we'll have to add some adaptations there to get it nicely done.
Things to consider:

  • currently in the issues it was discussed to make the parser faster but ideas weren't yet applied
  • parsed data is stored in class attributes instead of dataclasses/dictionaries so access is quite limited
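
As a rough illustration of the second point, the sketch below shows what dataclass-based records could look like and how they convert to plain dictionaries. The `Observation` class, its field names and the `to_rows` helper are hypothetical and not part of ish_parser; the values are made up.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """Hypothetical record type; field names are illustrative only."""

    station_id: str
    timestamp: str
    air_temperature_c: Optional[float]
    wind_speed_ms: Optional[float]


def to_rows(observations: List[Observation]) -> List[dict]:
    """Turn records into plain dictionaries, e.g. for a DataFrame constructor."""
    return [asdict(obs) for obs in observations]


# Made-up example values, not real measurements.
observations = [
    Observation("00000099999", "2023-02-19T12:00:00", 3.5, 4.1),
    Observation("00000099999", "2023-02-19T13:00:00", None, 3.6),
]
rows = to_rows(observations)
```

With records shaped like this, every field is discoverable through `dataclasses.fields` or `asdict`, which makes bulk conversion straightforward.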

@marvingabler
Author

Good points! We will probably need the ISD data in a few weeks; happy to pick this up if it's still open then.

@amotl
Member

amotl commented Feb 27, 2023

Dear Marvin,

thanks for your suggestion.

For the raw format, there are some parsers (e.g. ish_parser) available.

@gutzbenj said:

Currently in the issues it was discussed to make the parser faster but ideas weren't yet applied.

The issue @gutzbenj is referring to is haydenth/ish_parser#20 by @vtoupet. On the matter of vectorized processing, I discovered pyisd by @gadomski, which is based on pandas -- cheers! I have not evaluated it yet, but one of us should, and report back.

Not sure if isd-s3 helps, I believe the acquisition part will eventually be implemented by Wetterdienst.

With kind regards,
Andreas.

@amotl
Member

amotl commented Feb 27, 2023

On the other hand, maybe @gadomski knows of any public sources which make ISD data available in other formats or through modern technologies like STAC or Zarr?

@gadomski

Tl;dr: no, sorry

We were working on including it in the Planetary Computer but that work never made it over the finish line. I could build it into partitioned geoparquet (you can see some WIP notebook scribbles here: https://github.com/gadomski/chalkboard/blob/main/notebooks/isd-demo.ipynb), which worked ok, but keeping those partitions up-to-date with new data proved to be a bigger lift than it was worth, at the time.
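
To illustrate why keeping partitions current takes effort, here is a small sketch. It assumes a hypothetical partitioning by (year, month), which is not necessarily the scheme used in the notebook, and shows that each incoming batch maps to a set of partitions that would have to be rewritten or appended to. The function name and the timestamps are made up.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

# (year, month): a hypothetical partitioning scheme, chosen for illustration.
PartitionKey = Tuple[int, int]


def partitions_to_rewrite(timestamps: Iterable[str]) -> Dict[PartitionKey, List[str]]:
    """Group incoming ISO timestamps by the partition they would land in."""
    touched = defaultdict(list)
    for ts in timestamps:
        moment = datetime.fromisoformat(ts)
        touched[(moment.year, moment.month)].append(ts)
    return dict(touched)


# A made-up batch that straddles a month boundary touches two partitions.
new_batch = ["2023-01-31T23:00:00", "2023-02-01T00:00:00", "2023-02-01T01:00:00"]
touched = partitions_to_rewrite(new_batch)
```

With data arriving continuously and late corrections possible, this bookkeeping has to run for every update.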

@marvingabler
Author

That's a good point! Are you aware of any sources that provide open weather observations in such an easy-to-consume format (like meteostat, but with STAC)? We recently had a chat at Jua about how awesome it would be if all open weather data were available in STAC. There are plenty of data sources that are not yet easily accessible via EE/Planetary Computer.

@gadomski

https://stacindex.org is a decent reference for available STAC catalogs and tools, so you could look there. Also, the STAC Gitter can be a good place to ask as well. I personally don't know of much, but I'm not especially tuned in to the open weather community -- I'm more from the software+STAC world.

This is what's currently in the weather+climate tag on the Planetary Computer: https://planetarycomputer.microsoft.com/catalog#Climate/Weather

@amotl
Member

amotl commented Feb 27, 2023

@marvingabler: You mean like Open-Meteo's Historical Weather API (blog article), but actually based on API/format standards, and not limited to ERA5?

-- https://github.com/open-meteo

@gutzbenj
Member

Dear @gadomski ,

I'm currently considering integrating NOAA ISD into wetterdienst using your excellent work on pyisd. On that note, would you be open to taking PRs to polish the library a bit using tools like poetry etc., and to reconsidering/restructuring parts of the library?

Sincerely,
Benjamin

@gadomski

On that note, would you be open to taking PRs to polish the library a bit using tools like poetry etc., and to reconsidering/restructuring parts of the library?

PRs are always welcome. With respect to switching to poetry specifically, I'd be interested to see a justification -- I generally don't use poetry as I don't find that I need it. But any bug fixes or feature improvements would be quite welcome.

@gutzbenj
Member

gutzbenj commented Mar 2, 2024

Dear @marvingabler ,

ISD finally made it into wetterdienst in the form of NOAA GHCN-h / GHCN hourly [1], which was released earlier this year in a first version and assembles the exact same data in a far more conveniently accessible way. Please give it a try! I went through it only once to get a fast release out, but everything should be working for now :)

[1] https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C01688

@gutzbenj
Member

gutzbenj commented Mar 3, 2024

Just figured out that it currently only covers US stations -.- and that there's an issue with the date parsing.
