lcd: how to easily obtain entire station ids? #380

Open
sckott opened this issue Nov 25, 2020 · 3 comments

@sckott
Contributor

sckott commented Nov 25, 2020

The station IDs passed to lcd() are the IDs of the files we access from NOAA; see, e.g., the files for 2020 at https://www.ncei.noaa.gov/data/local-climatological-data/access/2020/. Each station ID is the USAF (Air Force station ID) code followed by the WBAN (Weather-Bureau-Army-Navy number) code. ncdc_stations() returns only the WBAN part of the code.

How do we make it possible to search for, and retrieve, the full station codes that lcd() requires?
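
To illustrate the ID format described above, here is a minimal sketch that joins a USAF code and a WBAN code into one string. The helper name `make_lcd_id()` is hypothetical (it is not part of rnoaa), and the field widths (6 characters for USAF, 5 for WBAN) are an assumption based on the 11-character file names in the NOAA directory.

```r
# Hypothetical helper, not part of rnoaa: build an LCD-style station ID
# from a USAF code and a WBAN code.
# Assumption: USAF is 6 characters and WBAN is 5, left-padded with zeros.
make_lcd_id <- function(usaf, wban) {
  pad <- function(x, width) {
    x <- as.character(x)
    # Prepend as many zeros as needed to reach the target width
    paste0(strrep("0", pmax(0, width - nchar(x))), x)
  }
  paste0(pad(usaf, 6), pad(wban, 5))
}

id <- make_lcd_id("13380", "99999")
print(id)
```

The codes are handled as character strings rather than numbers so that leading zeros are not lost. The resulting string is what would be passed as the station argument of lcd().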

@ecoflo

ecoflo commented Apr 12, 2021

Have you considered the following?
stations <- ghcnd_stations(refresh = TRUE)
us.stations <- stations[grep("US", substr(stations$id, start = 1, stop = 2)),]

@sckott
Contributor Author

sckott commented May 3, 2021

@ecoflo Can you show an example of how to get an ID lcd() wants from your example code?

@lpiep

lpiep commented Nov 22, 2021

I've been looking at this issue because I'd like to use this package to download all US LCD data on a schedule. The LCD documentation indicates that its data is pulled from the Integrated Surface Database (ISD). I looked at the historical ISD data set (ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-history.csv), and it contains almost all of the stations in the LCD that aren't missing location data (and includes both the USAF and WBAN ids). Only two stations with complete information in the LCD didn't appear in the ISD data set (one in Massachusetts, one in Florida).

However, as of today, there are 5,800 stations in the ISD data that don't appear in the LCD data. I haven't been able to find a good way of figuring out which stations listed in ISD won't appear in LCD, so I'm not sure this data set is what we want for listing all available LCD stations (though it may still be helpful for adding the USAF ID to the output of ncdc_stations()).
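
As a rough sketch of the ISD approach described above, the code below builds candidate LCD IDs from a table laid out like isd-history.csv. The column names (`USAF`, `WBAN`, `CTRY`) are assumed from that file's layout, the helper `candidate_lcd_ids()` is hypothetical, and the rows are made-up placeholders rather than an extract from NOAA. Given the caveat above, the result is only a list of candidates; some of them may have no LCD file.

```r
# Placeholder rows in the assumed layout of isd-history.csv
# (illustration only, not verified station records).
sample_csv <- 'USAF,WBAN,CTRY,STATE
"724060","93721","US","MD"
"013380","99999","NO",""
"720110","00208","US","TX"'

# Read every column as character so leading zeros in the codes survive
isd <- read.csv(text = sample_csv, colClasses = "character")

# Hypothetical helper: keep one country and concatenate USAF + WBAN
candidate_lcd_ids <- function(isd, country = "US") {
  keep <- isd$CTRY == country
  paste0(isd$USAF[keep], isd$WBAN[keep])
}

ids <- candidate_lcd_ids(isd)
print(ids)
```

With the real file, `read.csv()` would be pointed at a downloaded copy of isd-history.csv instead of the inline text.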

I think we may have to build an inventory by actually reading through all available LCD files. That would take some time to run, but the data set could be built when the package is updated and included as an internal data set rather than being pulled from NOAA by the user.
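
One possible starting point for such an inventory is to parse station IDs out of the per-year directory listings instead of opening every file. The sketch below assumes the listing is an HTML index whose links look like `href="<11-character id>.csv"`; that pattern, the helper name `extract_lcd_ids()`, and the sample lines are all assumptions made for illustration.

```r
# Hypothetical helper: pull 11-character station IDs out of the lines of
# an HTML directory listing. Assumes links of the form href="<id>.csv".
extract_lcd_ids <- function(listing_html) {
  pattern <- 'href="[A-Za-z0-9]{11}\\.csv"'
  hits <- unlist(regmatches(listing_html, gregexpr(pattern, listing_html)))
  # Strip the surrounding href="..." and the .csv extension
  unique(sub('^href="([A-Za-z0-9]{11})\\.csv"$', "\\1", hits))
}

# Made-up listing lines for demonstration
sample_listing <- c(
  '<a href="01001099999.csv">01001099999.csv</a>',
  '<a href="72406093721.csv">72406093721.csv</a>',
  '<a href="../">Parent Directory</a>'
)

ids <- extract_lcd_ids(sample_listing)
print(ids)
```

In practice the input could come from something like `readLines()` on each year's directory URL, with the per-year results combined into the internal data set when the package is built.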

Let me know if I can help!


3 participants