Refresh cached metadata #95

pmav99 · 2023-07-24T09:35:08Z

In IOC, COOPS and USGS we are caching the retrieved metadata. This is really useful for e.g. running the tests, but it can be problematic for long-running processes (in the range of days/weeks/months). The first call caches the metadata and, currently, there is no easy way to update it.

I was thinking that we should add an extra argument to the get_*_stations() functions, something like refresh_cache: bool = False. This way we keep the existing behavior, and anyone who needs to refresh the cache will be able to do so.

As far as the actual implementation goes, we would need something like this: https://stackoverflow.com/a/37654201/592289

pinging @brey @SorooshMani-NOAA

brey · 2023-07-24T12:10:50Z

I like the idea. We need to establish a threshold for the refresh to kick in. Ideally, this should be internal and not visible to the user, although a warning/info comment might be required for transparency.

Maybe we also need to document how users should persist the metadata when using searvey, if that is required.

SorooshMani-NOAA · 2023-07-24T12:20:30Z

Having the ability to reset helps. Ideally this should be available as an automatic operation (e.g. per day/hour/etc.) for non-developer users and as a manual reset for others. We already know that calling cache_clear covers the manual part, but for the automatic part this is an interesting idea:
https://stackoverflow.com/questions/31771286/python-in-memory-cache-with-time-to-live

There's also this package:
https://cachetools.readthedocs.io/en/latest/
Although maybe let's think twice before adding more dependencies
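For reference, a standard-library-only sketch of a time-to-live cache, assuming the "time bucket" trick on top of functools.lru_cache: an extra argument that changes every ttl_seconds forces a cache miss. The decorator name ttl_lru_cache and its timer parameter are hypothetical, not part of searvey or cachetools.

```python
import functools
import time


def ttl_lru_cache(ttl_seconds: float, maxsize: int = 128, timer=time.monotonic):
    """lru_cache variant whose entries expire when the time bucket changes."""

    def decorator(func):
        @functools.lru_cache(maxsize=maxsize)
        def cached(_bucket, *args, **kwargs):
            # _bucket is only part of the cache key; func never sees it.
            return func(*args, **kwargs)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The bucket number changes once every ttl_seconds.
            bucket = int(timer() // ttl_seconds)
            return cached(bucket, *args, **kwargs)

        # Keep the manual reset available as well.
        wrapper.cache_clear = cached.cache_clear
        return wrapper

    return decorator
```

Note that entries expire at bucket boundaries rather than exactly ttl_seconds after they were stored, so a value can be refreshed sooner than the nominal TTL. For metadata that changes rarely, that imprecision is probably acceptable.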

pmav99 · 2023-07-24T14:02:30Z

WRT persisting searvey's metadata, we are using standard (geo)pandas objects, so I don't think we need to provide a specific API for this. Adding a note in the docs and/or an example in the notebooks wouldn't be a bad idea though.

I didn't think of automatically invalidating the cache after some time, but I agree it is a good idea, and that SO answer seems to provide a rather elegant way of doing so without introducing any 3rd party dependencies. WRT adding a runtime warning, I am -1 to be honest. We should certainly document it, but a warning each time you call a function seems to be too much. Moreover, you would get 3 warnings when you call searvey.get_stations(), etc.
