ESM-VFC/intake_zenodo_fetcher

Intake Zenodo Fetcher: Fetch data from Zenodo based on Intake catalog entries

Why?

To improve the reproducibility of workflows that use data archived on Zenodo, intake_zenodo_fetcher simplifies downloading a local copy of the data belonging to a Zenodo DOI. It offers functions to fetch all data belonging to a Zenodo DOI, and it can restrict the download to just those parts of the Zenodo record that belong to an entry in an Intake catalog.

See examples/fetch_based_on_zenodo_doi.ipynb for a demo on pre-fetching the data based on a Zenodo DOI or try it live on mybinder.org.

Installation

Install with:

python -m pip install git+https://github.com/ESM-VFC/intake_zenodo_fetcher.git

Usage

intake_zenodo_fetcher populates the storage location indicated in the Intake catalog with data from Zenodo, provided the catalog entries have a metadata.zenodo_doi key.
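As a rough illustration of what "having a metadata.zenodo_doi key" means, the sketch below works on a plain dictionary that mirrors the `sources` section of a catalog. It is not the package's implementation: the helper names `entries_with_zenodo_doi` and `record_url_from_doi` and the `local_only` entry are hypothetical, and the DOI-to-URL mapping is only inferred from the example DOI and record URL used in this README.

```python
def entries_with_zenodo_doi(sources):
    """Map entry name to DOI for every entry that defines metadata.zenodo_doi."""
    return {
        name: spec["metadata"]["zenodo_doi"]
        for name, spec in sources.items()
        if "zenodo_doi" in (spec.get("metadata") or {})
    }


def record_url_from_doi(doi):
    """Build a record URL from a DOI of the form 10.5281/zenodo.<id> (assumed layout)."""
    prefix = "10.5281/zenodo."
    if not doi.startswith(prefix):
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    return "https://zenodo.org/record/" + doi[len(prefix):]


# Plain-dict stand-in for the parsed "sources" section of a catalog.
sources = {
    "FESOM2_sample": {
        "driver": "netcdf",
        "metadata": {"zenodo_doi": "10.5281/zenodo.3865567"},
    },
    # Hypothetical entry without a DOI; it would be left alone.
    "local_only": {"driver": "netcdf", "metadata": {}},
}

dois = entries_with_zenodo_doi(sources)
urls = {name: record_url_from_doi(doi) for name, doi in dois.items()}
print(urls)
```

Only entries that carry a DOI in their metadata are candidates for fetching; everything else in the catalog is ignored by this sketch.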

For the following example entry "FESOM2_sample" pointing to https://zenodo.org/record/3865567

metadata:
  version: 1

plugins:
  source:
      - module: intake_xarray

sources:
  FESOM2_sample:
    driver: netcdf
    description: 'FESOM2 pi mesh Sample dataset'
    metadata:
      zenodo_doi: "10.5281/zenodo.3865567"
    args:
      urlpath: "{{ CATALOG_DIR }}/FESOM2_PI_MESH/*.fesom.1948.nc"
      xarray_kwargs:
        decode_cf: False
        combine: 'by_coords'

running

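import intake
import intake_zenodo_fetcher
cat = intake.open_catalog("catalog.yaml")  # placeholder path to the catalog above
intake_zenodo_fetcher.download_zenodo_files_for_entry(
    cat['FESOM2_sample']
)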

will make sure all files matching the glob pattern "*.fesom.1948.nc" are downloaded to ./FESOM2_PI_MESH/ in the directory that also contains the catalog.
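The effect of the glob pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the package's actual code: `select_matching_files` is a hypothetical helper, and the file listing is invented rather than taken from the real Zenodo record.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def select_matching_files(urlpath, available_files):
    """Return the names in available_files that match the file-name glob of urlpath."""
    # Only the last path component carries the glob, e.g. "*.fesom.1948.nc".
    pattern = PurePosixPath(urlpath).name
    return [name for name in available_files if fnmatchcase(name, pattern)]


# Invented file listing, for illustration only.
record_files = [
    "a.fesom.1948.nc",
    "temp.fesom.1948.nc",
    "a.fesom.1949.nc",
    "README.md",
]
urlpath = "{{ CATALOG_DIR }}/FESOM2_PI_MESH/*.fesom.1948.nc"

selected = select_matching_files(urlpath, record_files)
print(selected)
```

In this sketch, files from other years and unrelated files are skipped, which mirrors the idea of restricting the download to the part of the record that the catalog entry refers to.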