BUG: improved error message for badly downloaded file #1187

Open
aburrell opened this issue Mar 15, 2024 · 0 comments

Describe the bug
If a netCDF file is only partially downloaded or has an incorrect format, the resulting error message makes it look as if xarray is installed incorrectly. This is confusing and time-consuming for the person dealing with the error.

To Reproduce
Steps to reproduce the behavior:

  1. Obtain a badly downloaded netCDF file (for example, a partial download or a file with an incorrect format)
  2. Give it an appropriate pysat name for an instrument
  3. Attempt to load the file
  4. See error
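One way to get a bad file for step 1 without relying on a flaky download is to write non-netCDF content under the expected file name. The sketch below assumes the bad file is a server error page saved in place of the data; the helper name `write_fake_netcdf` and the payload are hypothetical, and whether this triggers exactly the same xarray message as in the traceback has not been checked here.

```python
from pathlib import Path


def write_fake_netcdf(path):
    """Write a payload to `path` that is not valid netCDF (hypothetical helper)."""
    path = Path(path)

    # Assumed payload: an HTML error page saved instead of the data file
    path.write_bytes(b"<html><body>404 Not Found</body></html>\n")
    return path
```

The resulting file can then be renamed to match the instrument's file naming pattern and loaded as in steps 2 and 3.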

Expected behavior
A warning that points to a possibly corrupted file instead of implying that xarray doesn't have the correct backend.

Screenshots

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 guvi.load(date=dt.datetime(2006, 3, 1))

File ~/Programs/Git/pysat/pysat/_instrument.py:3125, in Instrument.load(self, yr, doy, end_yr, end_doy, date, end_date, fname, stop_fname, verifyPad, **kwargs)
   3122     # Using current date or fid
   3123     self._prev_data, self._prev_meta = self._load_prev(
   3124         load_kwargs=kwargs)
-> 3125     self._curr_data, self._curr_meta = self._load_data(
   3126         date=self.date, fid=self._fid, inc=self.load_step,
   3127         load_kwargs=kwargs)
   3128     self._next_data, self._next_meta = self._load_next(
   3129         load_kwargs=kwargs)
   3130 else:

File ~/Programs/Git/pysat/pysat/_instrument.py:1659, in Instrument._load_data(self, date, fid, inc, load_kwargs)
   1657 load_fname = [os.path.join(self.files.data_path, f) for f in fname]
   1658 try:
-> 1659     data, mdata = self._load_rtn(load_fname, tag=self.tag,
   1660                                  inst_id=self.inst_id,
   1661                                  **load_kwargs)
   1663     # Ensure units and name are named consistently in new Meta
   1664     # object as specified by user upon Instrument instantiation
   1665     mdata.accept_default_labels(self.meta)

File ~/Programs/Git/pysatNASA/pysatNASA/instruments/timed_guvi.py:299, in load(fnames, tag, inst_id, combine_times)
    295     data, meta = jhuapl.load_edr_aurora(fnames, tag, inst_id,
    296                                         pandas_format=pandas_format,
    297                                         strict_dim_check=False)
    298 else:
--> 299     data, meta = jhuapl.load_sdr_aurora(fnames, name, tag, inst_id,
    300                                         pandas_format=pandas_format,
    301                                         strict_dim_check=False,
    302                                         combine_times=combine_times)
    304 return data, meta

File ~/Programs/Git/pysatNASA/pysatNASA/instruments/methods/jhuapl.py:257, in load_sdr_aurora(fnames, name, tag, inst_id, pandas_format, strict_dim_check, combine_times)
    252 inners = None
    253 for fname in fnames:
    254     # There are multiple files per day, with time as a variable rather
    255     # than a dimension or coordinate.  Additionally, no coordinates
    256     # are assigned.
--> 257     sdata, mdata = load_netcdf(fname, epoch_name=load_time, epoch_unit='s',
    258                                meta_kwargs={'labels': labels},
    259                                pandas_format=pandas_format,
    260                                decode_times=False,
    261                                strict_dim_check=strict_dim_check)
    263     # Calculate the time for this data file. The pysat `load_netcdf` routine
    264     # converts the 'TIME' parameter (seconds of day) into datetime using
    265     # the UNIX epoch as the date offset
    266     ftime = build_dtimes(sdata, '_DAY', dt.datetime(1970, 1, 1))

File ~/Programs/Git/pysat/pysat/utils/io.py:685, in load_netcdf(fnames, strict_meta, file_format, epoch_name, epoch_unit, epoch_origin, pandas_format, decode_timedelta, combine_by_coords, meta_kwargs, meta_processor, meta_translation, drop_meta_labels, decode_times, strict_dim_check)
    675     data, meta = load_netcdf_pandas(fnames, strict_meta=strict_meta,
    676                                     file_format=file_format,
    677                                     epoch_name=epoch_name,
   (...)
    682                                     meta_translation=meta_translation,
    683                                     drop_meta_labels=drop_meta_labels)
    684 else:
--> 685     data, meta = load_netcdf_xarray(fnames, strict_meta=strict_meta,
    686                                     file_format=file_format,
    687                                     epoch_name=epoch_name,
    688                                     epoch_unit=epoch_unit,
    689                                     epoch_origin=epoch_origin,
    690                                     decode_timedelta=decode_timedelta,
    691                                     combine_by_coords=combine_by_coords,
    692                                     meta_kwargs=meta_kwargs,
    693                                     meta_processor=meta_processor,
    694                                     meta_translation=meta_translation,
    695                                     drop_meta_labels=drop_meta_labels,
    696                                     decode_times=decode_times,
    697                                     strict_dim_check=strict_dim_check)
    699 return data, meta

File ~/Programs/Git/pysat/pysat/utils/io.py:1025, in load_netcdf_xarray(fnames, strict_meta, file_format, epoch_name, epoch_unit, epoch_origin, decode_timedelta, combine_by_coords, meta_kwargs, meta_processor, meta_translation, drop_meta_labels, decode_times, strict_dim_check)
   1023 # Load the data differently for single or multiple files
   1024 if len(fnames) == 1:
-> 1025     data = xr.open_dataset(fnames[0], decode_timedelta=decode_timedelta,
   1026                            decode_times=decode_times)
   1027 else:
   1028     data = xr.open_mfdataset(fnames, decode_timedelta=decode_timedelta,
   1029                              decode_times=decode_times, **combine_kw)

File ~/Library/Python/3.8/lib/python/site-packages/xarray/backends/api.py:524, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
    521     kwargs.update(backend_kwargs)
    523 if engine is None:
--> 524     engine = plugins.guess_engine(filename_or_obj)
    526 backend = plugins.get_backend(engine)
    528 decoders = _resolve_decoders_kwargs(
    529     decode_cf,
    530     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    536     decode_coords=decode_coords,
    537 )

File ~/Library/Python/3.8/lib/python/site-packages/xarray/backends/plugins.py:177, in guess_engine(store_spec)
    169 else:
    170     error_msg = (
    171         "found the following matches with the input file in xarray's IO "
    172         f"backends: {compatible_engines}. But their dependencies may not be installed, see:\n"
    173         "https://docs.xarray.dev/en/stable/user-guide/io.html \n"
    174         "https://docs.xarray.dev/en/stable/getting-started-guide/installing.html"
    175     )
--> 177 raise ValueError(error_msg)

ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'h5netcdf', 'scipy']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html

Desktop (please complete the following information):

  • OS: macOS Big Sur
  • Version: Python 3.9 and 3.8
  • Other details about your setup that could be relevant: pysat RC 3.2.0, pysatNASA develop

Recommended solution
Wrap the xarray loading in a try/except that catches ValueError. Keep the original xarray error string, but append a line saying "file 'filename here' may also be corrupted, check file before assuming xarray is not installed correctly.".
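A minimal sketch of that idea follows. It is not the pysat implementation: the wrapper name `open_with_corruption_hint` is hypothetical, and the loader is passed in as a callable so the pattern can be shown without assuming anything about how `load_netcdf_xarray` would be restructured.

```python
def open_with_corruption_hint(opener, fname, **kwargs):
    """Call `opener(fname, **kwargs)`, adding a file hint to any ValueError.

    `opener` is any callable that loads a file, e.g. `xr.open_dataset`.
    This wrapper is a hypothetical helper, not part of pysat or xarray.
    """
    try:
        return opener(fname, **kwargs)
    except ValueError as err:
        # Keep the original xarray message and append the corruption hint
        raise ValueError(
            f"{err}\nfile '{fname}' may also be corrupted, check file "
            "before assuming xarray is not installed correctly."
        ) from err


# Possible use at the single-file call shown in the traceback (assumed):
# data = open_with_corruption_hint(xr.open_dataset, fnames[0],
#                                  decode_timedelta=decode_timedelta,
#                                  decode_times=decode_times)
```

Re-raising with `from err` keeps the original xarray traceback attached, so nothing from the existing message is lost.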

@aburrell aburrell added the bug label Mar 15, 2024
@jklenzing jklenzing added this to the 3.3.0 Release milestone Mar 27, 2024