[Bug]: temporal.group_average daily to monthly assigns wrong time index (and bounds) #586

durack1 · 2024-01-12T14:25:50Z

What happened?

I have been attempting to generate monthly mean from daily data, and expected the process-resolved time value to generate an index that is the middle of the month. The data example is 1 through 31 January 1998 daily files of SST. I would expect for a monthly average over these days [1998-01-01 00:00 ... 1998-01-31 00:00] the new time value represents the mean of the period e.g. [1998-01-16 12:00], whereas I get [1998-01-01 00:00], and no time_bounds assigned

What did you expect to happen? Are there are possible answers you came across?

I expected a new monthly mean value to have the time axis value [1998-01-16 12:00], which also includes time_bounds [1998-01-01 00:00, 1998-02-01 00:00].

Working example included in the Jupyter Notebook

Minimal Complete Verifiable Example (MVCE)

Example calculates same January 1998 monthly average is included in the Jupyter Notebook (above)

Relevant log output

See Notebook above

Anything else we need to know?

ping @taylor13, @gleckler1, @lee1043 who I have been discussing this behaviour with

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.90.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2023.12.0
pandas: 2.1.4
numpy: 1.26.3
scipy: 1.11.4
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: 1.3.7
dask: 2023.12.1
distributed: 2023.12.1
matplotlib: 3.8.2
cartopy: 0.22.0
seaborn: None
numbagg: None
fsspec: 2023.12.2
cupy: None
pint: None
sparse: 0.15.1
flox: None
numpy_groupies: None
setuptools: 69.0.3
pip: 23.3.2
conda: None
pytest: None
mypy: None
IPython: 8.20.0
sphinx: None

The text was updated successfully, but these errors were encountered:

pochedls · 2024-01-12T16:21:35Z

This is essentially the same as #565. PRs welcome.

pochedls · 2024-01-12T17:45:58Z

@durack1 – as a workaround, you can probably use ds.bounds.add_time_bounds() to create bounds after your temporal averaging operation. If the time point is important, you could also average the resulting time bounds and reassign them to ds.time.

taylor13 · 2024-01-12T19:23:51Z

To be sure. Regarding the statement made above: "The data example is 1 through 31 January 1998 daily files of SST. I would expect for a monthly average over these days [1998-01-01 00:00 ... 1998-01-31 00:00]".

I think the monthly average should be for all data representing intervals partially or fully within the bounds [1998-01-01 00:00 and 1998-02-01 00:00], so if the the coordinates values represent the beginning of daily means (rather than the middle of the day), then indeed data values for coordinates intervals associated with the coordinate values [1998-01-01 00:00 ... 1998-01-31 00:00] will be simply averaged together (no weighting required) to yield a monthly mean extending from 1998-01-01 00:00 to 1998-02-01 00:00. If, however, the coordinates represent the middle of each daily interval, then the first daily sample represents an interval from 1997-12-31 12:00 through 1998-01-01 12:00 and the monthly mean will include 32 daily samples (rather than 31) with the first and last samples included but with half the "weight" of the others. There is further discussion of this at #565.

durack1 · 2024-01-12T19:48:08Z

This is essentially the same as #565. PRs welcome.

Agreed, this is effectively a dupe although focuses on daily -> monthly, vs monthly -> annual #565. duplicate label assigned

pochedls · 2024-01-12T19:50:25Z

#565 also seems to be a duplicate of #534, which is related to #529.

durack1 · 2024-01-12T20:15:34Z

#565 also seems to be a duplicate of #534, which is related to #529.

@pochedls it seems in #529 there was an xarray version dependence. Is there a test that would catch issues in the case upstream (xarray) changes impact the temporal_average function updates?

durack1 · 2024-01-12T22:46:44Z

@durack1 – as a workaround, you can probably use ds.bounds.add_time_bounds() to create bounds after your temporal averaging operation. If the time point is important, you could also average the resulting time bounds and reassign them to ds.time.

I had tried this (in the Notebook) but got the error
ValueError: Cannot generate bounds for coordinate variable 'time' which has a length <= 1 (singleton).

The code I attempted was ds.bounds.add_time_bounds(method="freq", freq="month") not sure if this was some noob error

pochedls · 2024-01-12T22:54:13Z

I think it is generally not possible to automatically generate bounds if there is only one time value, but I think it could be done in this case (you are specifying the frequency). This would require revisiting this block of code.

durack1 added the duplicate This issue or pull request already exists label Jan 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: temporal.group_average daily to monthly assigns wrong time index (and bounds) #586

[Bug]: temporal.group_average daily to monthly assigns wrong time index (and bounds) #586

durack1 commented Jan 12, 2024 •

edited

pochedls commented Jan 12, 2024

pochedls commented Jan 12, 2024

taylor13 commented Jan 12, 2024 •

edited by durack1

durack1 commented Jan 12, 2024

pochedls commented Jan 12, 2024

durack1 commented Jan 12, 2024

durack1 commented Jan 12, 2024

pochedls commented Jan 12, 2024

[Bug]: temporal.group_average daily to monthly assigns wrong time index (and bounds) #586

[Bug]: temporal.group_average daily to monthly assigns wrong time index (and bounds) #586

Comments

durack1 commented Jan 12, 2024 • edited

What happened?

What did you expect to happen? Are there are possible answers you came across?

Minimal Complete Verifiable Example (MVCE)

Relevant log output

Anything else we need to know?

Environment

INSTALLED VERSIONS

pochedls commented Jan 12, 2024

pochedls commented Jan 12, 2024

taylor13 commented Jan 12, 2024 • edited by durack1

durack1 commented Jan 12, 2024

pochedls commented Jan 12, 2024

durack1 commented Jan 12, 2024

durack1 commented Jan 12, 2024

pochedls commented Jan 12, 2024

durack1 commented Jan 12, 2024 •

edited

taylor13 commented Jan 12, 2024 •

edited by durack1