Production deployment tracking #4

Open
cisaacstern opened this issue Aug 31, 2023 · 4 comments

@cisaacstern
Member

Opening this issue as a place to track progress of production deployments on GCP Dataflow. So far:

  • The first deployment failed due to malformed URLs: the make_dates function had some inaccurate assumptions built into it.
  • I fixed that problem in Validate filenames #3
  • The second production deployment began about ten minutes ago. I'll follow up here with updates.
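
The thread does not say which assumptions in make_dates were wrong. As a purely illustrative sketch, the block below shows one common pitfall with 8-day composite products: it assumes (this is not stated in the thread) that the 8-day period counter resets on January 1 of each year, so a fixed 8-day stride from the first date drifts off the real period starts after the first year boundary. The function name `eight_day_starts` is hypothetical and is not the feedstock's actual code.

```python
from datetime import date, timedelta


def eight_day_starts(first: date, last: date) -> list[date]:
    """Start dates of 8-day periods within [first, last].

    ASSUMPTION (not confirmed by the thread): periods restart on
    January 1 of every year, so the last period of a year is short.
    """
    starts = []
    for year in range(first.year, last.year + 1):
        day = date(year, 1, 1)
        # Step through the year in 8-day strides, then restart next year.
        while day.year == year:
            if first <= day <= last:
                starts.append(day)
            day += timedelta(days=8)
    return starts


def fixed_stride_starts(first: date, last: date) -> list[date]:
    """Naive alternative: a constant 8-day stride with no yearly reset."""
    starts = []
    day = first
    while day <= last:
        starts.append(day)
        day += timedelta(days=8)
    return starts


reset = eight_day_starts(date(2002, 7, 4), date(2023, 7, 20))
naive = fixed_stride_starts(date(2002, 7, 4), date(2023, 7, 20))

# Dates the naive stride produces that are not real period starts
# under the yearly-reset assumption; URLs built from them would be malformed.
drifted = sorted(set(naive) - set(reset))
```

Under the stated assumption, the two sequences agree through the end of 2002 and diverge from January 2003 onward, which is the kind of error that only surfaces once URLs for later years are requested.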
@cisaacstern
Member Author

> The second production deployment began about ten minutes ago. I'll follow up here with updates.

This failed on another malformed URL issue (really, a missing data issue), which had slipped through the cracks due to an error in the unit tests. I fixed this in #5 and redeployed on merge of that PR. 🤞 Updates to follow.

@cisaacstern
Member Author

The third deployment failed with the error reported in #6, which hopefully also fixes the issue.
The build is now deployed for a fourth time; updates to follow.

@cisaacstern
Member Author

cisaacstern commented Sep 21, 2023

The fourth deployment proceeded well, caching 2655 of some 2900 or so files before stalling for unclear reasons.

I've just re-deployed from #7.

@cisaacstern
Member Author

🎉 Success!

import xarray as xr

p = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/aqua-modis-feedstock/aqua-modis-682286948-6702057605-1/aqua-modis.zarr"
ds = xr.open_dataset(p, engine="zarr", chunks={})
ds.nbytes/1e9  # --> 757.954773176 GB
ds
<xarray.Dataset>
Dimensions:   (time: 967, lat: 4320, lon: 8640)
Coordinates:
  * lat       (lat) float32 89.98 89.94 89.9 89.85 ... -89.9 -89.94 -89.98
  * lon       (lon) float32 -180.0 -179.9 -179.9 -179.9 ... 179.9 179.9 180.0
  * time      (time) datetime64[ns] 2002-07-04 2002-07-12 ... 2023-07-20
Data variables:
    bbp_443   (time, lat, lon) float64 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
    chlor_a   (time, lat, lon) float32 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
    qual_sst  (time, lat, lon) uint8 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
    sst       (time, lat, lon) float64 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
Attributes: (12/39)
    Conventions:                      CF-1.6 ACDD-1.3
    cdm_data_type:                    grid
    creator_email:                    data@oceancolor.gsfc.nasa.gov
    creator_name:                     NASA/GSFC/OBPG
    creator_url:                      https://oceandata.sci.gsfc.nasa.gov
    easternmost_longitude:            180.0
    ...                               ...
    standard_name_vocabulary:         CF Standard Name Table v36
    suggested_image_scaling_applied:  No
    sw_point_latitude:                -89.97916412353516
    sw_point_longitude:               -179.9791717529297
    title:                            MODISA Level-3 Standard Mapped Image
    westernmost_longitude:            -180.0
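
As a quick sanity check on the reported size, the uncompressed byte count can be reproduced from the dimensions and dtypes shown in the repr above, without touching the store. This is a minimal sketch in plain Python; the `ITEMSIZE` table and variable names are introduced here for illustration, and the byte widths are the standard ones for these NumPy dtypes.

```python
# Bytes per element for the dtypes that appear in the dataset repr.
ITEMSIZE = {"float64": 8, "float32": 4, "uint8": 1, "datetime64[ns]": 8}

dims = {"time": 967, "lat": 4320, "lon": 8640}
data_vars = {
    "bbp_443": "float64",
    "chlor_a": "float32",
    "qual_sst": "uint8",
    "sst": "float64",
}
coords = {"lat": "float32", "lon": "float32", "time": "datetime64[ns]"}

# Every data variable spans the full (time, lat, lon) grid.
n_cells = dims["time"] * dims["lat"] * dims["lon"]
var_bytes = {name: n_cells * ITEMSIZE[dt] for name, dt in data_vars.items()}

# Each coordinate is one-dimensional along its own dimension.
coord_bytes = sum(dims[name] * ITEMSIZE[dt] for name, dt in coords.items())

total_bytes = sum(var_bytes.values()) + coord_bytes
total_gb = total_bytes / 1e9  # same units as ds.nbytes/1e9 above

# Uncompressed size of a single float64 chunk of shape (1, 2160, 4320).
chunk_bytes_f64 = 1 * 2160 * 4320 * ITEMSIZE["float64"]
```

The total matches the `ds.nbytes/1e9` figure above. Note that this is the in-memory (uncompressed) size; the stored Zarr chunks are typically smaller after compression, and the thread does not report the on-disk size.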
