parsing method in xarray-schema #90

Open
giovp opened this issue Oct 10, 2022 · 4 comments

giovp commented Oct 10, 2022

Hi,

Thanks for the really great package. I was wondering whether there are plans to also provide methods for parsing array-like data according to predefined schemas, similar to what xarray-dataclasses provides.

Possibly related to #12.

Thanks!

Contributor

jhamman commented Nov 28, 2022

Hi @giovp! Can you explain a bit more about your use case? I'm not sure exactly what you mean by "parsing array-like data".

Contributor

jhamman commented Dec 30, 2022

Closing this issue since we haven't managed to keep the conversation going. Feel free to reopen if there is more to discuss here.

jhamman closed this as completed Dec 30, 2022

Author

giovp commented Feb 6, 2023

@jhamman apologies for the late reply. What I had in mind is basically an additional parse method that tries to cast the input data to the schema. For example, given a 2D array of an image and some DataArraySchema, the parse method would return a schema-compliant xarray DataArray (or raise an error if the data cannot be cast correctly).

This is an example implementation that inherits from DataArraySchema:

import numpy as np
from dask.array.core import Array as DaskArray, from_array
from xarray import DataArray
from xarray_schema.components import ArrayTypeSchema, DimsSchema
from xarray_schema.dataarray import DataArraySchema

class RasterSchema(DataArraySchema):
    
    @classmethod
    def parse(
        cls,
        data,
    ) -> DataArray:
        # Use cls rather than a hard-coded subclass so any RasterSchema subclass works.
        if cls.array_type.array_type == DaskArray:
            return DataArray(from_array(data), dims=cls.dims.dims)
        return DataArray(data, dims=cls.dims.dims)


class ImageModel(RasterSchema):
    dims = DimsSchema(("y", "x"))
    array_type = ArrayTypeSchema(DaskArray)

    def __init__(self) -> None:
        super().__init__(
            dims=self.dims,
            array_type=self.array_type,
        )

arr = np.random.normal(size=(100, 100))
img = ImageModel.parse(arr)
ImageModel().validate(img)  # returns None: img matches the schema
img = img.rename({"x": "a"})
ImageModel().validate(img)  # raises an error: dims are now ("y", "a")

I believe some of this tooling is implemented in https://github.com/astropenguin/xarray-dataclasses/, but I was wondering whether there are ways for the two projects to converge.

jhamman reopened this Feb 6, 2023

Contributor

jhamman commented Feb 6, 2023

Thanks @giovp! This sounds similar in scope to what I wrote here: #45
