Add array class example #190

Draft
wants to merge 11 commits into
base: main

@rly rly commented Mar 10, 2024

This PR adds an alternate approach to specifying the TemperatureDataset defined in tests/input/examples/schema_definition-native-array-1.yaml. This approach uses classes that implement linkml:NDArray and have an attribute that implements linkml:elements, as defined before the 1.7.0 release. This representation is necessary for adding additional attributes (e.g., user-specified units of measurement, conversion factor, precision/error, reference/zero point, or source) to the various arrays that make up a TemperatureDataset. Seeking feedback on this approach.

It also changes the unquoted key y to "y", because an unquoted y is read as the boolean true in YAML 1.1.
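To illustrate the quoting issue, here is a minimal sketch that checks whether an unquoted key matches one of the boolean forms listed in the YAML 1.1 `bool` type definition. The helper name `needs_quoting` and the sample key names are hypothetical and not part of this PR; individual parsers may honor only a subset of these forms.

```python
import re

# Boolean scalar forms listed in the YAML 1.1 "bool" type definition.
# Assumption: a strictly spec-following YAML 1.1 parser resolves all of these.
YAML11_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)


def needs_quoting(key: str) -> bool:
    """Return True if an unquoted `key` could be read as a boolean under YAML 1.1."""
    return YAML11_BOOL.match(key) is not None


# Hypothetical slot names; "y" is the one this PR quotes.
for name in ["x", "y", "t", "latitude_in_deg"]:
    print(f"{name}: quote it" if needs_quoting(name) else f"{name}: safe unquoted")
```

A check like this could be run over schema keys to flag names such as `y`, `n`, `on`, or `no` before they are silently turned into booleans.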

@rly rly marked this pull request as draft March 10, 2024 01:22
  range: string
latitude_in_deg:
  implements:
    - linkml:axis

Is this necessary anymore given the above array_data_mapping?

range: LatitudeSeries
required: true
annotations:
  axis_index: 0

Is this necessary anymore given the above array_data_mapping?

    axis_index: 2
temperatures_in_K:
  implements:
    - linkml:array

Should we rename this in the metamodel? Is this scoped to linkml:DataArray or unique across LinkML?


rly commented Mar 10, 2024

cc @sneakers-the-rat @cmungall

I'm looking at how to 1) add attributes to arrays and 2) support labeling arrays with other arrays.

For 1), I think we need to keep supporting classes that implement linkml:NDArray and allow those to be an axis in a linkml:DataArray. Alternatively, we could add attributes to TemperatureDataset and have them share a common prefix, e.g., latitude_in_deg__precision, but this is kinda ugly and relies on a naming convention. Note that numpy arrays and other simple array formats do not allow attributes, but HDF5 (and netCDF4) datasets and Zarr arrays do. Attributes are also allowed in xarray.DataArray.
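A rough sketch of the two options in 1), using plain Python rather than LinkML itself. The class `ArrayWithAttributes`, the helper `attributes_by_prefix`, and the sample values are all hypothetical and only illustrate the trade-off; they are not generated by or part of LinkML.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ArrayWithAttributes:
    """Hypothetical stand-in for a class that implements linkml:NDArray."""

    values: List[float]
    attributes: Dict[str, Any] = field(default_factory=dict)


# Option A: the array is its own class, so metadata lives next to the values.
latitude = ArrayWithAttributes(
    values=[1.0, 2.0, 3.0],
    attributes={"unit": "deg", "precision": 0.01},
)

# Option B: a flat container where metadata is tied to an array only by a
# shared-prefix naming convention (e.g. latitude_in_deg__precision).
flat_dataset = {
    "latitude_in_deg": [1.0, 2.0, 3.0],
    "latitude_in_deg__precision": 0.01,
}


def attributes_by_prefix(container: Dict[str, Any], array_name: str, sep: str = "__") -> Dict[str, Any]:
    """Recover an array's attributes from a flat container by name prefix."""
    prefix = array_name + sep
    return {k[len(prefix):]: v for k, v in container.items() if k.startswith(prefix)}


print(latitude.attributes["precision"])
print(attributes_by_prefix(flat_dataset, "latitude_in_deg"))
```

In option B the link between an array and its metadata exists only in the slot names, so every consumer has to know the separator convention; in option A the structure itself carries that link.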

For 2) if we allow a linkml:NDArray to be a labeled dimension of a linkml:DataArray, then because the NDArray class could contain multiple arrays, we need a way to identify the intended array within the class. We can keep doing that with linkml:elements. Alternatively, we could make values a special slot name for any class that implements linkml:NDArray. Or change the slot name in the example to elements. Or change linkml:elements to linkml:values. (In NWB, the Data class for arrays defines a required data field.)
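The lookup problem in 2) can be sketched as follows, again in plain Python with dataclass field metadata standing in for `implements`. The function `find_elements_slot`, the `precision` field, and `UnmarkedSeries` are hypothetical; this assumes one slot per class is either marked with linkml:elements or, as the alternative, carries a reserved name such as `values`.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional

ELEMENTS = "linkml:elements"


@dataclass
class LatitudeSeries:
    # The slot holding the intended array is marked explicitly.
    values: List[float] = field(metadata={"implements": [ELEMENTS]})
    # A second array in the same class, which is why a marker is needed.
    precision: List[float] = field(default_factory=list)


@dataclass
class UnmarkedSeries:
    # No marker: only a reserved slot name can identify the array.
    values: List[float]


def find_elements_slot(cls, fallback_name: Optional[str] = None) -> str:
    """Return the name of the slot that holds the array's elements."""
    marked = [f.name for f in fields(cls) if ELEMENTS in f.metadata.get("implements", [])]
    if len(marked) == 1:
        return marked[0]
    if not marked and fallback_name and any(f.name == fallback_name for f in fields(cls)):
        return fallback_name
    raise ValueError(f"cannot identify a unique elements slot in {cls.__name__}")


print(find_elements_slot(LatitudeSeries))
print(find_elements_slot(UnmarkedSeries, fallback_name="values"))
```

The explicit marker keeps slot names free, while the reserved-name fallback is simpler but constrains how every array class names its data slot.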


sneakers-the-rat commented Mar 10, 2024

ah yes, is it time for the second leg, the indexed array spec?

Can we get a few examples of the desired datasets we want to support with this? I think it might be helpful to have a few concrete test cases here so we can get a handle on the constraints we'll need to handle. are the ones in linkml-arrays still current? From that we can generate a set of requirements and constraints that help inform these decisions ^ :)
