ids:Artifact and resourceEndpoints #481

Open
tomkxy opened this issue Jul 21, 2021 · 6 comments

@tomkxy

tomkxy commented Jul 21, 2021

While looking into mapping from DCAT to IDS, I stumbled across an aspect of IDS which feels cumbersome and which, to be honest, I do not understand:
In DCAT we have Datasets and Distributions, which relate to IDS DataResources and DataRepresentations / Artifacts. So far so good.
In DCAT, the Distribution contains the information on where to get the distribution from (downloadURL etc.).

Now I am trying to figure out from an IDS DataResource where I can retrieve the resource from:
Each data resource may have many data representations and many artifacts. Thus, I would have expected a property on the artifact pointing to the endpoint where I can get the artifact from.

Instead, there is another structure on data resource level, the ids:resourceEndpoint, which holds the artifact again (same as above) and an Endpoint (e.g. ConnectorEndpoint). I don't understand why this is modeled like that, since the endpoint seems to be clearly related to the artifact. By the way, this makes life cycle management of a resource in a GUI rather complicated.

@clange
Member

clange commented Sep 15, 2021

My initial understanding is the following. @HaydarAk could you please add your (probably deeper) understanding to this discussion?
Generally, one resource may be served through one or more endpoints, i.e., ids:Resource – ids:resourceEndpoint → ids:ConnectorEndpoint.
In certain special cases, an endpoint may be dedicated to serving one artifact (i.e., one instance) of one representation of the resource, but this is optional to model explicitly: ids:ConnectorEndpoint – ids:endpointArtifact → ids:Artifact.

Now @tomkxy, for your use case: Are you assuming that what you found, e.g., in a Connector's Catalog, is the resource, that the resource has representations (ids:DigitalContent – ids:representation → ids:Representation), which in turn have artifacts (ids:Representation – ids:instance → ids:RepresentationInstance), and that for one specific artifact A you would like to know where you can download it? And you wouldn't want to query the whole metadata graph for the ids:ConnectorEndpoint that serves A, but would prefer following a direct link from A to that endpoint? This should be feasible to implement. @HaydarAk what do you think?

@HaydarAk

Everything you wrote is correct, @clange. One addition: As far as I know, the ids:resourceEndpoint property is especially relevant for, e.g., the broker, because it is the one and only property which allows linking a resource to a corresponding connector (endpoint) when queried at a broker. It should definitely be feasible to implement that. But we have to consider "what" to describe "where" at "which" level of detail to satisfy all requirements.

> In DCAT, the distribution contains the information where to get the data from downloadURL etc.

> [...] and then for one specific artifact A you would like to know where you can download that? And you wouldn't want to query the whole metadata graph for the ids:ConnectorEndpoint that serves A, but would prefer following a direct link from A to that endpoint?

At a high level, this sounds good. But we might have to look into it in detail. We could add "access-related" information to Representations and thereby point to ConnectorEndpoints, similar to the ids:resourceEndpoint property of ids:Resource.

This is to some extent similar to the DCAT approach. DCAT distributions refer to:

  • downloadURL --> some URI:   For assets which can be directly downloaded.
  • accessURL --> dcat:DataService:   dcat:DataService is what the name suggests: a service to retrieve data. One could argue that a Connector (endpoint) is a service as well.

Looking into the dcat:DataService class, I noticed that it does not refer to a distribution but to a dataset (see here). In IDS terms this would translate into:

ids:Representation or ids:Artifact --> served by --> ids:Connector, ideally ids:ConnectorEndpoint
ids:ConnectorEndpoint --> serves resource --> ids:Resource
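
To make the "served by" direction concrete, here is a minimal Java sketch of what such a model could look like from the point of view of client code. The records below are simplified, hypothetical stand-ins and not the generated Information Model classes; the `servedBy` field is an assumed name taken from the arrow above and does not exist in the ontology.

```java
import java.net.URI;
import java.util.List;

public class ServedBySketch {
    // Hypothetical, simplified stand-ins for the generated classes.
    record ConnectorEndpoint(URI accessURL) {}
    // servedBy is an assumed property: a direct link from artifact to endpoint.
    record Artifact(String id, List<ConnectorEndpoint> servedBy) {}
    record Representation(String id, List<Artifact> instance) {}

    // With a forward link, the access URLs of a representation can be
    // collected by plain traversal: representation -> artifact -> endpoint.
    static List<URI> accessUrls(Representation rep) {
        return rep.instance().stream()
                .flatMap(artifact -> artifact.servedBy().stream())
                .map(ConnectorEndpoint::accessURL)
                .toList();
    }

    public static void main(String[] args) {
        ConnectorEndpoint ep = new ConnectorEndpoint(URI.create("https://connector.example/data.csv"));
        Artifact artifact = new Artifact("artifact-csv", List.of(ep));
        Representation rep = new Representation("rep-csv", List.of(artifact));
        System.out.println(accessUrls(rep));
    }
}
```

The trade-off is that the endpoint information would then be reachable both from the resource and from the artifact, so the two views would have to be kept consistent.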

I am not 100% sure, if a full adoption of the DCAT approach is something we want because it is a very data-centric model.

Feel free to share your thoughts :)

@HaydarAk HaydarAk added this to the 5.0.0 milestone Oct 7, 2021
@tomkxy
Author

tomkxy commented Nov 26, 2021

Thanks for the explanation, and sorry for my late reply. Looking at the topic again while our devs are trying to implement the GUI, we realized the following. Let's assume this use case: You have a data resource with a couple of representations, each representation having one artifact. Now you want to display this in a GUI utilizing the generated Java classes.

You display the attributes of representation and artifact by following from the resource-> representation (via ids:representation) -> artifact (via ids:instance).
So far so good. But what do you do now to display, for each representation, for instance the accessURLs which are stored in the connectorEndpoint? How do you get that info for a specific representation?

If I am not completely mistaken you need to find the connectorEndpoint instance which is pointing through the artifact to the representation given.
With a SPARQL query that is no big deal, because you can navigate the graph, but please try that with the Java classes which are generated from the Information Model. There the relations are directed, and there is no easy way to figure out all properties related to a specific representation / artifact, including the properties of the resourceEndpoint which is serving this artifact.
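
To illustrate the lookup described above, here is a minimal Java sketch. The records are simplified, hypothetical stand-ins for the generated classes (the field names follow the property names used in this thread, and `accessUrlFor` is a made-up helper); the real API will differ.

```java
import java.net.URI;
import java.util.List;
import java.util.Optional;

public class EndpointLookup {
    // Hypothetical, simplified stand-ins for the generated classes.
    record Artifact(String id) {}
    record Representation(String id, List<Artifact> instance) {}
    record ConnectorEndpoint(URI accessURL, Artifact endpointArtifact) {}
    record Resource(List<Representation> representation,
                    List<ConnectorEndpoint> resourceEndpoint) {}

    // The references only point "forward", so the endpoint for a representation
    // has to be found by scanning all endpoints of the owning resource and
    // comparing their endpointArtifact with the representation's artifacts.
    static Optional<URI> accessUrlFor(Resource resource, Representation rep) {
        return resource.resourceEndpoint().stream()
                .filter(ep -> ep.endpointArtifact() != null
                        && rep.instance().contains(ep.endpointArtifact()))
                .map(ConnectorEndpoint::accessURL)
                .findFirst();
    }

    public static void main(String[] args) {
        Artifact csv = new Artifact("artifact-csv");
        Artifact json = new Artifact("artifact-json");
        Representation csvRep = new Representation("rep-csv", List.of(csv));
        Representation jsonRep = new Representation("rep-json", List.of(json));
        Resource resource = new Resource(
                List.of(csvRep, jsonRep),
                List.of(new ConnectorEndpoint(URI.create("https://connector.example/csv"), csv),
                        new ConnectorEndpoint(URI.create("https://connector.example/json"), json)));
        // prints Optional[https://connector.example/json]
        System.out.println(accessUrlFor(resource, jsonRep));
    }
}
```

Note that the caller needs both the resource and the representation in hand; starting from the representation alone, the endpoint is not reachable.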

I hope I could explain the issue clearly.

@sebbader
Contributor

sebbader commented Dec 2, 2021

Hello everyone, after a short discussion with @tomkxy I finally feel qualified to add my two cents. I think the problem is that as soon as one has traversed to a Representation or Artifact, the only way to get the accessUrl is by following an edge backwards. This is possible via SPARQL but not in, for instance, the Java implementation, where only one-way lookups are possible.
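
One client-side workaround, without changing the model, would be to build the backward edge once as an index. A minimal Java sketch, assuming simplified, hypothetical record types in place of the generated classes:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReverseEndpointIndex {
    // Hypothetical, simplified stand-ins for the generated classes.
    record Artifact(String id) {}
    record ConnectorEndpoint(URI accessURL, Artifact endpointArtifact) {}

    // Walk the forward edge (endpoint -> artifact) once and store the
    // inverse (artifact -> access URLs) so later lookups are direct.
    static Map<Artifact, List<URI>> index(List<ConnectorEndpoint> endpoints) {
        Map<Artifact, List<URI>> byArtifact = new HashMap<>();
        for (ConnectorEndpoint ep : endpoints) {
            if (ep.endpointArtifact() == null) {
                continue; // endpoint is not dedicated to a single artifact
            }
            byArtifact.computeIfAbsent(ep.endpointArtifact(), k -> new ArrayList<>())
                      .add(ep.accessURL());
        }
        return byArtifact;
    }

    public static void main(String[] args) {
        Artifact csv = new Artifact("artifact-csv");
        List<ConnectorEndpoint> endpoints = List.of(
                new ConnectorEndpoint(URI.create("https://connector.example/csv"), csv),
                new ConnectorEndpoint(URI.create("https://connector.example/self-description"), null));
        Map<Artifact, List<URI>> idx = index(endpoints);
        System.out.println(idx.getOrDefault(csv, List.of()));
    }
}
```

This keeps the metadata free of redundant attributes, at the cost of every consumer having to rebuild the index whenever the resource description changes.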
We had the property at the Artifact already:

#ids:accessUrl # since domain of dcat:accessURL is rdfs:Resource not applicable as super-property, URL is not a resource

I am not saying to reintroduce it here, I just want to point out that we saw the necessity already earlier :-).

@sebbader
Contributor

sebbader commented Dec 2, 2021

I don't have a good idea yet. Adding redundant (potentially conflicting) attributes doesn't sound good to me.

@JohannesLipp
Member

@tomkxy sorry for the late reply. Could you please comment on whether the issue still exists and still needs to be tackled? Thank you in advance!
