Feat/dot extra info #1415

Open: wants to merge 4 commits into base: dev/php8

peterjanssens
Contributor

hi,
I would like to propose a change to the handling of InformationService-values that would allow retrieval of deeper nested extra-info.

We have some use cases for this with AAT and with some custom InformationServices, where we would like to consume more information from the InformationService API(s). With the current InformationService value handling we would have to flatten that info in more_info, because retrieval only goes one level deep.

For this demo I added more_info to the AAT service as an example. More localised info is gathered, for instance for https://vocab.getty.edu/aat/300212906, through an extra request to https://vocab.getty.edu/aat/300212906.json

The AAT service's original request gets the preferred label from Getty, which is generally in plural form, and some of our partners would rather see singular nouns. We could do this through customised SPARQL queries in custom IS plugins or through the SPARQL field UI, but those specific SPARQL queries tend to get quite laborious: as you will see in the demo, an AAT term has many alternative labels that could be singular, with SingularNoun and AlternateDescriptor the most likely suspects, but it is unpredictable which are available, if any. ('Bug' remark here: the SPARQL form is blanked/unavailable after saving when the SPARQL service field is added to a container, so you can still use the field but you cannot edit its configuration afterwards without first moving the field out of the container :/)
Sometimes a singular noun isn't available yet, and in those cases the author would now fall back to their own lists or fields, while the term itself could already have been documented with the existing authority term (and localised terms that become available later could be synced periodically).

So in this extended AAT the author can select the appropriate (available) term and get all (current) localised terms with it, for instance through ^ca_objects.aat.identified_by.nl.AlternateDescriptor or ^ca_objects.aat.identified_by.nl.SingularNoun. It also works when the field is in a container: ^ca_objects.aat_contained.aat.identified_by.nl.Descriptor

(The current alternative would need something like ^ca_objects.aat.identified_by_nl_AlternateDescriptor, or else you would get the lookup (preferred) value itself)
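
To make the lookup concrete, here is a minimal sketch of dot-notation access into a nested more_info structure. It is written in Python purely for illustration and is not the code in this PR; the function name `resolve_path`, the payload shape and the label strings are all assumptions (placeholders, not real Getty data).

```python
from typing import Any


def resolve_path(more_info: dict, path: str, default: Any = None) -> Any:
    """Walk a nested dict along a dot-separated path.

    Returns `default` as soon as a segment is missing, so a template
    asking for an unavailable label gets a predictable fallback.
    """
    node: Any = more_info
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Hypothetical more_info payload (assumed shape, placeholder labels).
more_info = {
    "identified_by": {
        "nl": {
            "Descriptor": "label-nl-plural",
            "SingularNoun": "label-nl-singular",
        },
        # Language tags may contain hyphens; only "." separates segments.
        "zh-latn-pinyin-x-notone": {"SingularNoun": "label-zh-pinyin"},
    }
}

singular = resolve_path(more_info, "identified_by.nl.SingularNoun")
missing = resolve_path(more_info, "identified_by.nl.AlternateDescriptor", default="")
```

Because only the dot is treated as a separator, keys that contain underscores or hyphens stay intact, which is the main difference from a flattened key such as identified_by_nl_AlternateDescriptor.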

image

Besides the localized labels, there is access to the available languages and alternative label descriptors too (unserialized value_blob), for instance ^ca_objects.aat.languages
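
As an illustration of how such a structure could be assembled, the sketch below groups a flat list of label records into language -> label type -> text and derives the list of languages from it. The record shape (`lang`, `type`, `text`) and the function name `group_labels` are assumptions made up for this example; the actual Getty response and the code in this PR may look quite different.

```python
from collections import defaultdict


def group_labels(records):
    """Group flat label records into language -> label type -> text."""
    grouped = defaultdict(dict)
    for rec in records:
        # Keep the first label seen for a given language/type pair.
        grouped[rec["lang"]].setdefault(rec["type"], rec["text"])
    return {
        "identified_by": dict(grouped),
        "languages": sorted(grouped),
    }


# Hypothetical records with placeholder text, not real AAT labels.
records = [
    {"lang": "nl", "type": "Descriptor", "text": "label-nl-plural"},
    {"lang": "nl", "type": "SingularNoun", "text": "label-nl-singular"},
    {"lang": "en", "type": "Descriptor", "text": "label-en-plural"},
]

more_info = group_labels(records)
```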

Another use case is nature classification (like https://www.catalogueoflife.org/data/taxon/QLXL), where there is deeply nested family information: kingdom: [Animalia] > phylum: [Chordata] > subphylum: [Vertebrata] > infraphylum: [Gnathostomata] > parvphylum: [Osteichthyes] > gigaclass: [Sarcopterygii] > megaclass: [Tetrapoda] > superclass: [Amniota] > class: [Mammalia Linnaeus, 1758] > subclass: [Theria Parker & Haswell, 1897] > infraclass: [Eutheria Gill, 1872] > order: [Carnivora Bowdich, 1821] > suborder: [Caniformia Kretzoi, 1938] > family: [Canidae Fischer, 1817] > genus: [Canis Linnaeus, 1758] > species: dog :)

Having all that information readily available from the InformationService in more_info, without flattening that whole tree, is nice
(in our custom plugin for COL we collect even more data through multiple APIs from other related sources). In that case you could do something like ^ca_objects.col.hierarchy.megaclass or ^ca_objects.col.urls.wikipedia_url
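
For comparison, this is roughly what the flattening alternative looks like. The sketch is in Python for illustration only; `flatten`, the `col_info` shape and the URL are hypothetical, while the two hierarchy values are taken from the example above.

```python
def flatten(tree, prefix="", sep="_"):
    """Collapse a nested dict into one level by joining keys with `sep`."""
    flat = {}
    for key, value in tree.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Assumed more_info shape for a COL-like service; the URL is a placeholder.
col_info = {
    "hierarchy": {
        "family": "Canidae Fischer, 1817",
        "genus": "Canis Linnaeus, 1758",
    },
    "urls": {"wikipedia_url": "https://example.org/placeholder"},
}

flat = flatten(col_info)
flat_keys = sorted(flat)
```

A flattened key such as urls_wikipedia_url can no longer be split back into its segments unambiguously, since the separator also occurs inside the original key. Keeping the tree nested and addressing it with dots avoids that.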

We could retrieve the alternate labels in other ways when we need them (for instance with JS in the frontend views), but that makes things harder for some applications (like PDF prints/reports, Elastic/Solr faceting, ...)

I chose AAT as the source for the demo, but it doesn't have to end up in the original service; we could have it in a parallel/custom one, shared or not (I just noticed that I forgot to add the link to Getty in the extended view). But having a very minimalistic implementation of dot-notation-like access to more_info would be very nice to have (so the SearchResult bit, which should stay compatible even when you don't use nested fields, like the example wikipedia.abstract).

In our own POC I've added some minimal extra styling, but I have left that out of this PR for simplicity. It can be styled to taste.

image

There may still be some unexplored caveats (on syncing maybe, or when more_info contains keyless array structures like ^aat.some.array[3].oeps), but maybe that's for the adventurous custom developer to tackle :p
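
One possible way to handle keyless arrays, sketched in Python under the assumption that a bracketed number addresses a list position. The names `parse_path` and `resolve` and the sample data are made up for this illustration; nothing like this is implemented in the PR.

```python
import re

# A segment is either a key (no dots or brackets) or a bracketed list index.
_SEGMENT = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def parse_path(path):
    """Split 'some.array[3].oeps' into ['some', 'array', 3, 'oeps']."""
    tokens = []
    for name, index in _SEGMENT.findall(path):
        tokens.append(int(index) if index else name)
    return tokens


def resolve(data, path, default=None):
    """Follow the parsed tokens through nested dicts and lists."""
    node = data
    for token in parse_path(path):
        if isinstance(token, int):
            # Integer tokens only make sense on lists, within bounds.
            if not isinstance(node, list) or token >= len(node):
                return default
        elif not isinstance(node, dict) or token not in node:
            return default
        node = node[token]
    return node


# Hypothetical nested data with a keyless list.
data = {"some": {"array": [{"oeps": "a"}, {"oeps": "b"}]}}
second = resolve(data, "some.array[1].oeps")
```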

I didn't use any specific dot-notation packages for this POC; maybe later, if further implementation (other fields?) could benefit from that.

@peterjanssens
Contributor Author

peterjanssens commented May 5, 2023

One other remark to add: as you can see, there can be multiple Chinese references (zh). For the presentation I have abbreviated them because they messed up the concise language column. If you come across some of those, you can hover over the language labels to see their full literals (like zh-latn-pinyin-x-notone, so that one would be ^ca_objects.aat.identified_by.zh-latn-pinyin-x-notone.SingularNoun)

And here too there is a caveat: for AAT there are many (sub)languages and alternative labels, so there could be edge cases (but we'll handle those when we find them)

@peterjanssens peterjanssens changed the base branch from develop to dev/php8 July 31, 2023 11:56
@peterjanssens
Contributor Author

changes to app/lib/Attributes/Values/InformationServiceAttributeValue.php were lost in the original PR (sorry for that)
