Add Json-LD (de)serializer #487

amedranogil · 2018-03-13T13:32:39Z

Add a JSON-LD (https://www.w3.org/TR/json-ld/) serializer.

Prerequisite: resolve the multi-serializer service, and service selection for serialization.

amedranogil · 2018-06-18T08:25:08Z

How are we going to implement multiple instances of MessageContentSerializer? I'm guessing Filters will allow us to select one or the other. But what do we put in the Filter?

amedranogil · 2018-06-18T08:27:43Z

MessageContentSerializer.deserialize(String) should be declared as throwing a ParseException. This will be useful when there are several serializers and we need to determine which language the string is in.
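A minimal sketch of how a checked ParseException could be used to find out which format a string is in. Everything here is an assumption made for illustration: `ContentDeserializer` is a hypothetical stand-in for MessageContentSerializer, `FormatProbe` and the format names are made up, and only `java.text.ParseException` is a real class.

```java
import java.text.ParseException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for MessageContentSerializer with the proposed checked exception. */
interface ContentDeserializer {
    Object deserialize(String serialized) throws ParseException;
}

/** Tries each registered deserializer in registration order. */
final class FormatProbe {
    private final Map<String, ContentDeserializer> byFormat = new LinkedHashMap<>();

    void register(String format, ContentDeserializer deserializer) {
        byFormat.put(format, deserializer);
    }

    /** Returns the first format whose deserializer accepts the input, or null if none does. */
    String detect(String serialized) {
        for (Map.Entry<String, ContentDeserializer> entry : byFormat.entrySet()) {
            try {
                entry.getValue().deserialize(serialized);
                return entry.getKey();
            } catch (ParseException e) {
                // Not this format; try the next candidate.
            }
        }
        return null;
    }
}
```

Probing like this parses the input once per failed candidate, so it is only reasonable when the format cannot be learned in a cheaper way.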

cstockloew · 2018-06-18T13:05:35Z

Sorry, there has been no progress on the serializer itself. But there is some progress in the preparation, e.g. the classes GraphIterator and Specializer. In the old version, the Turtle serializer did everything by itself. The specialization part is now provided by the Specializer in data.representation and can be used by different serializers (this was a big part). With Gson being a separate lib that is already included, adding JSON-LD should be a lot simpler now.

A few things to consider (as far as I remember):

Format and Params:
There should be a new class SerializerParams in
data.representation/org.universAAL.middleware.serialization
with
- Definition of different formats: URIs as given in https://www.w3.org/ns/formats/ as public static final String FORMAT_XXX
- Method SharedObjectParams getParams(String format) to get the params for a call to fetchSharedObject (hence, my work on the container modules)
  The different serializers are indeed distinguished by OSGi as part of the 'properties' parameter
Serializer options:
Different options to the serializer should be given as a separate 'options' parameter. See uaal_issues M7.
Serializer vs. SerializerEx:
The 'Ex' subinterface should be removed. It was only a workaround. This would be a possibility for a serializer option. See uaal_issues M7.
getPropSerializationType:
Needs to be clarified. I have been waiting for an answer from Saied for a very long time now. Without further info on what this actually means, it is difficult to adapt this concept to different serializers. See uaal_issues M8.
getPropSerializationFormat:
There could be another method in Resource to allow for nested formats. It would rarely be needed and should mostly just return null, but it would allow, e.g., adding SPARQL as a property of a Resource. One serializer should then call another serializer. Maybe this solves the issue with ParseException, if you know what format the "outer" serialization is (which is always RDF for the buses and in most other cases can be queried in another way).
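A rough sketch of what the SerializerParams idea might look like. The two format URIs are taken from https://www.w3.org/ns/formats/; everything else is an assumption: `LookupParamsSketch` is a hypothetical stand-in for SharedObjectParams (whose real shape is not shown here), and the property key `serialization.format` and the interface name string are invented for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for SharedObjectParams; the real container class differs. */
final class LookupParamsSketch {
    final String serviceInterface;
    final Map<String, String> properties;

    LookupParamsSketch(String serviceInterface, Map<String, String> properties) {
        this.serviceInterface = serviceInterface;
        this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
    }
}

final class SerializerParamsSketch {
    // Format URIs as listed at https://www.w3.org/ns/formats/
    static final String FORMAT_TURTLE = "http://www.w3.org/ns/formats/Turtle";
    static final String FORMAT_JSON_LD = "http://www.w3.org/ns/formats/JSON-LD";

    // Assumed property key used to tell serializer services apart.
    static final String PROP_FORMAT = "serialization.format";

    /** Builds the lookup parameters that would be handed to fetchSharedObject. */
    static LookupParamsSketch getParams(String format) {
        Map<String, String> props = new HashMap<>();
        props.put(PROP_FORMAT, format);
        // Assumed interface name, for illustration only.
        return new LookupParamsSketch("MessageContentSerializer", props);
    }
}
```

A caller would then ask the container for the serializer matching `SerializerParamsSketch.getParams(SerializerParamsSketch.FORMAT_JSON_LD)`, leaving the selection to the 'properties' matching of the container.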

I know you are asking for this feature for a very long time now. Sorry for not having this finished yet.

amedranogil · 2018-06-18T13:23:33Z

@cstockloew Thank you for your pointers, I am now starting the development of this Serializer; as you said, it seems to be easy.
Yet, I will need to understand some details that are a bit fuzzy for me right now (e.g. specialization, which as far as I understand it at this point is the last step of deserialization), as well as how to treat special types, such as type restrictions.
I will be using the current Serializer as a template, so it would be nice to have at least those ideas for all serializers implemented there (i.e. SerializerParams, the fetchSharedObject mechanism, and the Serializer Options [this is also something I noticed, but I left it for another iteration]).

amedranogil · 2018-06-18T19:51:06Z

I just pushed a new branch at https://github.com/universAAL/middleware/tree/issue/487
I have only started with URICompactor. This functionality could be cross-serializer, but I found it difficult to extract it from the Turtle serializer (plus it guesses human-readable prefix names).

cstockloew · 2018-06-19T14:28:42Z

The turtle serializer is stand-alone and thus does everything by itself. Maybe an external lib, like Gson, provides this already natively? If not, it would also be possible to "extract" some util-methods, either to data.representation or to another bundle, e.g. data.serialization.common.{core/osgi}?

The serializer works like this:
It first analyses all Resources (method analyzeResource) and counts all namespaces (among other things) with

			if (StringUtils.isQualifiedName(uri))
				countNs(uri.substring(0, uri.lastIndexOf('#') + 1), nsTable);

The Hashtable nsTable maps each namespace (String) to the number of times it is used (Integer).

Later, when it comes to actually writing the output, it calls writeNamespaces(nsTable), which writes the namespaces to the output String and builds the overall mapping in the Hashtable namespaceTable. This namespaceTable is then used for every Resource that is written (method writeURI).
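The two passes described above can be sketched roughly as follows. This is not the Turtle serializer's code: the class name, the `ns0`, `ns1` prefix naming and the use of LinkedHashMap instead of Hashtable are assumptions, and the real StringUtils.isQualifiedName check is replaced by a plain test for '#'.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NamespaceTableSketch {
    // namespace -> number of times it is used (first pass)
    private final Map<String, Integer> nsTable = new LinkedHashMap<>();
    // namespace -> prefix (filled when the namespaces are written)
    private final Map<String, String> namespaceTable = new LinkedHashMap<>();

    /** First pass: count the namespace of a URI, if it has one. */
    void analyze(String uri) {
        int hash = uri.lastIndexOf('#');
        if (hash < 0) return; // simplified stand-in for isQualifiedName
        nsTable.merge(uri.substring(0, hash + 1), 1, Integer::sum);
    }

    int count(String namespace) {
        return nsTable.getOrDefault(namespace, 0);
    }

    /** Second pass, step 1: emit prefix declarations and remember the mapping. */
    String writeNamespaces() {
        StringBuilder out = new StringBuilder();
        int i = 0;
        for (String ns : nsTable.keySet()) {
            String prefix = "ns" + i++; // assumed naming scheme
            namespaceTable.put(ns, prefix);
            out.append("@prefix ").append(prefix).append(": <").append(ns).append("> .\n");
        }
        return out.toString();
    }

    /** Second pass, step 2: write a URI, compacted if its namespace is known. */
    String writeURI(String uri) {
        int hash = uri.lastIndexOf('#');
        String prefix = hash < 0 ? null : namespaceTable.get(uri.substring(0, hash + 1));
        return prefix == null ? "<" + uri + ">" : prefix + ":" + uri.substring(hash + 1);
    }
}
```

The counts are not used for anything in this sketch beyond being collected; a serializer could use them, for example, to skip declaring a prefix for a namespace that occurs only once.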

amedranogil · 2018-06-19T14:38:59Z

This function is exactly what URICompactor is doing, but with 2 added features:

- It does not depend on the character '#'; reading the Turtle spec, it seems prefixes can also end in '/' or another non-alphanumeric character.
- The compacted prefix is guessed from the URI, not assigned by arbitrary order of processing.
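To illustrate the two features, here is a small sketch. It is not the URICompactor from the branch: the class name, the splitting rule (last character that is neither a letter, a digit nor '_') and the prefix heuristic (last path segment of the namespace, extension dropped, lower-cased) are all assumptions chosen for the example.

```java
import java.util.Locale;

final class UriCompactorSketch {
    /** Index where the local name starts: just after the last non-alphanumeric character. */
    static int splitIndex(String uri) {
        for (int i = uri.length() - 1; i >= 0; i--) {
            char c = uri.charAt(i);
            if (!Character.isLetterOrDigit(c) && c != '_') return i + 1;
        }
        return 0;
    }

    static String namespaceOf(String uri) {
        return uri.substring(0, splitIndex(uri));
    }

    static String localNameOf(String uri) {
        return uri.substring(splitIndex(uri));
    }

    /** Guesses a readable prefix from the namespace itself (assumed heuristic). */
    static String guessPrefix(String namespace) {
        // Drop trailing separators such as '#' or '/'.
        String trimmed = namespace.replaceAll("[^A-Za-z0-9]+$", "");
        // Keep the last path segment and strip a file extension, if any.
        String segment = trimmed.substring(trimmed.lastIndexOf('/') + 1);
        int dot = segment.indexOf('.');
        if (dot > 0) segment = segment.substring(0, dot);
        return segment.toLowerCase(Locale.ROOT);
    }
}
```

With this heuristic a namespace ending in a version segment (for example `.../foaf/0.1/`) would get a poor prefix, so a real implementation would need a fallback and a way to resolve two namespaces that guess the same prefix.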

amedranogil · 2018-06-30T17:31:09Z

I have developed other analyzers,
for example one that counts the blank nodes so as to pad them according to the total number of BNs. There is also a serializationType analyzer, which has the double function of counting the references (especially handy to determine whether a resource should be embedded or not) and determining which type of serialization each resource has (essentially condensing the whole serialization policy, see #496). There is a Resource analysis framework, which should be reusable; it is based on the GraphIterator class (which presents some issues with literals, as they are included in the analysis; maybe there should be a subclass to avoid iterating through literal Resources).
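As an illustration of the kind of analysis meant here, the sketch below counts references and pads blank node labels. It does not use GraphIterator: the graph is a plain map from a resource URI to the URIs it points at, and the class name, the "at most one reference means embeddable" rule and the `_:BN` label format with zero padding are assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class ReferenceAnalyzerSketch {
    /** Counts how many times each resource is referenced by others. */
    static Map<String, Integer> countReferences(Map<String, List<String>> graph) {
        Map<String, Integer> refs = new HashMap<>();
        for (List<String> targets : graph.values()) {
            for (String target : targets) {
                refs.merge(target, 1, Integer::sum);
            }
        }
        return refs;
    }

    /** Assumed rule: a resource referenced at most once can be embedded in place. */
    static boolean canEmbed(String node, Map<String, Integer> refs) {
        return refs.getOrDefault(node, 0) <= 1;
    }

    /** Zero-pads the index so that all labels have the same width for the given total. */
    static String blankNodeLabel(int index, int total) {
        int width = String.valueOf(Math.max(total - 1, 0)).length();
        return "_:BN" + String.format(Locale.ROOT, "%0" + width + "d", index);
    }
}
```

Both results have to be known before writing starts, which is why they are computed in an analysis pass over the whole graph.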

As a sanity check, allow me to enumerate the common things and/or "uAAL quirks" a serializer has to account for:

- Serialization type (see "Serialization types", #496).
- Resources being marked as literals (not yet accounted for in JSON-LD; I am not sure how to handle them at the moment, partly because of the GraphIterator and partly because the specs do not mention them).
- Compacting URIs (probably more complex due to the two points above).
- Anonymous resources (switching from the internal URI to _:BN).
- Lists, concretely closedCollections.
- Class types: currently the JSON serializer adds everything in r.getTypes(); it should probably filter out abstract classes?

All the other stuff is just listing properties and serializing them recursively. Of course there is the help of TypeMapper.getXMLInstance(o), which helps serialize primitives in XML (most other serializations follow).

amedranogil added the feature request label Mar 13, 2018

amedranogil mentioned this issue Mar 13, 2018

Unreliable documentation universAAL/platform#6

Closed

amedranogil self-assigned this Jun 18, 2018

amedranogil mentioned this issue Jun 29, 2018

REST Service Exporter / Importer universAAL/remote#495

Open

amedranogil closed this as completed Feb 26, 2020
