Adding Helper/Utility Functionality (Merging, Clearing Empty, Value Type Promotion) #164

Open
Turnerj opened this issue May 3, 2020 · 2 comments
Labels
enhancement Issues describing an enhancement or pull requests adding an enhancement. help wanted Help wanted from the community.

Comments

@Turnerj
Collaborator

Turnerj commented May 3, 2020

Describe the feature

There is a set of functionality that I'm working on that I think would be a good fit for the library as utilities for working with Schema objects in certain ways.

Merging
It would be useful to be able to merge appropriate objects and their properties together (where applicable), using known "ID" properties (e.g. the URL or Identifier properties, both part of IThing) to decide whether two objects should merge.

Because, at its core, every property can have one or many values, merging them isn't too difficult. We would probably want a way to provide custom type comparers, though, since people do extend schema types and there are other cases to cover too (e.g. maybe URLs should be considered equal even if their query strings differ).
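
As a rough, language-neutral sketch of the merge idea (plain Python dicts standing in for Schema objects, not Schema.NET's API), something like the following could work. The names `url_key`, `as_list` and `merge_things` are made up for illustration, and it assumes every item carries a `url` property to act as its identity:

```python
from urllib.parse import urlsplit, urlunsplit


def url_key(url):
    """Hypothetical comparer: URLs are equal if only the query/fragment differs."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def as_list(value):
    """Every property holds one or many values, so normalise to a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, list) else [value]


def merge_things(things, key=url_key):
    """Merge dicts that share an identity (assumed here to come from "url")."""
    merged = {}
    for thing in things:
        target = merged.setdefault(key(thing["url"]), {})
        for name, value in thing.items():
            existing = target.setdefault(name, [])
            for item in as_list(value):
                # Keep each distinct value once; duplicates are dropped.
                if item not in existing:
                    existing.append(item)
    return list(merged.values())


a = {"url": "http://example.org/image.png?size=large", "caption": "Logo"}
b = {"url": "http://example.org/image.png", "caption": "Logo", "width": 200}
result = merge_things([a, b])
```

Passing a different `key` function is where a custom comparer would plug in, for example one that compares an identifier property instead of the URL.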

Clearing Empty
This fits somewhat closely with the Merging one - being able to remove Schema objects with no values specifically defined would be a useful feature. This can be as simple as checking the count on each property.
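
A minimal sketch of that check, again over plain dicts rather than the real Schema.NET types. The `is_empty` and `clear_empty` names are hypothetical, and treating `@type` as "not a value" is an assumption of this sketch:

```python
def is_empty(value):
    """A value counts as empty when it is None, "", or an empty list/dict."""
    return value is None or value == "" or value == [] or value == {}


def clear_empty(node):
    """Prune children first, then drop whatever ended up empty."""
    if isinstance(node, list):
        cleaned = [clear_empty(item) for item in node]
        return [item for item in cleaned if not is_empty(item)]
    if isinstance(node, dict):
        cleaned = {name: clear_empty(value) for name, value in node.items()}
        # Assumption: "@type" alone is not a value, so such an object is empty.
        if all(name == "@type" or is_empty(value) for name, value in cleaned.items()):
            return {}
        return {name: value for name, value in cleaned.items() if not is_empty(value)}
    return node


page = {
    "@type": "WebPage",
    "name": "Home",
    "image": [
        {"@type": "ImageObject", "url": None},
        {"@type": "ImageObject", "url": "http://example.org/image.png"},
    ],
    "author": {"@type": "Person", "name": ""},
}
cleaned_page = clear_empty(page)
```

Here the first image and the author have no values set, so both are removed, while the second image is kept.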

Value Type Promotion
This one is useful, especially prior to running any Merge logic - in many cases, a value can be represented by multiple types for a single property. For example, an image might be a Uri or it might be an ImageObject.

Value promotion would be a way to convert a new Uri() to a new ImageObject { Uri = theUri } as these two are functionally equivalent. In a form like this, it can make dealing with external data (where you don't control the consistency of it) a lot easier.

There may even be use for going the other way too (converting an ImageObject with only the Uri property set back to just a Uri type) - maybe for making the most size-efficient JSON - though personally I haven't needed to.
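
To make both directions concrete, here is a small sketch using strings for URLs and dicts for ImageObject. It is illustrative only: `promote_image` and `demote_image` are made-up names, and the `url` key is an assumption about how the object would be shaped:

```python
def promote_image(value):
    """Promote a bare URL string to an ImageObject-shaped dict; leave dicts alone."""
    if isinstance(value, str):
        return {"@type": "ImageObject", "url": value}
    return value


def demote_image(value):
    """Demote an ImageObject that only carries a URL back to the bare URL string."""
    if (
        isinstance(value, dict)
        and value.get("@type") == "ImageObject"
        and set(value) == {"@type", "url"}
    ):
        return value["url"]
    return value


images = [
    "http://example.org/image.png",
    {"@type": "ImageObject", "url": "http://example.org/image.png"},
    {"@type": "ImageObject", "url": "http://example.org/logo.png", "caption": "Logo"},
]
promoted = [promote_image(value) for value in images]
demoted = [demote_image(value) for value in images]
```

After promotion the first two entries compare equal, which is what makes a later merge step straightforward. Demotion leaves the third entry alone because it carries a caption that would be lost.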


These features are useful for working with the Schema objects in different ways though at the same time, might not actually need to belong to the main library. This might be a good utility library like Schema.NET.Utilities or something - I'll leave that up to you.

Right now, I've got pretty "crappy" implementations of these in my current project (something that heavily relies on Schema.NET) and I don't know when I will be able to get better versions as a PR. This issue is to initially raise this as a point of discussion and get your thoughts on it and whether it would be worth having in the main library or in a utility library.

@Turnerj Turnerj added the enhancement label May 3, 2020
@RehanSaeed
Owner

Do you have some example applications for some of these features?

  • Merging - Sounds like a cool idea but not sure of the use case.
  • Clearing - Does this mean removing an item from a collection if all its members are defaults? This sounds like a good idea. It's essentially minifying the final JSON-LD that you might output. Seems related to Value Type Demotion.
  • Value Type Promotion - I'm not sure how this would work. There could be any number of types that you might want to promote to.
  • Value Type Demotion - As you say, this sounds like it would be useful as a form of minification.

I guess you'd probably also want to think about cloning too, although again I'm not sure about the use case.

@Turnerj
Collaborator Author

Turnerj commented May 4, 2020

So I'm working on a service I've called BrandVantage, one of whose core features is to convert any web page to a Schema.org WebPage, as it is already an interoperable standard. I generate the objects by parsing details out of the page (e.g. JSON-LD, Microdata) combined with other heuristics to get essentially a computer-readable page object.

Merging - Sounds like a cool idea but not sure of the use case.

As I convert data from a variety of sources into Schema.org objects, merging them together is imperative to avoid duplicate data - having two ImageObject items with the same Uri is redundant. An additional benefit of merging is that it also helps minify the data.

Clearing - Does this mean removing an item from a collection if all its members are defaults?

Yep - basically perform a depth-first search for objects whose properties are all defaults. Another useful thing for both minifying and merging.

Value Type Promotion - I'm not sure how this would work. There could be any number of types that you might want to promote to.

That is very true, though it's really a case-by-case basis. If I have a Values2<string, IThing>, it's unlikely to be possible to promote the value, as there is little context to go on. If I had a Values2<string, Uri>, it might be natural to promote (up-cast?) from string to Uri if compatible.

Going one step further, if we had a Values2<Uri, ImageObject>, this property is likely meant to represent an image anyway so promoting/up-casting to ImageObject with the Uri property set wouldn't be a big leap.

This process (or the reverse, Value Type Demotion) is helpful for merging - having a Values2<Uri, ImageObject> with the values of new Uri("http://example.org/image.png") and new ImageObject { Uri = new Uri("http://example.org/image.png") } would be a pain to merge.

My reason for promotion/up-casting is to allow more consistency in the data. Having some images be just a Uri and some be an ImageObject in the same collection can be a little harder to work with even though it is functionally allowed.

Value Type Demotion - As you say, this sounds like it would be useful as a form of minification.

Yeah definitely. Expressing an ImageObject in JSON is something like { type: "ImageObject", uri: "http://example.org/image.png" } but could equally be "http://example.org/image.png" as no other properties are set.


Really, each of these is just a different way to manipulate the objects. At the core, we'd need a good way to traverse the objects depth-first and a way to update a Values or OneOrMany property with as few allocations as possible.
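
A hedged sketch of what that shared core might look like, using plain dicts and lists instead of the real Values/OneOrMany types. The `transform` and `demote` names are hypothetical. It still builds a temporary container per level, which a real implementation would want to avoid, but unchanged subtrees are handed back as the same objects:

```python
def transform(node, visit):
    """Post-order walk: transform children first, then hand the node to `visit`.

    Containers are only replaced when a child actually changed, so untouched
    subtrees are returned as the same objects rather than copies.
    """
    if isinstance(node, list):
        items = [transform(item, visit) for item in node]
        if any(new is not old for new, old in zip(items, node)):
            node = items
    elif isinstance(node, dict):
        values = {name: transform(value, visit) for name, value in node.items()}
        if any(values[name] is not node[name] for name in node):
            node = values
    return visit(node)


def demote(node):
    """Example visitor: collapse a URL-only ImageObject to its URL string."""
    if (
        isinstance(node, dict)
        and node.get("@type") == "ImageObject"
        and set(node) == {"@type", "url"}
    ):
        return node["url"]
    return node


page = {
    "@type": "WebPage",
    "image": [{"@type": "ImageObject", "url": "http://example.org/image.png"}],
    "author": {"@type": "Person", "name": "Ann"},
}
result = transform(page, demote)
```

Merging, clearing and promotion could each be written as a different `visit` function over the same walk.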

Regarding cloning, while it would likely be useful for the same types of reasons, I personally haven't had the need to clone objects yet.

@RehanSaeed RehanSaeed added the help wanted label Jun 1, 2020