-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make UUri specification more formal #121
Make UUri specification more formal #121
Conversation
Here's my first shot at a UUri spec formalization. One thing that still bothers me: This shouldn't necessarily be an issue because most characters used for the logical name will consist of ASCII characters only and those are encoded into a single byte (using UTF8). However, IMHO we should properly handle cases in which the authority would be encoded into more than 255 bytes. Given that the Long and Short Format do not put any constraints on the authority length, I wonder if we really need to constrain it in the Micro Format. If we need/want to set a limit, then IMHO we should define this constraint on the Object Model itself. We should then probably only allow ASCII characters in the authority name so that we know upfront, how many bytes it would be encoded into ... |
@stevenhartley @PLeVasseur @tamarafischer @AnotherDaniel @evshary any feedback would be welcome |
@sophokles73 we need to follow RFC3968 which means it cannot be any form of UTF-8 character only those characters listed in Appendix A, a collection of ABNF for URI. |
Well, for the long and short form URIs that is done by means of percent encoding any illegal characters. However, that also increases the number of ASCII characters in the URI. For the micro form serialization I do not see why we would be bound by RFC3986. We are not producing a (textual) URI but a custom binary representation and as long as we can recreate the original UUri from it, we should be safe, shouldn't we? Another approach, of course, would be to restrict UAuthortity.name to only contain ASCII characters in the first place, if that is what you mean. |
Yes, lets restrict to ASCII characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see comments. Much better than before for sure!
basics/uri.adoc
Outdated
|
||
[source] | ||
A UUri can be represented as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A UUri can be represented as | |
A UUri can be represented as: |
basics/uri.adoc
Outdated
[source] | ||
A UUri can be represented as | ||
|
||
* an instantiation of the <<_data_model,UUri Data Model>> in a supported programming language, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* an instantiation of the <<_data_model,UUri Data Model>> in a supported programming language, | |
* An instantiation of the <<UUri Data Model>> in a supported programming language |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to refer to the section, not the UML class diagram here ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should refer to the section with that title unless you named the title and the diagram the same. The reason why I proposed the change is that the link did not work for me when I rendered it in the PR.
|`[1, 1, -128, 0, 0, 0, 3, 0, -64, -88, 1, 100]` | ||
[source,java] | ||
---- | ||
public class UUri { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UUri is defined in proto and the classes are generated to which we cannot extend. I recommend we do not show coding examples in this spec but state that serializers and deserializers and validators are required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, that might be true for the tooling you use for generating Java code. For Rust, it is very much possible to add such code to the generated types. In general, I wonder if this specification should be based on the features/shortcomings of proto3 and the language specific tooling, or if we want/need to define a model that serves the purpose of uProtocol best. In that context, I have to admit that I am very unhappy with the fact that the Object Model uses UInt32 types when the domain actually requires UInt16 for identifiers and UInt8 for version_{major,minor}.
FMPOV we should define in the spec what we need and consider right and then point out, that the proto3 definition's sole purpose is to supports serialization and deserialization of a (valid) instance of the object model, but nothing more.
@stevenhartley I have updated according to your feedback but also have added some questions to the comments ... |
I answered your questions this morning. |
@stevenhartley It looks like some of them slipped through ... maybe you haven't seen them because they have been collapsed by GitHub? |
@stevenhartley I just had a little chat with @AnotherDaniel about his problems using UUris from within his uSubscription implementation which are mainly related to the duality of identifiers (name vs. id) :-) Would it make sense to think a little more into that direction or is it totally out of the question from your point of view? |
No we cannot do this (leave it up to the transport layer). The source of truth for service ids is the proto and is known to the application layer code or uDiscovery (not the transport layer). UAuthority to ip address can be done using linux resolver or static IP address table. |
Hm, I hear what you're saying. So pushing ahead with uDiscovery, and making that the source of truth for this kind of lookup - based in the proto definitions - would that be an option? I really don't like this information-duality between names and ids in the UURi, also the fact that we are redefining/extending what an uri is... feels off. Would much rather have uris just be uris, and have a well defined, separate way to translate to id representation and back (like url - DNS - ip) |
I'm open to other suggestions and agree when we only had long form URI it was much simpler. I haven't been able to think of a different solution. |
This might become too big a discussion for this PR - take this out into a dedicated issue maybe? I'm asking this from my usual clueless point of view - but my thinking would be: shouldn't the up-client/boundary-ustreamers adapters be able to perform name-to-id-to-name translation for the messages that flow by, with the relevant information retrieved from uDiscovery (DNS-style) services? These uDiscovery entities, within a certain scope (i.e. vehicle), might then address the question of how to efficiently manage their translation tables as some form of distributed configuration, that can be managed as a separate problem to solve (so these tables might be just built from proto definitions, as default/semi-static config, but obviously some form of updatability might be desirable in the future)... |
The following works to address #121 by making the contract explicit and to differentiate the old LongForm URI and short form URI.
Sorry for disrupting the discussion. I also like the restriction of UAuthority.name, but I'm not sure what the range of ASCII characters here is. Does it only limit to use alphabet and number, or it can also use something like |
In the PR I am currently restricting the authority to be a host following the corresponding production in https://datatracker.ietf.org/doc/html/rfc3986#appendix-A In particular, this also includes characters from the sub-delim production which also includes
Yes, FMPOV that is the plan. |
OK, then I still need to do some transformation while generating Zenoh key. At least,
Then, no matter MQTT topic or Zenoh key should transform the authority to either lowercase or uppercase, since they are case-sensitive. Is there any reason why we stick to RFC3986 standard here? If not, how about defining our own UAuthority for our own purpose? |
IMHO the biggest advantage is to be able to use tooling that is available for RFC3986 URIs, e.g. for parsing, comparing etc. Most programming languages have support for URIs in some way. That said, as long as we make sure that a UUri is-a RFC3986 URI, we should be on the safe side. This should indeed allow us to e.g. restrict the supported characters in the host production if we feel that this would be beneficial. Do you have a concrete proposal for that? |
Totally agree with you.
No, but I'm thinking perhaps using unserved here in UAuthority is enough.
Also, limiting the using lowercase (or uppercase) alphabets here might be simpler. |
The following adds to #121 to simplify the datamodel for UUri object.
@sophokles73 (et. al.), I want to summarize conversations over email so we can conclude on this topic. I really don't think we will get a single UUri definition that will map perfectly to all transports so need to to compromise, taking into consideration the ease of use, $ for transmission, and other factors.
If we settle on above, there is no more need for isResolved(), isLong(0, isShort(), isMicro() and all the other validator logic that is overly complicated. Serializers are simplified back to a single toString() and fromString() and the string format is now:
I've captured these changes in #126 . The reason for keeping UAuthority, UEntity, and UResource is to minimize the amount of changes in all the existing code and not to re-invent UUri completely. |
So far, the minor version had not been part of any of UUri's serializations. Just want to make sure that we really need it in there. And if we do, I guess the inclusion of minor version should be dependent on the presence of major version
or be put into a separate path segment
However, I'd prefer the former, if we really need minor version in it ... |
ed93082
to
a482912
Compare
@stevenhartley I have adapted the UUri spec according to the discussed simplifications of the UUri data model. Some things I want to bring to your attention:
|
I'll take a look. I started to code these things on the 1.5.8 branch of up-java and the serializers and deserializers were drastically simplified as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sophokles73 only minor comments left but looking much better than before!
basics/uri.drawio.svg
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You removed the white background so it is difficult to see the image, please add it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this makes it harder to see in vscode (with a dark theme/background). However, when viewing on GitHub this is not an issue, right? The rfc3986.png also has a transparent background so I thought we might want to do the same for the URI diagram ...
| Returns true if the Uri part contains the required ids to serialized to Micro format and the fields of the Uri can fit within the specified number of bits and bytes. Both sets of details can be obtained under <<Micro URIs>> | ||
[source,ocl] | ||
---- | ||
inv: authority_name.length() <= 128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where did this requirement come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had been discussing that we wanted to limit the length of the authority name to some degree. I just picked 128 as a first shot. Do you still think that we should limit the length?
basics/uri.adoc
Outdated
-- | ||
A UUri's `ue_id` property value determines the type and instance of service being referred to. + | ||
The value's _most_ significant 16 bits represent the _service instance_. + | ||
The value's _least_ significant 16 bits represent the _service type_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
identifier and not type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually talk about instances of a certain type, don't we? A service identifier to me suggests that we are talking about a particular instance already, instead of its type.
basics/uri.adoc
Outdated
[.specitem,oft-sid="dsn~entity-id~1"] | ||
-- | ||
A UUri's `ue_id` property value determines the type and instance of service being referred to. + | ||
The value's _most_ significant 16 bits represent the _service instance_. + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should put a note that a service instance of 0 means "don't care".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added that to the section on pattern matching ...
basics/uri.adoc
Outdated
|Format | Value | ||
[.specitem,oft-sid="dsn~pattern-matching~1",oft-needs="impl,utest"] | ||
-- | ||
A UUri that is used as a pattern to match other UUris against |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is duplicated from the sentence above
basics/uri.adoc
Outdated
|Long | ||
|`//vcu.veh.gm.com/core.usubscription/3/SubscriptionChange#Update` | ||
* *MAY* have its `authority_name` set to the `*` (`U+002A`, Asterisk) character in order to match _any_ (including _no_) authority, | ||
* *MAY* have the _service type_ part of its `ue_id` set to `0xFFFF` in order to match _any_ type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we need to call it service number to differentiate the instance id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what you mean, replace service type with service number or replace service instance with service number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-reading some/ip spec for a little inspiration, they have service_id and service_instance_id, no where do we refer to service_id as type hence why I wanted to keep the id or number but not type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then let's stick to service id and service instance id. I'll update accordingly ...
basics/uri.adoc
Outdated
|`//vcu.veh.gm.com/core.usubscription/3/SubscriptionChange#Update` | ||
* *MAY* have its `authority_name` set to the `*` (`U+002A`, Asterisk) character in order to match _any_ (including _no_) authority, | ||
* *MAY* have the _service type_ part of its `ue_id` set to `0xFFFF` in order to match _any_ type, | ||
* *MAY* have the _service instance_ part of its `ue_id` set to `0x0000` in order to match _any_ instance, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this!! better than what I was thinking and coding in Java...
@stevenhartley I have pushed changes and amended your comments. Would you mind taking another look? |
@sophokles73 not sure if my comment got lost but the diagram needs to have a white background (it was removed) otherwise I cannot view it when using dark mode in github. |
@stevenhartley I have added the white background and updated the terminology used for ue_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @sophokles73 !
a4d2d55
to
cf383b7
Compare
@stevenhartley any idea why the |
It builds the proto to make sure there are no issues with syntax. |
cf383b7
to
bddec77
Compare
@stevenhartley this PR seems to be stuck, I have tried to push some changes but that doesn't seem to resolve the situation. I will now close this PR and create a new one instead ... |
Reopening to test a theory ... |
The UUri specification has been lacking many details that are relevant for correctly implementing the data model in client libraries. A lot of the specification's textual content has been transformed into formal requirements based on invariants and predicates expressed using the Object Constraint Language. Last but not least, the content model of UUri has been drastically simplified, thus removing all ambiguity around serialization and validation. This should make implementation of the object model much easier and improve general interoperability.
bddec77
to
13989bd
Compare
This addresses #115 and other issues.
The UUri specification has been lacking many details that are relevant
for correctly implementing the data model in client libraries.
A lot of the specification's textual content has been transformed into
formal requirements based on invariants and predicates expressed using
the Object Constraint Language.
Last but not least, the 'id' and 'ip' properties have been removed from
UAuthority's data model because they did not really serve a specific
pupose beyond the semantics of the 'name' property.