Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revision Added as a Top-Level Member of Resource #824

Open
wants to merge 2 commits into
base: gh-pages
Choose a base branch
from
Open

Revision Added as a Top-Level Member of Resource #824

wants to merge 2 commits into from

Conversation

scott-mcdonald
Copy link
Contributor

…0 for versioning of resources.

Signed-off-by: Scott McDonald smcdonald@movietickets.com

@cziegenberg
Copy link
Contributor

I'd say that the client MUST send the revision for updating a resource if sent with the GET.

The server uses the revision to ensure that updates are send for the right resource revision, so IMHO it's required in all following updates - otherwise it doesn't make sense at all to use revisions.

Anither point is what's the content of the 409 response - the current or a merged version of the resource?

@scott-mcdonald
Copy link
Contributor Author

@ziege I agree with you in that originally I wanted to say the something like: If a revision is supplied in a GET request, then revision MUST be passed in a PATCH or PUT request. However @tkellen pointed out the following:

we cannot say that PATCH or PUT MUST include revision--it would be backwards incompatible with 1.0 of the spec. This change must be wholly additive.

I agree if the next version is 1.1 it should be backwards compatible with 1.0 so I changed MUST to either MAY or SHOULD. If and when this standard goes to 2.0 then we should change the above to MUST.

To what the content of the response should be for a 409 Conflict I will leave open to discussion. Currently the specification states the following for the 409 Conflict:

A server SHOULD include error details and provide enough information to recognize the source of the conflict.

I read that as the specification is agnostic on what to return as a response.

Sidebar: My particular implementation detail for this scenario is if the server returns a 409 Conflict, the client must then do a new GET request, reapply modifications, and try the PATCH or PUT request again up to a maximum number of attempts.

Another sidebar: Can someone explain the following line in the specification in the Responses section for updating resources:

Quote: A server MUST return 409 Conflict when processing a PATCH request in which the resource object's type and id do not match the server's endpoint.

I don't understand that statement. The type and id are just strings and if they do not exist in the system shouldn't the response be a 404 Not Found. Or did I misinterpret?

@ethanresnick
Copy link
Member

I agree if the next version is 1.1 it should be backwards compatible with 1.0 so I changed MUST to either MAY or SHOULD. If and when this standard goes to 2.0 then we should change the above to MUST.

By my reading, a MUST here actually would be backwards compatible, because it's a MUST that's conditional on getting a revision key in the GET, which no 1.0 servers will return. But I may be over looking something…

I have a different concern with the current wording of the PR, though, which is that I don't think it's accurate to say that revision is part of the identifying characteristics of a resource in the way (type, id) are. (Therefore, among other things, I wouldn't put the description of revision in the identity section of the spec.)

Consider a request to a relationship endpoint. If revision were part of resource identity, I'd expect the client return something like the below:

GET /people/1/relationships/photos

{
  "data": [{
    "type": "photos", "id": "4", "revision": "abcd"
  }]
}

Likewise, I'd imagine that a DELETE request would have the same body as the above, with revision included. But I don't think that's what we want, even if the targeted resources are revisioned. Because the resources are in or should be removed from the relationship regardless of their present revision, right? Likewise, it doesn't seem like we want to expose a mechanism to GET the various revisions of a resource, or DELETE a particular revision.

So, really, the identifier is still (type, id) and revision is just an attribute summarizing the state at the time its retrieved. To make the HTTP analogy: Etag isn't part of resource identity; identity is only based on the URI, and Etag only matter for updates/caching.

I also still wonder if there's a way to make Etag and revision match up. For example, we could say that every revisioned resource's self uri has to include no other resources by default (or at least support a ?include= option to remove any default includes) and has to give empty top-level meta. That might be overkill...but it at least seems possible.

@dgeb
Copy link
Member

dgeb commented Jul 30, 2015

I don't think it's accurate to say that revision is part of the identifying characteristics of a resource in the way (type, id) are. (Therefore, among other things, I wouldn't put the description of revision in the identity section of the spec.)

I agree with @ethanresnick here and share the concerns he's raised.

If we allow revision in resource identifier objects that represent relationships, then it MUST represent a revision of the relationship itself as opposed to a revision of the related resource. I believe the exception is when a related resource represents a relationship (say, a "membership" resource).

There is a lot of subtlety and potential for confusion here.

@scott-mcdonald
Copy link
Contributor Author

So rethinking having revision defined as part of the resource identifier object:

Are they the same resource? Yes and No. They are the same resource but at different versions. For identity purposes though it is just type and id isn't it.

The final affirmation for me is for resource linkage purposes it doesn't make sense to have revision which is why I did not add revision in various resource examples in the specification.

Therefore I have come to the same conclusion as you guys and to quote @dgeb:

There is a lot of subtlety and potential for confusion here.

Do not want that now do we. Therefore revision should be primarily documented in the resource object section not the resource identifier object section.

@dgeb What was your take on the verbiage of if the optional revision is supplied in a GET request, it MUST be passed in a PATCH or PUT request that was talked about in this thread. I prefer MUST, I like @ethanresnick thought process that this does not break 1.0 backwards compatible because it is optional in the GET request.

@tkellen
Copy link
Member

tkellen commented Jul 30, 2015

After further reflection, I'm cautiously supportive of the MUST requirement for including revision if the server is providing revisions. I agree with @dgeb that there is a lot of opportunity for confusion here. I believe we should move the non-normative explanations to the FAQ. Alternatively, I think we should have more normative statements around how revision should be used in conjunction with etags.

@dgeb
Copy link
Member

dgeb commented Jul 30, 2015

@dgeb What was your take on the verbiage of if the optional revision is supplied in a GET request, it MUST be passed in a PATCH or PUT request that was talked about in this thread. I prefer MUST, I like @ethanresnick thought process that this does not break 1.0 backwards compatible because it is optional in the GET request.

I actually favor a MAY for including revision with PATCH. First of all, I don't think a GET should be a pre-requisite for PATCH. Also, some clients will want to force a change (say an update to an attribute) regardless of the resource's revision. By forcing the inclusion of revision, we'd also be forcing servers to reject PATCHes if that field doesn't match.

On that same note, the one MUST I would say should be in place is that the server MUST reject a PATCH request that includes revision if it doesn't match the current revision of the resource.

@cziegenberg
Copy link
Contributor

First of all, I don't think a GET should be a pre-requisite for PATCH

Isn't there always a GET before you can PATCH anything?

Also, some clients will want to force a change (say an update to an attribute) regardless of the resource's revision

I'd never allow a client to bypass the revision check. Do you have an example when this could be a requirement?

On that same note, the one MUST I would say should be in place is that the server MUST reject a PATCH request that includes revision if it doesn't match the current revision of the resource.

Instead of a MUST I'd allow the server to decide if the request is acceptable or not although the revision doesn't match - this could be useful if you send changes that do not conflict with changes made in the meantime. A conflict doens't necessarily mean an update of the same property, the server can also define other conditions (e.g. properties that depend on each other and therefore cannot be merged).

@dgeb
Copy link
Member

dgeb commented Jul 30, 2015

I think our differences come down to the question of what is expected of a server when passed a revision that isn't up to date. My assumption has been that the server should always outright reject such a request because the very point of passing in the revision is to ensure that the client and server are working with the same version of the resource. If that's not true, then it implies a sophisticated server that maintains multiple versions of each resource and is able to discern whether there's a conflict based upon a review of the history of changes between versions.

@scott-mcdonald
Copy link
Contributor Author

I actually favor a MAY for including revision with PATCH. First of all, I don't think a GET should be a pre-requisite for PATCH. Also, some clients will want to force a change (say an update to an attribute) regardless of the resource's revision. By forcing the inclusion of revision, we'd also be forcing servers to reject PATCHes if that field doesn't match.

I want to cautiously disagree with this statement. If a server is returning revision values with resources on GET requests it is for a reason and purpose by the server meaning the client should not be ignoring revision on PATCH requests and hence why I prefer MUST be passed on PATCH if supplied on GET. This is the behavior I believe most developers would expect. But am I being too conservative or heavy handed in this thinking. What if a web server wants to be more liberal and support a scenario where it allows an update to always succeed, aka a force update or overwrite feature like you mentioned. I have to admit this is more in line with the Robustness principle "Be conservative in what you do, be liberal in what you accept from others" so I am torn now. Would be interested in what others think.

I currently have text basically saying the following:

If a server supplies a revision member in a GET request, a client SHOULD pass the same revision member in a PATCH request when updating the resource. SHOULD is in between MAY and MUST. Maybe this is exactly where it needs to be...

On that same note, the one MUST I would say should be in place is that the server MUST reject a PATCH request that includes revision if it doesn't match the current revision of the resource.

Excellent point and should be added to the specification.

@tkellen To your point about etag and revision. I purposely left this out because they are not the same thing: etag is a version of the document as a whole, revision is a version of a specific, individual resource. To @dgeb point in a previous email thread, even if the document is just a single resource, with Sparse Fieldsets documents with the same resource and version but with different representations due to different fields being included would have different etags.

@scott-mcdonald
Copy link
Contributor Author

Added a 2nd draft, to summarize:

  • Removed revision as part of a resource identifier object altogether. Revision is now part of resource object.
  • Added a new child revision section under the resource object parent section with revision description there.
  • Changed the normative text on when to pass revision to basically the following: If a server supplies a revision value on a GET request, a client MUST pass the same revision value in a PATCH request on update.
    • I changed to MUST for now because I believe it is what most developers would expect and want. But this can be changed based on further debate if needed, or changed by the owners when adding the pull-request.
  • Added/Clarified that on updating resources on revision mismatch then the server MUST reject the update with a 409 Conflict.
  • I did not add any statements around how revision and etag can be used in conjunction for now. I suppose this can be added later but thinking the specification should remain agnostic on this topic. Not sure what to put for this.

@@ -1453,6 +1470,9 @@ constraints (such as a uniqueness constraint on a property other than `id`).
A server **MUST** return `409 Conflict` when processing a `PATCH` request in
which the resource object's `type` and `id` do not match the server's endpoint.

A server **MUST** return `409 Conflict` when processing a `PATCH` request in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too prescriptive. A server may be able to resolve a conflict and accept it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this normative text is redundant with the text on line 221.

@tkellen
Copy link
Member

tkellen commented Jul 31, 2015

I think our differences come down to the question of what is expected of a server when passed a revision that isn't up to date. My assumption has been that the server should always outright reject such a request because the very point of passing in the revision is to ensure that the client and server are working with the same version of the resource. If that's not true, then it implies a sophisticated server that maintains multiple versions of each resource and is able to discern whether there's a conflict based upon a review of the history of changes between versions.

I agree that most servers will likely outright reject mis-matching revisions, but I don't think we should normatively close the door on the possibility of a sophisticated server using JSON-API. For example, databases like datomic do have a full revision history of every change.

@scott-mcdonald
Copy link
Contributor Author

@tkellen Thanks and excellent points, will add to a 3rd draft early next week based on other comments.

@tkellen
Copy link
Member

tkellen commented Jul 31, 2015

Sure thing @scott-mcdonald! I'm super excited to see this land and to implement it for endpoints.

@tkellen
Copy link
Member

tkellen commented Jul 31, 2015

One other thing to note, @bobholt assembled a json-api representation of all the normative statements in the spec (to be used for generating test suites). I'm working on making it possible to automatically generate that file, but it is currently still a manual process. If you could include any new normative statements in this file for the next revision that would be awesome:

@scott-mcdonald
Copy link
Contributor Author

Added a 3rd draft based on feedback from @tkellen in this thread.

@@ -1287,6 +1309,8 @@ primary data, the same request URL can be used for updates.

The `PATCH` request **MUST** include a single resource object as primary data.
The resource object **MUST** contain `type` and `id` members.
The resource object **MUST** contain `revision` member if the server supplied
the `revision` member in a `GET` request.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't make a MUST level requirement of clients based on the server implementation. If a 1.1 server includes revision, then a 1.0 client should still work without including revision.

@scott-mcdonald
Copy link
Contributor Author

Added a 4th draft based on feedback from @dgeb. Basically changed MUST to SHOULD so the specification is backwards compatible when a 1.0 client calls a 1.1 or greater server. Used SHOULD instead of MAY to stress the importance of properly handling updates when revision is involved.

@tkellen
Copy link
Member

tkellen commented Aug 19, 2015

This seems great @scott-mcdonald! Unfortunately, we can't land this until we make some infrastructural changes to the site to support multiple versions. This would be a part of 1.1-alpha. Also, can you please add your new normative statements to this file?

Thanks!

@ethanresnick
Copy link
Member

How would people feel about waiting until we have extensions figured out before landing this? I'm thinking that, if we make this an extension, we might be able to turn some of those SHOULDs into MUSTs or otherwise make stronger guarantees in way that's backwards compatible than we can if we try to add this to the base spec.

@tkellen
Copy link
Member

tkellen commented Aug 20, 2015

As long as you lead the charge to getting extensions figured out I am game
to consider this. Both @dgeb and I have been too busy to really give
JSON-API the attention it deserves lately.
On Aug 19, 2015 8:14 PM, "Ethan" notifications@github.com wrote:

How would people feel about waiting until we have extensions figured out
before landing this? I'm thinking that, if we make this an extension, we
might be able to turn some of those SHOULDs into MUSTs or otherwise make
stronger guarantees in way that's backwards compatible than we can if we
try to add this to the base spec.


Reply to this email directly or view it on GitHub
#824 (comment).

@ethanresnick
Copy link
Member

@tkellen Works for me!

@scott-mcdonald
Copy link
Contributor Author

Gentlemen,

I like @ethanresnick train of thought with respect to the SHOULD really being MUST for strong guarantees in terms of optimistic concurrency control at the resource level. This is what most developers would think is the "normative" use case, IMHO.

Again IMHO, I feel it is a stretch to think it should be an extension though. To me an extension is something a client would negotiate FOR because it is looking for an additional feature but why would a client negotiate for revision at the resource level; this is something a service would want to impose without negotiation. The Bulk extension makes perfect sense as an extension but resource version does not feel right. I guess I would have to see how this works before making a final judgment.

From my perspective resource version is so fundamental it should be part of the base specification. The issue is with the backwards compatibility with 1.0 forever edict. By the way does that mean if the specification was at 4.2 it would always have to be backwards compatible to 1.0?

Again IMHO, if you increment the major version number it is no longer backwards to the previous version, this is what most developers would think is the "normative" use case I believe. Under this pretext I would do the following if I was an owner: Keep the specification as it is currently written for 1.1 so it is backwards compatible to 1.0 with the SHOULD instead of the MUST. But if the specification ever goes to 2.0 change the SHOULD to MUST, etc.

The 1.0 specification has the "jsonapi" object that has the version in question (1.0, 1.1, 2.0) so I don't see this as a big deal. But I am sure you gentlemen have already discussed this forever backwards compatibility to 1.0 "ad nauseam" so the previous statements are probably not going to fly...

@tkellen as far as if this being a part of 1.1 alpha and being delayed I am in no big hurry so take your time. I currently have "version" in the attributes of a resource where needed for optimistic concurrency control, etc. When this did "land" was going to refactor to "revision" at the top level of resources, etc.

BTW, I do see "caching" at the resource level leveraging revision being some kind of extension?

@tkellen
Copy link
Member

tkellen commented Aug 20, 2015

A quick clarifying response from a skim of your note @scott-mcdonald.

JSON-API will never bump major. If it does, it becomes a new project with a new media type. Backwards compatibility is a contract we refuse to break.

@tkellen
Copy link
Member

tkellen commented Aug 20, 2015

Upon further consideration, I do agree with @scott-mcdonald that this should be a part of the base specification.

@ethanresnick
Copy link
Member

@scott-mcdonald What you said makes a lot of sense, so I'll revise my position too: I'm not necessarily against specifying this feature (or most of it) in the base spec. What was really motivating my earlier comment was a feeling that I just want to make sure we don't rush this, which I was worried might happen if we try to work on it concurrently with our focus on extensions and #795, which are the two issues marked for 1.1. As @dgeb and @tkellen have also said: there's a fair amount of subtlety here.

For example, here are some outstanding things I'm wondering about:

  1. The exact semantics of the "revision" member. Namely, if a resource’s attribute is updated from value A to B, and then set back to A, are we saying that its revision: a) MUST be different than before the value was originally changed to B; b) MAY be different; or MUST be the same. (Note that the last option would be incompatible with an incrementing row version integer.) Which approach does Etag take? What are the relative pros/cons?
  2. Whether "revision" MUST be included in a PATCH (which would always require a GET and prevent forced updates) or merely SHOULD, or whether this is something that the server has discretion over. I don't have a clear feeling one way or the other from our discussion thus far and, again, I haven't had much time to think about it.
  3. Whether, as @ziege alluded to, we want to/can define this feature such that sophisticated servers have more options than simply rejecting requests that don't match the "revision". This seems like something we should enable if we can do so in an additive way (i.e. simple servers can still just reject the request), but I'm not sure what it would require.
  4. If we delay the resource-level caching stuff for an extension, as you suggest, what do we need to work out in advance to ensure that the extension system will actually support adding in those constraints later.

If we can work all these details out easily, then awesome; I am excited to see this land! But I do want to make sure we're able to work them out first.

@scott-mcdonald
Copy link
Contributor Author

Hi @ethanresnick,

Understood about not wanting to rush. IMHO, I feel this topic has been highly vetted to this point reflecting in high quality of what has been produced to date (4 drafts for example) which is a good thing. Before answering your specific points let me provide an Abstract from my point of view as a foundation:

Abstract

The primary use case for a revision member is for optimistic concurrency control purposes. The revision member is an opaque identifier and is a server-side implementation detail on how it is generated. If a server is providing the revision member in a read it should be passed back in any update. On updates, the server compares the client supplied revision to the actual current revision and if they match the update proceeds. If they do not match, the server can try to resolve the difference but if it can not then return a 409-Conflict. On a 409-Conflict a client should get the latest version of the resource, reapply updates, and try the update again if needed.

Your Questions

The exact semantics of the "revision" member. Namely, if a resource’s attribute is updated from value A to B, and then set back to A, are we saying that its revision: a) MUST be different than before the value was originally changed to B; b) MAY be different; or MUST be the same. (Note that the last option would be incompatible with an incrementing row version integer.) Which approach does Etag take? What are the relative pros/cons?

The revision member is an opaque identifier and is a server-side implementation detail. Therefore the base specification SHOULD NOT state any implementation details on the semantics of the value itself. Fundamentally all a client and/or service wants to know is if two resources with the same type/id are the same version or not by comparing revision for equality. Etag values are semantically exactly what I have described here except applies for the response as a whole, see https://en.wikipedia.org/wiki/HTTP_ETag as an example.

What I like about this approach is in the base specification revision is defined as an opaque identifier. This opens up further refinement in all kinds of extensions that could go further on the actual semantics of the revision value, i.e. resource level caching.

Whether "revision" MUST be included in a PATCH (which would always require a GET and prevent forced updates) or merely SHOULD, or whether this is something that the server has discretion over. I don't have a clear feeling one way or the other from our discussion thus far and, again, I haven't had much time to think about it.

This is up for debate. I just felt if a service is supplying revision in a GET it expects it in a PATCH otherwise why supply it all. But for backwards compatibility with 1.0 I set this to SHOULD.

Whether, as @ziege alluded to, we want to/can define this feature such that sophisticated servers have more options than simply rejecting requests that don't match the "revision". This seems like something we should enable if we can do so in an additive way (i.e. simple servers can still just reject the request), but I'm not sure what it would require.

Agreed and this is why I used the following verbiage in the Update section (please note the unable to resolve text):

A server SHOULD return 409 Conflict when processing a PATCH request in
which the resource object's revision does not match the server's endpoint
and the server was unable to resolve the revision mismatch for updating
resources with the same type and id but mismatched revision members.

If we delay the resource-level caching stuff for an extension, as you suggest, what do we need to work out in advance to ensure that the extension system will actually support adding in those constraints later.

I have not given this much thought and quite frankly should be it's own ISSUE and deployable item. I do know that revision is going to play a key role just like Etag does in HTTP level caching.

Summary

IMHO, I believe how revision is currently written primarily for optimistic concurrency control is at the right level for the base specification. I believe this maximizes flexibility for later extensions as in the base specification it is defined as an opaque identifier and we are not being too restrictive.

I would release it as-is with it currently being on the more minimal side; let the community digest it and see what feedback, issues come up and then refactor as needed.

@ethanresnick
Copy link
Member

Hi @scott-mcdonald,

I've been thinking more about this recently and would like to get it merged as soon as possible. I agree with your answers to my second and third questions, and I think there are only three things left to settle:

  1. I agree completely that the spec should not state any revision implementation details. But it should define the meaning of the revision member very explicitly, and this meaning is what I was trying to nail down with my first question. From your answer, it sounds like you're advocating for option b, and I agree with you that that makes the most sense. However, I think your PR as worded could be read to inadvertently endorse option a. In particular, it says that "If the resource is updated, a new and different revision is assigned." To me, that means that, when the resource's attribute starts at value A, its revision might be "1". Then, when the attribute's value is set to B, the revision could be set to "2". Finally, when the attribute is set back to A, the server must set the revision to something "new and different", so it sets it to "3"—whereas option b implies that it should be allowed to set it back to "1".

    So, if we're going with option b, we need to update the PR to reflect that. Preferably, the phrasing would give explicit, almost if–then, constraints. For example, option b could be captured precisely (but far too verbosely; this is just a rough example) by saying: "At all points in time at which a given JSON API resource has revision value X, the resource's fields must take on the same set of values, Y." (Note that this doesn't go the other way and specify that, whenever the field values are the same, the revision must match; that would be option c, which is probably too rigid.) Trying to give strict constraints, as in my provided wording, is probably impractical though, because, for example, it could inadvertently prevent implementations from using a hash function. (Hash collisions would violate my constraint above.) So we need something a bit looser; I'd suggest taking a look at how the HTTP RFCs word this.

  2. The issue of trying to nail down revision's meaning above brought up a new question for me: how does resource-level meta fit into this picture? In the constraint I gave above, a change to meta wouldn't imply a change in revision. Is that what we want? I think this depends on whether users are supposed to be setting meta at all...which the current spec is pretty vague on...or what meta might hold. I'd love for @tkellen, @dgeb, and @steveklabnik to chime in here. Once we figure out meta semantics more clearly, I'll have a stronger position on the fourth question.

  3. Lastly, there are some other things to consider from HTTP's implementation.

    • HTTP allows the client to send multiple etags in the same If-(None-)Match header. Do you think this is something we should support for revision? Personally, I'm leaning towards no, but I can imagine one place it might be useful. I don't think it'd be useful/worth it just for conditional updates because, even in HTTP, sending multiple etags should only slightly increase the odds of a conditional update succeeding; in our case, that advanatage seems even less applicable given that we've redefined revision's semantics from "If-Match" (i.e. if the revision is exactly the one provided) to "If-Compatible-With" (i.e., if the server can resolve the conflict). However, the one place where I'd imagine it might be useful is for a conditional GET over WebSockets, where the body of the message could look like this: { "data": { "revision": ["a", "b", "c"] } }. That would allow the server to skip sending back a full representation if the current state is "a", "b", or "c", which could be a savings for resources that alternate between a small number of states (many of which the client has already seen).
    • Finally, HTTP reserves the "*" value as a wildcard in If-(None-)Match headers, which can be useful when applying a method that could create or update a resource. (E.g., PUT with If-Match: * means PUT, but only if this resource has already been created.) I think we likewise reserve "*" as a revision value, even if we don't define semantics for it now, as it could come in handy re the second question in PATCH-related clarifications #419. It could also come in handy as a precondition to a DELETE in a batch operation request, to prevent the request from totally failing if the DELETE has already occurred. Plus, reserving stuff is just always good. (In that vein, I think we should also think about limiting the characters allowed in revision values to those allowed in Etags, in case we want to define a mapping between the two. However, that gets complicated with double quotes and the weak modifier...again, let's figure that out after we address meta's semantics.)

@ethanresnick
Copy link
Member

Also, @scott-mcdonald, I apologize if it feels like I'm nitpicking; I don't mean to, and I know you've been trying to get this merged for a long time. Keep in mind, though, that once we add something to the base spec, we can never take it back. That's why I'm harping so much on making sure we have the meaning specified precisely; and why I'm trying to make sure that, as much as possible, we're complementing rather than duplicating Etags; and that we're designing the feature to give us a bit of extensibility for future use cases; etc.

@hhware
Copy link
Contributor

hhware commented Sep 17, 2015

I actually favor a MAY for including revision with PATCH. First of all, I don't think a GET should be a pre-requisite for PATCH. Also, some clients will want to force a change (say an update to an attribute) regardless of the resource's revision. By forcing the inclusion of revision, we'd also be forcing servers to reject PATCHes if that field doesn't match.

Fully support this!

An example: a client is PATCHing a user record by updating its email. If the client knows for sure that the new email is correct (e.g., because it is this client who is actually changing it), then it may not matter what the current value of email is. IMHO, it should be up to the API designer whether to allow clients to update attributes this way. The spec should not mandate pointless back-and-forth (GET before PATCH, possibly multiple times) unless it makes sense for the particular API.

How would people feel about waiting until we have extensions figured out before landing this? I'm thinking that, if we make this an extension, we might be able to turn some of those SHOULDs into MUSTs or otherwise make stronger guarantees in way that's backwards compatible than we can if we try to add this to the base spec.

Accordingly, I do not support this proposal.

My assumption has been that the server should always outright reject such a request because the very point of passing in the revision is to ensure that the client and server are working with the same version of the resource. If that's not true, then it implies a sophisticated server that maintains multiple versions of each resource and is able to discern whether there's a conflict based upon a review of the history of changes between versions.

Fully support that too!

IMHO, If we are talking about an abstract sophisticated server, simple string revision may not be enough, the server may need to define its own attribute(s) to enable its sophisticated protocol. Why hijack revision for this purpose, if it does not guarantee to cover all possible cases? IMHO, revision should be kept straightforward. I would remove the statement about "unable to resolve the revision mismatch" altogether.

The exact semantics of the "revision" member. Namely, if a resource’s attribute is updated from value A to B, and then set back to A, are we saying that its revision: a) MUST be different than before the value was originally changed to B; b) MAY be different; or MUST be the same. (Note that the last option would be incompatible with an incrementing row version integer.) Which approach does Etag take? What are the relative pros/cons?

I think MUSTs are out of question here because they assume that the server keeps track of past revisions and is able to identify the past revision with the same value of the attributes, so the only option is MAY. Hence the question: why is it even necessary for the spec to be concerned with it? Perhaps I am missing something, but what would be broken if it is left undefined and even implemented inconsistently from one endpoint to another endpoint of a given API? The only consequence I see is possible cache miss in the situation where it might have been hit.

From your answer, it sounds like you're advocating for option b, and I agree with you that that makes the most sense. However, I think your PR as worded could be read to inadvertently endorse option a. In particular, it says that "If the resource is updated, a new and different revision is assigned."

Cannot it just say "If the resource is updated, value of its revision MUST change"? This also seems to take care of the item 2 in #824 (comment)

HTTP allows the client to send multiple etags in the same If-(None-)Match header. Do you think this is something we should support for revision?

By defining a dedicated header or by allowing to supply multiple revisions in PATCH?

As for your question on meta, IMHO meta SHOULD NOT play any role in revision.

Why? IMHO, one use case for meta is to be a container for things not allowed by the spec, and it is hard to make a general statement whether such things play any role in revision. An example: #431 (comment) and #559. Again, I believe it should be up to the API designer to decide.

@@ -1453,6 +1481,14 @@ constraints (such as a uniqueness constraint on a property other than `id`).
A server **MUST** return `409 Conflict` when processing a `PATCH` request in
which the resource object's `type` and `id` do not match the server's endpoint.

A server **SHOULD** return `409 Conflict` when processing a `PATCH` request in
which the resource object's `revision` does not match the server's endpoint
and the server was *unable to resolve* the `revision` mismatch for updating
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this and the following lines, see #824 (comment)

…for versioning of resources.

Signed-off-by: Scott McDonald <scott@quantumsoft.com>
…ndex.md file.

Signed-off-by: Scott McDonald <scott@quantumsoft.com>
@scott-mcdonald
Copy link
Contributor Author

@ethanresnick Hi Ethan, I have done another revision for this story based on our exchange; when you get a chance take a look and see what you think. I noticed the introduction of "1.1" so I moved my revision member changes to the 1.1 _format/index.md file from the 1.0 version which I think is correct.

@ethanresnick
Copy link
Member

@scott-mcdonald Thanks for reworking this, Scott! On a quick glance, the current version looks much better! I'll do a careful review this weekend.

@tkellen
Copy link
Member

tkellen commented Dec 14, 2015

@ethanresnick Did you ever have a chance to review this?

@ethanresnick
Copy link
Member

@tkellen No, I didn't, and I want to apologize to @scott-mcdonald for that, since this is a very valuable contribution that's been sitting for far too long. I got about 80% through a review at one point, and found a couple problems, but then got focused on extensions. I'll try to go through this thread later this week and finish a review, so that we can get at least a beta version of this out ASAP.

@scott-mcdonald scott-mcdonald changed the title Initial draft of adding revision at top level of resource objects #60… Revision Added as a Top-Level Member of Resource Apr 3, 2016
@@ -159,6 +159,7 @@ the client and represents a new resource to be created on the server.

In addition, a resource object **MAY** contain any of these top-level members:

* `revision`: a [revision string][revision] representing a specific version of an individual resource uniquely identified by `type` and `id`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could also be a recommendation as a resource meta

@raat1979
Copy link

raat1979 commented May 2, 2016

Hi
I only just started exploring the json-api, and was looking for a way to prevent parallel updates (optimistic locking) and found this discussion on revisions.
From my point of view revisions are different versions of the same document where it is possible to alter them individually. (wiki/git branches etc.)
If we are altering database records for example we would be much better of with a record timestamp or something which does not suggest there is some sort of version control.
In any case the timestamp does not match an error must be raised, for a revision this should work differently.

@scott-mcdonald scott-mcdonald mentioned this pull request Oct 11, 2017
17 tasks
bart-degreed pushed a commit to json-api-dotnet/JsonApiDotNetCore that referenced this pull request Jan 10, 2021
Because we fetch the row before update and apply changes on that, a concurrency violation is only reported when two concurrent requests update the same row in parallel. Instead, we want to produce an error if the token sent by the user does not match the stored token. To do that, we need to convince EF Core to use that as original version. That's not too hard.

Now the problem is that there is no way to send the token for relationships or deleting a resource. Skipped tests have been added to demonstrate this.

We could fetch such related rows upfront to work around that, but that kinda defeats the purpose of using concurrency tokens in the first place. It may be more correct to fail when a user is trying to add a related resource that has changed since it was fetched. This reasoning may be a bit too puristic and impractical, but at least that's how EF Core seems to handle it.

Solutions considerations:
- Add 'version' to resource identifier object, so the client can send it. The spec does not explicitly forbid adding custom fields, however 'meta' would probably be the recommended approach. Instead of extending the definition, we could encode it in the StringId.
- Once we have access to that token value, we need to somehow map that to 'the' resource property. What if there are multiple concurrency token properties on a resource? And depending on the database used, this could be typed as numeric, guid, timestamp, binary or something else.
- Given that PostgreSQL uses a number (uint xmin), should we obfuscate or even encrypt that? If the latter, we need to add an option for api developers to set the encryption key.

See also:
json-api/json-api#600
json-api/json-api#824
bart-degreed pushed a commit to json-api-dotnet/JsonApiDotNetCore that referenced this pull request Nov 8, 2021
*Update (2021-11-08): Rebased on latest changes in master branch*

Because we fetch the row before update and apply changes on that, a concurrency violation is only reported when two concurrent requests update the same row in parallel. Instead, we want to produce an error if the token sent by the user does not match the stored token. To do that, we need to convince EF Core to use that as original version. That's not too hard.

Now the problem is that there is no way to send the token for relationships or deleting a resource. Skipped tests have been added to demonstrate this.

We could fetch such related rows upfront to work around that, but that kinda defeats the purpose of using concurrency tokens in the first place. It may be more correct to fail when a user is trying to add a related resource that has changed since it was fetched. This reasoning may be a bit too puristic and impractical, but at least that's how EF Core seems to handle it.

Solutions considerations:
- Add 'version' to resource identifier object, so the client can send it. The spec does not explicitly forbid adding custom fields, however 'meta' would probably be the recommended approach. Instead of extending the definition, we could encode it in the StringId.
- Once we have access to that token value, we need to somehow map that to 'the' resource property. What if there are multiple concurrency token properties on a resource? And depending on the database used, this could be typed as numeric, guid, timestamp, binary or something else.
- Given that PostgreSQL uses a number (uint xmin), should we obfuscate or even encrypt that? If the latter, we need to add an option for api developers to set the encryption key.

See also:
json-api/json-api#600
json-api/json-api#824
bart-degreed pushed a commit to json-api-dotnet/JsonApiDotNetCore that referenced this pull request Nov 9, 2021
*Update (2021-11-08): Rebased on latest changes in master branch*

Because we fetch the row before update and apply changes on that, a concurrency violation is only reported when two concurrent requests update the same row in parallel. Instead, we want to produce an error if the token sent by the user does not match the stored token. To do that, we need to convince EF Core to use that as original version. That's not too hard.

Now the problem is that there is no way to send the token for relationships or deleting a resource. Skipped tests have been added to demonstrate this.

We could fetch such related rows upfront to work around that, but that kinda defeats the purpose of using concurrency tokens in the first place. It may be more correct to fail when a user is trying to add a related resource that has changed since it was fetched. This reasoning may be a bit too puristic and impractical, but at least that's how EF Core seems to handle it.

Solutions considerations:
- Add 'version' to resource identifier object, so the client can send it. The spec does not explicitly forbid adding custom fields, however 'meta' would probably be the recommended approach. Instead of extending the definition, we could encode it in the StringId.
- Once we have access to that token value, we need to somehow map that to 'the' resource property. What if there are multiple concurrency token properties on a resource? And depending on the database used, this could be typed as numeric, guid, timestamp, binary or something else.
- Given that PostgreSQL uses a number (uint xmin), should we obfuscate or even encrypt that? If the latter, we need to add an option for api developers to set the encryption key.

See also:
json-api/json-api#600
json-api/json-api#824
bart-degreed pushed a commit to json-api-dotnet/JsonApiDotNetCore that referenced this pull request Nov 9, 2021
*Update (2021-11-08): Rebased on latest changes in master branch*

Because we fetch the row before update and apply changes on that, a concurrency violation is only reported when two concurrent requests update the same row in parallel. Instead, we want to produce an error if the token sent by the user does not match the stored token. To do that, we need to convince EF Core to use that as original version. That's not too hard.

Now the problem is that there is no way to send the token for relationships or deleting a resource. Skipped tests have been added to demonstrate this.

We could fetch such related rows upfront to work around that, but that kinda defeats the purpose of using concurrency tokens in the first place. It may be more correct to fail when a user is trying to add a related resource that has changed since it was fetched. This reasoning may be a bit too puristic and impractical, but at least that's how EF Core seems to handle it.

Solutions considerations:
- Add 'version' to resource identifier object, so the client can send it. The spec does not explicitly forbid adding custom fields, however 'meta' would probably be the recommended approach. Instead of extending the definition, we could encode it in the StringId.
- Once we have access to that token value, we need to somehow map that to 'the' resource property. What if there are multiple concurrency token properties on a resource? And depending on the database used, this could be typed as numeric, guid, timestamp, binary or something else.
- Given that PostgreSQL uses a number (uint xmin), should we obfuscate or even encrypt that? If the latter, we need to add an option for api developers to set the encryption key.

See also:
json-api/json-api#600
json-api/json-api#824
bart-degreed pushed a commit to json-api-dotnet/JsonApiDotNetCore that referenced this pull request Nov 23, 2021
*Update (2021-11-08): Rebased on latest changes in master branch*

Because we fetch the row before update and apply changes on that, a concurrency violation is only reported when two concurrent requests update the same row in parallel. Instead, we want to produce an error if the token sent by the user does not match the stored token. To do that, we need to convince EF Core to use that as original version. That's not too hard.

Now the problem is that there is no way to send the token for relationships or deleting a resource. Skipped tests have been added to demonstrate this.

We could fetch such related rows upfront to work around that, but that kinda defeats the purpose of using concurrency tokens in the first place. It may be more correct to fail when a user is trying to add a related resource that has changed since it was fetched. This reasoning may be a bit too puristic and impractical, but at least that's how EF Core seems to handle it.

Solutions considerations:
- Add 'version' to resource identifier object, so the client can send it. The spec does not explicitly forbid adding custom fields, however 'meta' would probably be the recommended approach. Instead of extending the definition, we could encode it in the StringId.
- Once we have access to that token value, we need to somehow map that to 'the' resource property. What if there are multiple concurrency token properties on a resource? And depending on the database used, this could be typed as numeric, guid, timestamp, binary or something else.
- Given that PostgreSQL uses a number (uint xmin), should we obfuscate or even encrypt that? If the latter, we need to add an option for api developers to set the encryption key.

See also:
json-api/json-api#600
json-api/json-api#824
@jelhan jelhan added the extension Related to existing and proposed extensions as well as extensions in general label Feb 6, 2022
@jelhan
Copy link
Contributor

jelhan commented Feb 6, 2022

I think this could be very well addressed using an extension introduced by v1.1. This would also give room to experiment as such an extension does not require changes to the base spec. Overall it should speed up the process. We could later promote such an extension to an official extension, which should provide similar benefits as adding to the base spec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension Related to existing and proposed extensions as well as extensions in general
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants