Clarify invocation process (or the need for request token) #140

Open
burdiyan opened this issue Jan 17, 2023 · 8 comments
Comments

@burdiyan

burdiyan commented Jan 17, 2023

There was a similar question before #3 (comment), which linked to another issue in which I asked my questions, but maybe because it's a closed issue it won't get noticed, so I'm creating a new one.

I was trying to understand the UCAN spec better, and just by reading the website and the spec, it wasn't immediately clear to me how the invocation procedure was supposed to work. Namely, it wasn't clear to me that the caller has to create a new request-specific token when calling the "protected" resource. I only understood that clearly after reading UCAN.Storage Spec.

Beyond that, I have some further questions about the need for the request token in general. I think it's easier to explain with a particular example, but I guess it boils down to asking "is there a need for an authenticated channel between the protected resource and the caller?".

  1. Say there's a Web Server which issues a RootToken to Alice (issuer: Web Server, audience: Alice). This token allows its holder to do anything.
  2. Then Alice gives an AttenuatedToken to Bob (issuer: Alice, audience: Bob, proofs: RootToken).
  3. Then Bob wants to call the Web Server.

According to UCAN Spec as I understand it, Bob would have to issue a new request token (issuer: Bob, audience: Web Server, proofs: AttenuatedTokenFromAlice). And here's where I wonder if the Web Server must verify that the caller is actually Bob, and not someone else (or anonymous).
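The three tokens in this example can be sketched as plain data to make the issuer/audience linkage concrete. This is only an illustration: the `Token` class, the `chain_is_linked` helper, and the `did:example:...` identifiers are hypothetical, each token carries a single proof, and signature, expiry, and capability checks are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    """Simplified stand-in for a UCAN: issuer, audience, one optional proof."""
    iss: str
    aud: str
    prf: Optional["Token"] = None


def chain_is_linked(token: Token, root_issuer: str) -> bool:
    """Check that every token was issued by the audience of its proof and
    that the chain ends at a token issued by root_issuer.

    Assumption: signatures, expiry, and capabilities are NOT checked here.
    """
    current = token
    while current.prf is not None:
        if current.iss != current.prf.aud:
            return False
        current = current.prf
    return current.iss == root_issuer


# Hypothetical identifiers, used only for illustration.
SERVER = "did:example:server"
ALICE = "did:example:alice"
BOB = "did:example:bob"

root_token = Token(iss=SERVER, aud=ALICE)
attenuated_token = Token(iss=ALICE, aud=BOB, prf=root_token)
request_token = Token(iss=BOB, aud=SERVER, prf=attenuated_token)
```

With this model, `chain_is_linked(request_token, SERVER)` holds, while a request token issued by anyone other than Bob on top of `attenuated_token` fails the linkage check.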

When I'm talking about validating the caller, I mean that somehow the Web Server knows the public key of the caller, and for example the actual request could be signed by the caller's private key.

My reasoning:

  1. If the server doesn't need to authenticate the caller, then why is there a need for a request token? The AttenuatedToken seems to be as good as the request token, because if the Server doesn't verify the issuer, then any issuer would work if the proof chain leads to the actual root token. Am I missing something here?
  2. And if the server does need to validate the issuer (e.g. by verifying the signature of the request payload), then again, I don't understand why the request token is necessary. The Web Server could just as well accept the AttenuatedToken, verify that the proof chain leads to the root token, and check that the caller corresponds to the audience of the AttenuatedToken.
@cdata
Member

cdata commented Jan 20, 2023

I will try to speak to this issue, but I suspect that @expede @matheus23 and @Gozala would do a better job, particularly when it comes to pointing out answers or recognizing ambiguities in the spec text as written.

Namely, it wasn't clear to me that the caller has to create a new request-specific token when calling the "protected" resource. I only understood that clearly after reading UCAN.Storage Spec.

This is good feedback. I remember this being unclear in the beginning as well. I remember that reading this FAQ at the end of the spec clarified the usage for me: https://github.com/ucan-wg/spec#122-what-prevents-replay-attacks-on-the-invocation-use-case

According to UCAN Spec as I understand it, Bob would have to issue a new request token (issuer: Bob, audience: Web Server, proofs: AttenuatedTokenFromAlice). And here's where I wonder if the Web Server must verify that the caller is actually Bob, and not someone else (or anonymous).

The use case that suits UCAN is: Web Server does not assume to know Bob (or any DID in the chain besides Alice) in advance. In other words, it trusts Bob because Alice trusts Bob.

If the server doesn't need to authenticate the caller, then why is there a need for a request token? The AttenuatedToken seems to be as good as the request token, because if the Server doesn't verify the issuer, then any issuer would work if the proof chain leads to the actual root token. Am I missing something here?

The server does need to authenticate the caller, by verifying that the caller Bob was authorized by Alice (which is proven by a signature). Alice, in turn, was authorized by Web Server (per your example; also proven by signature). If the AttenuatedToken is intended for invocation, it may not be "as good" as the original UCAN given to Alice. It may have a smaller scope of capabilities, and it may be time limited. Also, in the typical invocation case, the server will note that the UCAN has been seen once and won't accept the same one again.

Libraries help you with this. For example, in rs-ucan the proof chain is cryptographically verified on your behalf as part of deserialization of the UCAN and reading the capabilities it enables.

any issuer would work if the proof chain leads to the actual root token

This is correct, and considered a virtue in the context of UCANs. However, nothing is stopping you from centrally managing keys such that proof chains are limited to keys you have already vetted. But, a relatively centralized solution may be more appropriate if that is your use case.

And if the server does need to validate the issuer (e.g. by verifying the signature of the request payload), then again, I don't understand why the request token is necessary. The Web Server could just as well accept the AttenuatedToken, verify that the proof chain leads to the root token, and check that the caller corresponds to the audience of the AttenuatedToken.

I think I may not totally understand what you mean, but I will offer:

There is no specific need for a token to be included within a transaction, e.g. when invoking a REST API. But the Web Server does need to have all issued UCANs in the proof chain in order to verify that the claimed capabilities of Bob in fact originated from the Web Server and were passed along by all intermediaries in the chain. And, per the spec, an invocation token ought to be unique (https://github.com/ucan-wg/spec#622-token-uniqueness), so Bob will need to somehow give the Web Server a unique UCAN for each invocation.

@burdiyan
Author

Hi @cdata! Thank you for taking the time to respond!

I think I'm still a bit confused, so I'll try to clarify my points.

The server does need to authenticate the caller, by verifying that the caller Bob was authorized by Alice (which is proven by a signature). Alice, in turn, was authorized by Web Server (per your example; also proven by signature).

In my mind this doesn't authenticate Bob to the Web Server, because the Web Server has no way to prove that it's Bob who's talking right now. But if Bob signed the entire request with his private key, or if there were some challenge-response authentication beforehand, then the Web Server could prove Bob's identity, in my understanding.

Getting to your point:

Web Server does not assume to know Bob (or any DID in the chain besides Alice) in advance. In other words, it trusts Bob because Alice trusts Bob.

That is totally clear to me, and I think it doesn't go against what I said above. The Web Server doesn't need to know Bob in advance. It only needs a way to verify that Bob is actually Bob, and that Alice trusts Bob, hence the Web Server can trust Bob as well. It seems to me that the whole protocol would be much simpler if Bob had to authenticate himself somehow (e.g. by signing the request, or by doing the challenge-response auth described above).

So, it's not that the Web Server must know Bob in advance to trust him. It's that the Web Server would have to first make sure that the one who's supposedly Bob is actually Bob, and then the Server would follow the delegation chain to get to a point where it can see that Bob uses a token that is derived from something initially issued by the Server itself.

If the AttenuatedToken is intended for invocation, it may not be "as good" as the original UCAN given to Alice. It may have a smaller scope of capabilities, and it may be time limited. Also, in the typical invocation case, the server will note that the UCAN has been seen once and won't accept the same one again.

I think I wasn't clear about this initially. I wasn't talking about the AttenuatedToken being as good as the RootToken. I was talking about the AttenuatedToken being as good as the RequestToken if authentication is required between the Web Server and Bob.

It's clear that the RootToken issued by the Web Server to Alice is more "powerful" than the AttenuatedToken. My point here is that Bob could use the AttenuatedToken directly without having to create a new RequestToken. But that would work only if Bob has to prove that he actually possesses the private key for the DID that Alice issued the AttenuatedToken to. This is what I meant by the RequestToken being as good as the AttenuatedToken.

Also, in the typical invocation case, the server will note that the UCAN has been seen once and won't accept the same one again.

Thanks for clarifying this. I didn't realize this before. And I think now this is what my main point is about.

If the Web Server must store all the unexpired tokens that it has seen, this seems to me quite a burden. Can't this even become an attack vector? E.g. one could issue request tokens with a long expiration time, making the Web Server store them for all that time.

What's even more confusing for me about this whole scheme is that it seems like quite an unconventional use of a JWT. To be fair, the spec clarifies this part where it says that UCAN tokens are not simple bearer tokens like JWTs. But the confusing part is that a UCAN is "disguised" as a JWT (which carries a bearer-token connotation), while a UCAN is not a bearer token.

So, here's my main point: wouldn't the whole scheme become much simpler if Bob had to use the AttenuatedToken directly, and prove to the Web Server that he possesses the private key of the DID to which this AttenuatedToken was delegated?

Something like this:

+-----------+     +-------+                        +-----+
| WebServer |     | Alice |                        | Bob |
+-----------+     +-------+                        +-----+
      |               |                               |
      | RootToken     |                               |
      |-------------->|                               |
      |               |                               |
      |               | AttenuatedToken(RootToken)    |
      |               |------------------------------>|
      |               |                               |
      |          AttenuatedToken + Proof that I'm Bob |
      |<----------------------------------------------|
      |               |                               |

This way, the Web Server doesn't need to "remember" all the UCANs it has seen. Not even the RootToken issued to Alice.

Not having to remember tokens was kind of the whole point of JWT-based auth schemes initially, IMO (I can verify that the token is signed by me, so I don't need to keep track of the tokens I issue, unless one needs to be revoked explicitly). And this approach would follow the same idea, adding delegation capabilities to the original scheme.

I believe this is how the ZCAP spec is supposed to work, and this is probably the main difference between UCAN and ZCAP conceptually (besides the format of the messages).

@expede
Member

expede commented Jan 27, 2023

Hey @burdiyan; thanks for the questions!

it wasn't immediately clear to me how the invocation procedure was supposed to work

Today most people signal their intention to perform an action via an HTTP endpoint, and pass the UCAN in the authorization header. We're just wrapping up formalizing a UCAN-based RPC that doesn't depend on custom HTTP endpoints, but either works.
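As a rough sketch of the HTTP approach described above, the request below carries the encoded UCAN in the Authorization header. The `Bearer` scheme, the endpoint URL, the POST method, and the `build_invocation_request` helper are assumptions made for illustration; the comment above only states that the UCAN goes in the authorization header. Nothing is sent over the network here.

```python
import urllib.request


def build_invocation_request(url: str, encoded_ucan: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request that carries a UCAN.

    Assumption: the token is passed with the "Bearer" scheme.
    """
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {encoded_ucan}"},
    )


# Hypothetical endpoint and placeholder token.
request = build_invocation_request(
    "https://example.com/api/resource", "<encoded-ucan>"
)
```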

According to UCAN Spec as I understand it, Bob would have to issue a new request token (issuer: Bob, audience: Web Server, proofs: AttenuatedTokenFromAlice). And here's where I wonder if the Web Server must verify that the caller is actually Bob, and not someone else (or anonymous).

UCAN depends only on public key infrastructure (PKI). Unlike OAuth, it's not interested at all in who the human is that's performing the action, just that there's an unbroken chain of authentication.

The Web Server could just as well accept the AttenuatedToken and verifying that proof chain leads to root token, and that the caller corresponds to the audience of the AttenuatedToken.

If I'm understanding you correctly, then there's no problem here: this is how it works! Evidently that isn't clear in the text. After we make sure that your question is answered, I'd love to pick your brain about how the wires got crossed there.

But if Bob signed the entire request with his private key, or if there were some challenge-response authentication beforehand

The outermost UCAN has an audience of your server, and is signed with Bob's private key. It's clear who is being asked to perform the action, and no one else could have made that request (no PITM). Presumably you're making this request over HTTPS, so the channel is also secured and signed over transparently by your HTTP library.

Invoked UCANs are one-time use, so you "should" track the hashes of previously invoked UCANs so that they don't get replayed. You can expire those caches as the UCANs themselves expire (the exp field).
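A minimal sketch of such a cache, assuming SHA-256 as the hash and an in-memory dictionary as the store: the `SeenInvocations` class and its `accept` method are hypothetical names, and the `exp` value is assumed to come from a token whose signature and proof chain were already verified elsewhere.

```python
import hashlib
import time
from typing import Dict, Optional


class SeenInvocations:
    """Remembers hashes of invoked tokens until those tokens expire."""

    def __init__(self) -> None:
        # Maps a 32-byte SHA-256 digest to the token's expiry (Unix seconds).
        self._expiry_by_hash: Dict[bytes, float] = {}

    def accept(self, encoded_ucan: str, exp: float,
               now: Optional[float] = None) -> bool:
        """Return True the first time an unexpired token is presented."""
        now = time.time() if now is None else now
        # Forget entries whose tokens have expired; they can no longer
        # be replayed because the expiry check below rejects them.
        self._expiry_by_hash = {
            digest: expiry
            for digest, expiry in self._expiry_by_hash.items()
            if expiry > now
        }
        if exp <= now:
            return False  # expired token
        digest = hashlib.sha256(encoded_ucan.encode("utf-8")).digest()
        if digest in self._expiry_by_hash:
            return False  # replay of an already invoked token
        self._expiry_by_hash[digest] = exp
        return True
```

Because an entry is dropped as soon as its token expires, the cache never holds more than the set of currently valid tokens that have been invoked.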

If the Web Server must store all the unexpired tokens that it has seen, this seems to me quite a burden. Can't this even become an attack vector? E.g. one could issue request tokens with a long expiration time, making the Web Server store them for all that time.

Each hash is 32 bytes, so they'd need to send a LOT of UCANs, and they probably have more efficient methods of filling up your database (just HTTP logging the request alone will be larger than this). This falls into the same category of problem as anyone spamming your service: you probably want to rate limit them if they attempt to DoS you.

This way, the Web Server doesn't need to "remember" all the UCANs it has seen. Not even the RootToken issued to Alice.

In the picture as presented, you're open to replay attacks. You're right that zcap-ld doesn't mention this, which seems like an oversight. I'll drop Christine a line, and see if that was intentional. In either case, if you don't track a previously seen token, you're open to replay attacks, which is a choice that you're allowed to make in your implementation.

@expede
Member

expede commented Jan 27, 2023

From the zcap spec (at the very bottom)

For example, Alice has granted capabilities (with caveats that permit revocation) to a number of users access to run programs on the systems she is administrating, but someone is using their capability to abuse the system. The system itself performs the invocations as the users call them, but Alice, or a program Alice runs, is able to analyze a log of past invocations called. Having finished the analysis, Alice realizes that the abuse is coming from the invocation of a capability granted to some of Mallet's authentication material, and Alice revokes that capability.

[Emphasis mine]

i.e. zcap also requires logging the previous tokens to prevent replay attacks and other abuse

@Gozala
Contributor

Gozala commented Jan 27, 2023

For what it's worth, we're not actively remembering seen UCANs. We simply make invocation UCANs short-lived (~30 seconds), which reduces attack vectors quite a bit. In our case invocation UCANs also encode all of the input in the nb field, which means that multiple invocations are only a problem if operations aren't idempotent. In those cases you can often design around them, e.g. by adding revision info of the state you're mutating to the invocation; that way you'll just refuse mutations if the revision no longer matches.
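The revision idea can be sketched as a compare-and-set on the mutated state. The `Store` and `Record` classes and the `expected_revision` parameter are hypothetical; in the setup described above, the revision would travel inside the invocation's input rather than as a separate argument.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Record:
    value: str = ""
    revision: int = 0


@dataclass
class Store:
    records: Dict[str, Record] = field(default_factory=dict)

    def apply(self, key: str, new_value: str, expected_revision: int) -> bool:
        """Apply a mutation only if it targets the current revision.

        A replayed invocation names a revision that is already stale,
        so it is refused without the server remembering the token.
        """
        record = self.records.setdefault(key, Record())
        if record.revision != expected_revision:
            return False
        record.value = new_value
        record.revision += 1
        return True
```

Replaying the same mutation a second time returns `False`, because the first application already advanced the revision.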

@jeffgca

jeffgca commented Jan 27, 2023

For what it's worth, we're not actively remembering seen UCANs. We simply make invocation UCANs short-lived (~30 seconds), which reduces attack vectors quite a bit. In our case invocation UCANs also encode all of the input in the nb field, which means that multiple invocations are only a problem if operations aren't idempotent. In those cases you can often design around them, e.g. by adding revision info of the state you're mutating to the invocation; that way you'll just refuse mutations if the revision no longer matches.

If this eventually becomes a problem (you have large numbers of requests that aren't idempotent and could be maliciously or accidentally replayed), I think a combination of medium-length expiry values and a medium-length UCAN cache could be useful, e.g. all requests have a 30-minute TTL and you keep the last 35 minutes of tokens around. This way you can cheaply reject very old tokens and only have to do the work to invalidate recent and replayed UCANs via the lookup cache.
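One way to read this suggestion is as a fixed-size time window, sketched below. The 30-minute and 35-minute figures come from the comment; the `RecentTokenWindow` class, the `issued_at` parameter, and the string results are hypothetical, and the sketch assumes `now` never goes backwards between calls.

```python
from collections import OrderedDict

MAX_TTL_SECONDS = 30 * 60    # tokens older than this are rejected outright
RETENTION_SECONDS = 35 * 60  # how long seen tokens stay in the cache


class RecentTokenWindow:
    def __init__(self) -> None:
        # token id -> time it was first seen, oldest entry first
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def check(self, token_id: str, issued_at: float, now: float) -> str:
        # Evict entries that have fallen out of the retention window.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at <= RETENTION_SECONDS:
                break
            self._seen.popitem(last=False)
        # Cheap check first: very old tokens never reach the cache lookup.
        if now - issued_at > MAX_TTL_SECONDS:
            return "expired"
        if token_id in self._seen:
            return "replayed"
        self._seen[token_id] = now
        return "accepted"
```

Keeping the retention window longer than the TTL means any token that has been evicted from the cache is already past its TTL, so it is still rejected by the cheap check.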

Systems doing this kind of thing should probably have utilities implemented in them that automagically request fresh UCANs instead of making requests with expired ones.

@burdiyan
Author

burdiyan commented Apr 25, 2023

I just bumped into this thread again accidentally, and realized that I haven't provided any response, sorry for that.

As a reminder, I was questioning the current UCAN invocation procedure, and the need for the request token (i.e. when the holder of the UCAN creates a new token when making the request to the server - the actual protected resource).

I totally understand the point of replay attacks and other things mentioned here. But I wanted to share a use case that might be beyond the realm of request-response communication, and could justify my point better.

We at Mintter work on a system where data is immutable, and there's basically no request-response. Users create messages, sign them, and make them available to the public. Then other users discover these messages, and interpret them locally however they like. Messages basically stay valid forever. There seems to be no place for replay attacks or any other concerns where a request token can be useful. The message is self-contained: it has the signer identity, the signature, and everything that's needed to validate it.

We totally share the "don't share keys between devices" premise, so we have one Account Key that delegates the authority to separate Device Keys. The Account Key is only used briefly to create the delegation, and never gets persisted.

So, the idea is that these messages are signed with the Device Key, but treated as if created by the identity of the Account Key. And with this setup, it seems to me that having to create the request token is wasteful, and totally unnecessary:

  • the whole payload is signed by the Device Key, so we can be certain it wasn't tampered with;
  • we have the Account ID in the message itself, so we also know that this Device Key acts on behalf of this Account Key;
  • so we just need to make sure that the Account Key has actually delegated the permissions to this Device Key, and for that we could either track these UCANs out-of-band, or include the CID of the UCAN in the message itself.
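The three checks listed above could be sketched roughly as follows. All names here (`Message`, `Delegation`, `is_valid`, the `delegations` lookup keyed by CID) are hypothetical, and signature verification is passed in as a callable because the actual key type and signing scheme are not specified in this thread.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Delegation:
    issuer: str    # Account Key identifier
    audience: str  # Device Key identifier


@dataclass(frozen=True)
class Message:
    account_id: str      # account the message claims to speak for
    device_key: str      # key that actually signed the payload
    delegation_cid: str  # CID of the UCAN delegating account -> device
    payload: bytes
    signature: bytes


def is_valid(
    message: Message,
    delegations: Dict[str, Delegation],
    verify_signature: Callable[[str, bytes, bytes], bool],
) -> bool:
    # 1. The payload must be signed by the Device Key.
    if not verify_signature(message.device_key, message.payload, message.signature):
        return False
    # 2. The referenced delegation must be known (tracked out-of-band here).
    delegation = delegations.get(message.delegation_cid)
    if delegation is None:
        return False
    # 3. The delegation must go from this Account Key to this Device Key.
    return (
        delegation.issuer == message.account_id
        and delegation.audience == message.device_key
    )
```

Each client can run this check locally when it discovers a message, without any request token being created by the device.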

So these messages are like "deferred" invocations, plus they are interpreted by each client locally, instead of being presented back to the resource owner.

If you feel that it's a legitimate use case, maybe this could be mentioned in the spec as an option, and as an alternative to the existing invocation procedure? We could totally just use UCANs this way, but it seems like we're better off doing something bespoke than using a specification in a way that's not specified.

EDITED:

I haven't read the Invocation Spec yet, maybe some of this is addressed there.

@Gozala
Contributor

Gozala commented Apr 26, 2023

To quote the invocation spec https://github.com/ucan-wg/invocation/#1-introduction

Just because you can doesn't mean that you should

I think that is a good insight here: it basically differentiates between delegating something to someone and requesting or submitting a message to be processed. @burdiyan I would suggest embracing invocations as a way to publish those messages, even though they're only semantically different from delegations.

On a related note, your system reminds me of the design we have considered for the DAG publishing protocol. There we intentionally described messages not as UCANs, so that agents (devices in your case) can build up their own message logs and only create a UCAN invocation when publishing a new head. This had a couple of benefits:

  1. You can roll up multiple messages into a single publish.
  2. Any actor could fork a message stream and publish into their own DID.
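The general idea of a message log that is only published by its head can be illustrated with a hash-linked list. This is a generic sketch and not the DAG publishing protocol itself: the entry layout, the SHA-256 identifiers, and the `append_entry` and `head` helpers are all assumptions.

```python
import hashlib
import json
from typing import List, Optional


def append_entry(log: List[dict], body: str) -> str:
    """Append a message that links to the previous entry by hash."""
    previous = log[-1]["id"] if log else None
    encoded = json.dumps(
        {"body": body, "previous": previous}, sort_keys=True
    ).encode("utf-8")
    entry_id = hashlib.sha256(encoded).hexdigest()
    log.append({"id": entry_id, "body": body, "previous": previous})
    return entry_id


def head(log: List[dict]) -> Optional[str]:
    """The only value that would need a UCAN invocation to publish."""
    return log[-1]["id"] if log else None
```

Several `append_entry` calls can precede a single publish of `head(log)`, and a fork is simply a copy of the log that continues with its own entries and therefore ends in a different head.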

Not sure if the above is relevant or useful to you, but thought I'd mention.
