
(not for merge) first-cut pub-sub proposal #513

Draft: wants to merge 4 commits into master

Conversation

@meejah (Contributor) commented Jul 17, 2018

Not yet ready for merge; for discussion purposes.



@crwood (Member) left a comment

A couple of comments...

Magic Folders in Tahoe currently poll, looking for updates. This
involves periodically downloading some mutable directory-capabilities
and determining if anything has changed. This is wasteful of bandwidth
and server resources. It would be nice if a client, Alice, could tell
Member:

It is wasteful of client resources too. :)

(Maybe remove "server" and just say "resources"?)


The above mechanism would scale somewhat poorly as the number of
"broadcast" capabilities increased. It would be better if the
storage-server could *tell* clients when a mutable capability has
Member:

Hmm... Maybe I'm misunderstanding how tahoe's mutables work under the hood here, but shouldn't the storage node only be notifying client nodes when a share has updated/changed (and not a full capability)? Reasons being: 1) we should not ever want or encourage users/clients to expose the contents of their files to storage nodes (which, in this case, happens since the storage node receives the cap in the "subscribe" message below) and 2) in an introducerless environment, storage nodes have no knowledge of (or connection to) the other storage nodes that comprise the "grid" (and thus may not be able to tell when a capability has updated unless they're also "subscribed" to the other storage nodes -- which is inefficient -- and/or have some way of monitoring how/where shares are placed on other nodes -- which requires granting excess authority, etc.)...

Contributor Author (meejah):

Ah, yes, good catch Chris :) should have had that second coffee before pushing this ;)

Yes, storage-servers only really know about shares. I'll have to look again at the mutable code/specs but it might make sense to trigger off "slots". And you're right we should definitely NOT be giving an actual capability to a server :/

@codecov-io commented Jul 18, 2018

Codecov Report

Merging #513 into master will increase coverage by 2.02%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master     #513      +/-   ##
==========================================
+ Coverage   83.11%   85.14%   +2.02%     
==========================================
  Files         152      152              
  Lines       28049    28049              
  Branches     4013     4013              
==========================================
+ Hits        23313    23881     +568     
+ Misses       4045     3471     -574     
- Partials      691      697       +6
Impacted Files Coverage Δ
src/allmydata/web/status.py 83.2% <0%> (-0.23%) ⬇️
src/allmydata/uri.py 92.96% <0%> (+0.31%) ⬆️
src/allmydata/util/encodingutil.py 90.05% <0%> (+0.52%) ⬆️
src/allmydata/immutable/upload.py 95.53% <0%> (+0.55%) ⬆️
src/allmydata/util/fileutil.py 50.56% <0%> (+0.56%) ⬆️
src/allmydata/immutable/encode.py 95.94% <0%> (+0.95%) ⬆️
src/allmydata/mutable/publish.py 96.65% <0%> (+1.01%) ⬆️
src/allmydata/nodemaker.py 100% <0%> (+2.06%) ⬆️
src/allmydata/immutable/literal.py 91.35% <0%> (+3.7%) ⬆️
src/allmydata/util/consumer.py 100% <0%> (+8%) ⬆️
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update efa0b26...be8c973.

@tpltnt (Contributor) commented Jun 26, 2019

This PR is almost a year old. How should we proceed?

@exarkun (Member) left a comment

Thanks. Sorry about the long delay in providing feedback. :)

I've left some comments inline. Generally I think this looks like a good and workable idea. It would probably be useful to have a ticket in the issue tracker describing the big picture, outlining the problem to be solved, etc.? I looked but didn't find one.


- for a given Slot, does a client just ask every storage-server for
the corresponding Storage Index? (are there information-disclosure
issues here?)
Member:

I wonder how difficult it is for clients to coalesce N update notifications (one from each server to which it subscribes) into what is likely usually 1 underlying mutation to the slot data.

Relatedly, I wonder how a client decides which storage servers to subscribe to for updates. SDMF and MDMF capabilities don't embed n/k information, right? I guess this is just a specific example of a more general problem where the client has to rely on the node-wide n/k configuration and hope that it is valid for each specific capability it interacts with.

Also related, the client won't be able to retrieve the updated content until at least n storage servers have the updated content, so I guess the client will have to wait for at least n updates from n different storage servers before trying to retrieve the updated content. In the face of concurrent writes, this still doesn't give the client an unambiguous signal - but concurrent writes are fraught anyway and it's likely no one should be attempting them?

It would be nice if we could avoid "every storage server", though. "Every storage server" makes life difficult for any large-scale storage provider. Perhaps "the top k (connectable) servers in the permuted list for the storage index" is enough? What I really mean is "exactly the servers that the writer is going to send updates to" - assuming it is possible for a reader to compute that same list?
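
To make the coalescing idea above concrete, here is a minimal client-side sketch. The debounce window, callback name, and overall shape are illustrative assumptions for discussion, not anything specified in this PR.

    import time

    MAX_QUIET_SECONDS = 5.0   # assumed debounce window; not specified anywhere in the proposal

    class UpdateCoalescer:
        """Collapse duplicate per-server notifications for one storage index
        into a single local "slot changed" event."""

        def __init__(self, on_slot_changed):
            self._on_slot_changed = on_slot_changed   # callback name is illustrative
            self._last_seen = {}     # storage_index -> time of most recent notification
            self._pending = set()    # storage indexes waiting to be flushed

        def notification_received(self, storage_index):
            # Called once per storage-server notification; several servers will
            # usually be reporting the same underlying mutation.
            self._last_seen[storage_index] = time.monotonic()
            self._pending.add(storage_index)

        def flush(self):
            # Called periodically (e.g. from a timer). Fires one local event per
            # storage index once it has been quiet for the debounce window.
            now = time.monotonic()
            for si in list(self._pending):
                if now - self._last_seen[si] >= MAX_QUIET_SECONDS:
                    self._pending.discard(si)
                    self._on_slot_changed(si)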

Contributor Author (meejah):

> Relatedly, I wonder how a client decides which storage servers to subscribe to for updates.

Since this is for updates to existing mutables, we'll know for sure, no? (That is, when we ask for the updates, that server will know if it has those or not.) As an optimization we could say something like "clients MAY choose to only connect to N or more servers for the mutables they're interested in".

And yes, we should include more words about N / K issues I think (that is, the client must contact at least N (and at most K or "happy") servers and only accept that an update "has happened" when N say they've gotten it).

Still, for a version 0 I think the total bandwidth will be a lot lower if we just connect to every storage-server we know about (versus contacting N storage-servers that have a share of that mutable every X minutes). I guess the proposal could be strengthened by including some math about this point ;)
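
One possible shape for the "only accept that an update has happened once enough servers report it" rule discussed above; the threshold value and callback name are assumptions for illustration only, since the N/K wording is still open in this PR.

    from collections import defaultdict

    class ConfirmationCounter:
        """Report a slot as updated only once `threshold` distinct storage
        servers have announced the change."""

        def __init__(self, threshold, on_confirmed):
            self._threshold = threshold          # e.g. the number of needed shares
            self._on_confirmed = on_confirmed    # callback name is illustrative
            self._reporters = defaultdict(set)   # storage_index -> announcing server ids

        def notification(self, server_id, storage_index):
            seen = self._reporters[storage_index]
            seen.add(server_id)
            if len(seen) == self._threshold:
                self._on_confirmed(storage_index)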

Contributor Author (meejah):

I've now included words about coalescing (and a maximum-delay option). I haven't addressed any N / K stuff (yet).

@meejah (Contributor Author) commented Jan 13, 2020

I found this ticket: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2555

(It's not that relevant because it uses a Foolscap-based API proposal, and we don't want to introduce new Foolscap APIs).

Member:

Let's steal it.

machine-readable "code" and a human-readable "message"?

All subscriptions for a particular client shall be removed when that
client's WebSocket connection goes away.
Member:

Clients will have to contend with missed updates while they are disconnected. This might not be a big deal for offline times that are planned/expected/intended and non-trivial in duration. It might be a big deal for offline times due to transient network conditions or ongoing poor network quality. If a client disconnects every couple minutes and has no way to receive updates for its time offline then it will basically have to fall back to polling. Maybe these network conditions are worse than what we should be considering, I don't know.

This could be addressed by allowing for a small buffer on the storage-server and some kind of session identifier. Reconnecting clients could try to re-establish their session and receive any missed updates and then only fall back to polling if they have been offline so long the storage server decided to forget about them.

If a server offered just a few minutes of buffer this would probably cover the majority of unintended disconnection. The work saved in not having to service catch-up-polling presumably even offsets the cost of maintaining those buffers, up to some threshold.

It does add complexity to both the client and server implementation of pubsub and it doesn't even save any complexity on the client since clients must still fall back to polling if the server hasn't saved their session. The performance benefits may be worth the extra complexity, though.

It also adds a possible DoS vector on storage servers since it allows clients to commit memory on servers and then release their own resources (disconnect and start a new garbage session, repeat). This could be mitigated by accounting for the total number of sessions on the storage server when deciding how long a session lives, I suppose? This changes the victim of the DoS from the storage server to any storage clients that are counting on their session to avoid having to poll. And without a buffer they have to poll anyway, every time, so ... still a win?
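
A rough sketch of the buffered-session idea described above, on the storage-server side. The session identifier, timeout, and buffer cap are invented for illustration; none of this appears in the proposal text.

    import time
    import uuid
    from collections import deque

    SESSION_TIMEOUT_SECONDS = 300   # assumed: remember sessions for ~5 minutes
    MAX_BUFFERED_UPDATES = 1000     # assumed cap to bound per-session memory

    class _Session(object):
        def __init__(self):
            self.session_id = uuid.uuid4().hex
            self.last_seen = time.monotonic()
            self.missed = deque(maxlen=MAX_BUFFERED_UPDATES)

    class SessionTable(object):
        """Per-subscriber replay buffers so a briefly-disconnected client can
        catch up instead of falling back to polling."""

        def __init__(self):
            self._sessions = {}

        def create(self):
            session = _Session()
            self._sessions[session.session_id] = session
            return session.session_id

        def record_update(self, storage_index):
            # Buffer the update for every session; connected clients would
            # also be notified immediately (not shown here).
            for session in self._sessions.values():
                session.missed.append(storage_index)

        def resume(self, session_id):
            # Return the missed updates, or None if the session has expired
            # and the client must fall back to polling.
            session = self._sessions.get(session_id)
            if session is None or time.monotonic() - session.last_seen > SESSION_TIMEOUT_SECONDS:
                self._sessions.pop(session_id, None)
                return None
            session.last_seen = time.monotonic()
            missed = list(session.missed)
            session.missed.clear()
            return missed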


{
"tahoe-mutable-notification-version": 1,
"update": "storage-index-1"
Member:

Maybe this should be "updates": [...] and storage servers should be allowed to delay notification briefly to possibly bundle up several changes in one notification.

Would this actually be a win? I suppose for a highly active magic-folder you might sometimes see two different peers changing their own version of the files at close to the same time. You're going to have to watch each of their DMDs so it might be nice to get one update that says N clients have changed files. Eh. It sounds pretty thin. Possibly not worth it.
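
For illustration, a bundled notification might look like the following; the version field is borrowed from the snippet above, and the "updates" list is the change suggested in this comment rather than anything in the current proposal text.

    notification = {
        "tahoe-mutable-notification-version": 1,
        "updates": ["storage-index-1", "storage-index-2"],
    }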

Contributor Author (meejah):

As a further thought to that, perhaps the subscribing client could include an option maximum-delay or similar specifying how many seconds the server can wait before sending updates.

Thinking about this, there's huge tension trying to make a "generic" protocol here I feel .. if you're doing real-time collaborative editing, you don't want to wait "at all". In that case, perhaps a direct peer-to-peer style connection is better? On the other hand, if you're NOT doing real-time editing, the delay could be quite a bit longer. All that said, the proposal here does seem more about "real-time" edits since it depends on an active WebSocket connection .. so maybe maximum-delay should be in milliseconds.

Additionally, maybe we should make this more explicit in this spec: there is a mechanism to tell your client to go into "rapid update" mode (and to turn that mode off). When it's on, the client makes active WebSocket connections to the server; when it's off it does periodic polling (or even "nothing, until told otherwise"). Although we don't have a phone application, this would map well to those use-cases (as far as I can understand them): if the phone sleeps (or the app goes out of focus) we're in "polling" or even "don't care" mode; if the app is active, we want immediate updates (because presumably a user is touching their fondle-slab at that moment).
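
For illustration, a subscribe message carrying such an option might look like this; the field names, and milliseconds as the unit, are assumptions rather than anything settled in this thread.

    subscribe = {
        "tahoe-mutable-notification-version": 1,
        "subscribe": ["storage-index-1", "storage-index-2"],
        "maximum-delay": 500,   # ms the server may wait in order to bundle updates
    }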

Member:

I'm somewhat skeptical that we're going to achieve a reasonably performing real-time collaborative editing system on top of storage-server-mediated updates. Particularly considering there are still essentially no facilities for multi-writer... I think instead what we should be aiming for now is a "pretty fast" sync experience.

Contributor Author (meejah):

Fair.

Still, I could imagine GridSync being smart about "my human is here" (so make WebSocket connections etc) versus "my human is gone" (e.g. screensaver turned on) so I don't really care about updates now.

Contributor Author (meejah):

I pushed changes making that "updates".

.. and while I agree that magic-folder isn't going to be a multi-user collaborative thing in the immediate future, it's still worth thinking about these issues, especially in the context of "pub-sub" -- a CRDT (for example) could be built on top of mutables + notifications for "actual" real-time collaboration (or any other data-structure that could be stored in a mutable, which is basically "any data-structure you can serialize").

:ref:`http-storage-node-protocol` with which this protocol aims to be
compatible. A single WebSocket endpoint exists:

- <server>/v1/mutable_updates
Member:

fwiw, to get in on the bike shedding early, I like - as a word separator in URLs rather than _.

Contributor Author (meejah):

okay, that's fine with me :)
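
As a rough picture of the client side, a subscriber might use that endpoint roughly as follows. This assumes the hyphenated path agreed above, the third-party websockets library, and the illustrative message fields sketched earlier in this thread; none of it is fixed by this PR.

    import asyncio
    import json

    import websockets   # third-party library; an assumption, not required by the proposal

    async def watch(server_url, storage_indexes):
        # Connect to the (assumed) mutable-updates endpoint and print each
        # storage index the server announces as updated.
        async with websockets.connect(server_url + "/v1/mutable-updates") as ws:
            await ws.send(json.dumps({
                "tahoe-mutable-notification-version": 1,
                "subscribe": storage_indexes,
            }))
            async for message in ws:
                for storage_index in json.loads(message).get("updates", []):
                    print("slot changed:", storage_index)

    # Example usage (hypothetical server URL):
    # asyncio.run(watch("wss://storage.example:3457", ["storage-index-1"]))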

@meejah marked this pull request as draft May 6, 2020 23:20