Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An on-grid introducer that doesn't depend on Foolscap or separate servers #882

Draft
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

exarkun
Copy link
Member

@exarkun exarkun commented Oct 29, 2020

Copy link
Contributor

@meejah meejah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start!

Besides the inline comments/ideas I think this could benefit from some discussion of how the propagation actually works. I think I mostly know from previous discussion/context ;) but there's e.g. no mention of "polling" in here at all or how long changes might take to propagate. That is, when does a client download the directory to search for new clients? What does it do then? What if it's a storage-server too? What if a client is now missing?

Perhaps the above could be rolled into some examples? e.g. use-case style discussion: "entirely new grid", "add a storage-server", "remove a storage-server", etc.

docs/grid-introducer-internals.rst Outdated Show resolved Hide resolved
docs/grid-introducer-internals.rst Outdated Show resolved Hide resolved
docs/grid-introducer-internals.rst Outdated Show resolved Hide resolved
Management Command Line
-----------------------

Storage server enrollment is done using the ``grid-introducer`` command line tool.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be tahoe grid-introducer similar to tahoe grid-manager or other tahoe things?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably makes sense to be consistent here, yea. I'm not really excited about putting more stuff beneath "tahoe ..." because it means having to interact with the implementation of "tahoe ..." which is kind of a confusing mess.

Did I already try to convince you to make it grid-manager instead of tahoe grid-manager? :) I can't recall...

Copy link
Contributor

@meejah meejah Nov 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember having that conversation. I did have "tahoe gm" originally but .. maybe it was warner I talked about that with?

I know we've talked about "probably there's too much stuff in tahoe *" and I agree ... but "grid-manager" or "grid-introducer" by themselves seem maybe-weird? (like "disconnected from tahoe"). Can we use Click for "new commands that don't start with tahoe ? :)

docs/grid-introducer-internals.rst Outdated Show resolved Hide resolved

The two pieces of client configuration required by the system can be generated from this state.
The ``grid-introducer.cap`` value is just the read-only capability for ``collection-writecap``.
The ``grid-introducer.furl`` value is the storage fURL for any currently enrolled storage server.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, okay so this is the "how do we find a storage-server" part.

I feel like this should almost certainly be a list or we've still got a "single point of failure" which was, I thought, part of the motivation for this. It might be a different point of failure for each client, too, since this says "any" storage-server.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think it's fine to have a list here. One thing it looks like I failed to capture, though, is that this should have the same SPoF properties as the current introducer. If it's offline the first time you show up, you're out of luck. If a client manages to talk to it once, the client should remember all the storage servers it discovers and should be able to use those if it can't connect to whatever servers are in the grid-introducer.furl at any future point.

So it's a bootstrap spof I guess? But yea, putting more servers in here is a good way to further improve the behavior. The value itself gets somewhat unwieldy so I think we probably don't want to say "put all your servers in it" but maybe 2 or 3 is a good balance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess see my other comments, though, because I think this should maybe be "state" and not "config" at all...

grid-introducer.furl = pb://sokl...@192.168.69.247:44801/eqpw...

Start a Tahoe-LAFS client node with these items configured and the client will be able to find and follow all storage servers that are part of that grid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the "connection hint" here should be "state" and not "config" because it should get updated (e.g. if that one storage-server is taken offline, this config will never be able to bootstrap again). If it's "state" you'd want to "add" and "remove" storage-servers from it and can use any/all of them to download new information. We already have a mechanism for this, AFAIK, since storage-server announcements are cached. Also "introducer-less" mode uses this.

That leaves "how to enroll a new client", because they can't just shove a thing into their config. Perhaps an "online" invite method (e.g. magic-wormhole) is too complex for this document? But it might be better to specify a single "invite a new client" document -- this could be communicated via magic-wormhole but also could simply be fed to a client via command-line (so that the magic-wormhole version is indeed simply "a built-in secure way to communicate that invite document").

So I'd be thinking along the lines of tahoe join-grid <path-to-invite-document> as the first version and tahoe join-grid --code 3-foo-bar or so for the improved magic-wormhole version. Then, all that stuff (the "cap" and a list of storage-servers) is put in "state" and the user doesn't copy-paste it. These are very long-term secrets and I don't think the user should really be trusted to not leak them (by having to put them into the config file).

The "invite document" is just the same information as above (although ideally with a list of server fURLs instead of just one).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the "connection hint" here should be "state" and not "config" because it should get updated (e.g. if that one storage-server is taken offline, this config will never be able to bootstrap again). If it's "state" you'd want to "add" and "remove" storage-servers from it and can use any/all of them to download new information. We already have a mechanism for this, AFAIK, since storage-server announcements are cached. Also "introducer-less" mode uses this.

Oh. Yea. So basically what I wrote above about bootstrap spof and list vs singleton. :) I'll put more words in the docs to make the extra intended client behavior clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That leaves "how to enroll a new client", because they can't just shove a thing into their config. Perhaps an "online" invite method (e.g. magic-wormhole) is too complex for this document? But it might be better to specify a single "invite a new client" document -- this could be communicated via magic-wormhole but also could simply be fed to a client via command-line (so that the magic-wormhole version is indeed simply "a built-in secure way to communicate that invite document").

I think they can just shove a thing into their config. The config is "what the user told me I could use to bootstrap" and the state is "everything I learned afterwards". And the client behavior should be to synthesize a list of servers from these two pieces of information and try to retrieve announcements from each (all?) of those servers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'd be thinking along the lines of tahoe join-grid as the first version and tahoe join-grid --code 3-foo-bar or so for the improved magic-wormhole version. Then, all that stuff (the "cap" and a list of storage-servers) is put in "state" and the user doesn't copy-paste it. These are very long-term secrets and I don't think the user should really be trusted to not leak them (by having to put them into the config file).

That seems more or less fine as a user experience, too, and could improve the bootstrap process - maybe even by sending all of the current announcements. I think there's probably value in a setup process that is independent of magic-wormhole. So keeping either the tahoe.cfg route or the tahoe join-grid <path> thing even after there's a magic-wormhole-enabled version probably makes sense.

I'll try to edit some of this into the document too.

Copy link
Contributor

@meejah meejah Nov 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the general idea of the tahoe join-grid <some file> suggestion is that then it can be exactly the same for a magic-wormhole use-case and if it's a JSON blob, the user is very unlikely to copy/paste it wrong (because it's all or nothing).

So essentially this means the "grid-introducer" information is "some JSON/whatever document" and we have one delivery mechanism for now (out-of-band + command-line tool) and in the future we can have other delivery mechanisms (magic-wormhole).

Even without the CLI command, it could be that tahoe.cfg points at "a grid-introducer document/file" (which has the distinct advantage of not putting yet another secret in tahoe.cfg .. even if that ship kind-of sailed already with the introduer-furl).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ also makes it easier to evolve: if we change the information, it's still "a grid-introducer document" but the version might be 2 or 3 or whatever but we don't have to try and support some new tahoe.cfg items...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yea. Separate file. Great idea. That definitely seems like the way to go.


This is an error if ``<path>`` exists already.
If it does not then a new grid introducer configuration is written there.
If ``<path>`` is ``-`` then the configuration is written to stdout.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Then, you will add it to the announcement directory::

grid-introducer add-storage-server --config <path> --readcap URI:CHK-RO:5cmy...

Copy link
Contributor

@meejah meejah Nov 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, it said these are indexed by public-key; where does it get that from? (And I guess if we have pet-names, this is where the grid-operator could assign one).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The public key should be in the announcement, so it would go and read the announcement to get it (maybe also doing some basic checks to make sure you didn't supply a write cap by mistake, or the cap or some unrelated data, or copy/paste it wrong, etc).

If we link by petname, which I think is better, the petname can be a command line argument or maybe we could have a --use-announcement-nickname option that makes it read the nickname from the announcement (because there is one) and use that. Handy if one person actually controls both sides and wants to save a little typing or remove one chance to copy/paste the wrong string. Maybe. Maybe that's a follow-up feature (if this even gets implemented).

A Tahoe-LAFS client node which is to use the grid introducer needs to be configured with a couple items.
A configuration blob for clients can be generated like this::

grid-introducer generate-client-config --config <path>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following my idea from above, this could be generate-invite or similar instead and then the user doesn't need to copy-paste individual items out of a file, they simply inject the invite into their client's tahoe join-grid or similar command. For local use this could just use unix pipes (tahoe grid-introducer --config testgrid/introducer invite-client | tahoe --node-directory testgrid/alice join-grid --invite - or so).

@codecov
Copy link

codecov bot commented Nov 6, 2020

Codecov Report

Merging #882 (5828de8) into master (874bfa7) will increase coverage by 0%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #882     +/-   ##
========================================
  Coverage      92%     92%             
========================================
  Files         158     158             
  Lines       27487   28691   +1204     
  Branches     4113    4458    +345     
========================================
+ Hits        25321   26488   +1167     
- Misses       1493    1534     +41     
+ Partials      673     669      -4     
Impacted Files Coverage Δ
src/allmydata/util/dictutil.py 84% <0%> (-16%) ⬇️
src/allmydata/_auto_deps.py 95% <0%> (-5%) ⬇️
src/allmydata/util/fileutil.py 72% <0%> (-4%) ⬇️
src/allmydata/immutable/layout.py 91% <0%> (-1%) ⬇️
src/allmydata/util/_python3.py 100% <0%> (ø)
src/allmydata/storage_client.py 97% <0%> (+1%) ⬆️
src/allmydata/node.py 98% <0%> (+1%) ⬆️
src/allmydata/client.py 95% <0%> (+2%) ⬆️
src/allmydata/util/configutil.py 99% <0%> (+3%) ⬆️
src/allmydata/scripts/common.py 97% <0%> (+3%) ⬆️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 874bfa7...5828de8. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants