Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add WebAuthn support to GUI #9175

Draft
wants to merge 93 commits into
base: main
Choose a base branch
from
Draft

Add WebAuthn support to GUI #9175

wants to merge 93 commits into from

Conversation

emlun
Copy link
Member

@emlun emlun commented Oct 15, 2023

Fixes #8409. Replaces PR #8417.

Remaining things to do

  • Add tests
    • Credential registration
    • WebAuthn authentication
    • Password or WebAuthn authentication
    • Config update: advanced settings
    • Config update: credentials
  • Open documentation PR

Purpose

This adds support for WebAuthn authentication in the GUI. WebAuthn is a W3C standard that provides an API for strong, privacy-conscious public-key authentication, which can enable a both more pleasant and more secure user experience than traditional password authentication. WebAuthn both works with external security keys (typically via USB or NFC) and is built into many modern browsers and operating systems.

For users who already use WebAuthn, this both cuts out a detour through a password manager and improves security by making it possible to not need a password at all.

I'm opening this as a draft PR so people can try it out while I work on finishing the last few things listed above.

Testing

Manual tests performed:

  • With no password set and no (eligible) WebAuthn credentials enrolled:

    • No authentication is required
  • With password set OR at least one (eligible) WebAuthn credential enrolled:

    • "Log in with WebAuthn" button added to login page
    • With no WebAuthn credentials enrolled, WebAuthn button is disabled
    • Password authentication works (no WebAuthn required)
    • WebAuthn authentication works (no password required)
    • WebAuthn authentication is available immediately without server restart
  • WebAuthn settings/credential management:

    • Warning appears when HTTPS is disabled
    • Warning appears when RP ID does not match page domain
    • Warning appears when WebAuthn origin does not match RP ID
    • Warning appears when changing RP ID
    • RP ID and expected origin can be changed in advanced settings
      • "Ineligible credentials" table appears if any credentials were created with a different RP ID than the current setting
    • New credentials appear immediately in the table in GUI settings, but take effect on saving the GUI settings
    • Renaming credentials takes effect on saving the GUI settings
    • Credential names fall back to credential ID when not set
    • Deleting credentials takes effect on saving the GUI settings
    • PUT /rest/config ignores WebAuthn credentials not already registered (identified by ID)
    • excludeCredentials is used to prevent registering the same authenticator twice
    • When "Require UV" is checked for a credential, user verification (PIN or biometric) is required when using that credential (hacking the frontend to override the authentication arguments with userVerification = "discouraged" fails).

I intend to add automated tests as well, but would first like a review of the overall design and how it fits into the existing architecture.

Screenshots

WebAuthn button added to login form
WebAuthn button added to login form

New GUI settings
New GUI settings

Registering a new credential
Registering a new credential

Renaming a credential
Renaming a credential

Interactive warnings, example 1 Interactive warnings, example 2
Interactive warnings, example 1 Interactive warnings, example 2

Ineligible credentials
Separate table of ineligible credentials may appear if "Webauthn Rp Id" is changed in advanced settings

Documentation

Probably a new section should be added alongside "LDAP Authentication". WebAuthn normally doesn't require much technical knowledge from the user since you only interact with the end-user side, but in the Syncthing case you are both end-user and server administrator. WebAuthn requires some domain settings to line up, so the "RP ID" and "WebAuthn Origin" settings and their impact need to be explained here. I volunteer to write this documentation once I have the green light to finish the feature.

@emlun
Copy link
Member Author

emlun commented Oct 15, 2023

By the way: to test this, you don't need a YubiKey or other dedicated security key hardware. There are a few other options:

  • Most platforms other than Linux have a built-in WebAuthn authenticator, which will be offered when available.
  • Several browsers show an option like "use a mobile device", which displays a QR code. Recent versions of Android and iOS can scan this QR code to pair the phone with the other device, and then the phone can act as a WebAuthn authenticator.
  • In Chrome, you can use the virtual authenticator built into the devtools: Devtools menu > More tools > WebAuthn > Enable virtual authenticator environment > Create an authenticator with Protocol: ctap2, Transport: internal, Supports user verification: Yes. Note that any credentials created are deleted if you close the devtools, and external security keys are unavailable while "Enable virtual authenticator environment" is checked.

@foxxcomm
Copy link

Emlun --

Kudos to you and all the contributors for getting the HTML login form merged in preparation for WebAuthn support! Running v1.26.0-rc.1 here under the various browsers on Windows Server and client and its great finally being able to use password managers. Huge modernization and improvement!

Comments on the WebAuthn GUI design you posted:

Would it make sense to fill in the relaying party ID field with a default value (if blank) derived from the Syncthing Device Name? For example if the Syncthing device name was "Ranger" under settings, the RP ID field default could be "Syncthing Ranger". I'm guessing here that credential prompts will show the RP ID to identify the entity asking for credentials?

If that assumption above is correct, I think this would make it easier for a non-technical user to get started with WebAuthn. This way a user could simply hit +Add and not have to understand RP ID until and unless the user wanted to choose another RP ID value.

@emlun
Copy link
Member Author

emlun commented Oct 23, 2023

Would it make sense to fill in the relaying party ID field with a default value (if blank) derived from the Syncthing Device Name? For example if the Syncthing device name was "Ranger" under settings, the RP ID field default could be "Syncthing Ranger".

No, but also yes, the implementation already does.

The WebAuthn spec has two identifiers for the RP: the RP ID, which is the "real" one which defines credential scope, and the "RP name", which is purely for display.

The RP name is a free parameter; the implementation here sets it to "Syncthing" by default and "Syncthing @ <device name>" if the device configuration is readable and has a nonempty value. The idea for the RP name is indeed that clients (browsers) would display it in credential pickers, but in practice all current client implementations just display the RP ID. Therefore the RP name is currently not configurable in the GUI, since it's fairly useless in practice.

The RP ID, however, must equal or be a parent domain of the webpage that performs the WebAuthn request - so if the GUI is hosted at localhost (with or without port), then the RP ID must be set exactly to localhost. If the GUI is hosted at syncthing.myprivatenetwork.org, then the RP ID must be set to either syncthing.myprivatenetwork.org or to myprivatenetwork.org. Credentials created with a given RP ID can only be used with that exact RP ID, so once you've created any credentials, those existing credentials will stop working if you change the RP ID.

The implementation here sets the RP ID to localhost by default. If the settings GUI is accessed at an address with a different host, then the settings GUI warns that the RP ID doesn't match the host address (see screenshots above).

The other configuration field, "WebAuthn Origin", also relates somewhat to the RP ID. This is the address (including scheme and port, but not path) where the GUI is expected to be hosted; the client encodes this address into the signed assertion and the WebAuthn verification steps check that it equals the "WebAuthn Origin" value configured here. The default value of "WebAuthn Origin" is https://<RP ID>[:<GUI Listen Address port, if any>].

I'm guessing here that credential prompts will show the RP ID to identify the entity asking for credentials?

Yes - in theory the RP name would also be displayed, but in practice all current client implementations show only the RP ID. But as explained above, the RP ID is severely restricted by the credential scoping rules.

this would make it easier for a non-technical user to get started with WebAuthn. This way a user could simply hit +Add and not have to understand RP ID until and unless the user wanted to choose another RP ID value.

This is already the case if the GUI is hosted at localhost. 🙂 The placeholders shown in the "WebAuthn RP ID" and "WebAuthn Origin" fields are defaults that take effect if nothing else is configured. These defaults should work without further tweaks in the basic case where the GUI is accessed only at localhost.

Does that answer your questions?

@imsodin
Copy link
Member

imsodin commented Oct 23, 2023

That all sounds pretty complex, and also mostly static. Does that need to be exposed to the user in the main config interface? Seems like having reasonable default values and then keeping it advanced only would be enough and cause the least confusion (users tend to mess with anything that's being presented to them, regardless of if they have a need/understand it).

@foxxcomm
Copy link

Emlun --

Thanks for the detailed explanation for us non-WebAuthn developers. I concur with imsodin about moving these fields under Advanced options. Given those fields are likely to cause confusion for non-technical Syncthing users and will be fine for these users at default values.

@emlun emlun marked this pull request as ready for review April 7, 2024 21:05
@emlun
Copy link
Member Author

emlun commented Apr 7, 2024

Alright, this is finally ready for final review to merge! 🎉 Thanks everyone for your patience. All that remains now is documentation, which I'll open as a separate PR.

  • I don't know why one Windows build and one macos build is failing, and neither has an actionable error message. These builds seem a bit unreliable, so I intend to ignore them unless instructed otherwise.
  • The DeepSource warnings do not originate in this PR, except for some of these. GUIConfiguration already exhibits these on main, and frankly to me this lint sounds like actively bad advice, so I intend to ignore that too unless instructed otherwise.

@tomasz1986
Copy link
Contributor

tomasz1986 commented Apr 7, 2024

I will try to look at the GUI side of things sometime at the end of the week!

At the moment, I would just like to ask one question about the name itself. "WebAuth" appears to be a shortened form of "Web Authentication" (see https://en.wikipedia.org/wiki/WebAuthn). Wouldn't it be preferable to use "Web Authentication" in full, at least in the user-facing HTML? This is because the form "WebAuthn" doesn't really mean anything to someone who doesn't already know what it is, especially if they don't speak English.

What do you think? Also, would you be able to provide a few examples of wording some other major services or websites use when they offer WebAuthn login options?

@imsodin
Copy link
Member

imsodin commented Apr 7, 2024

They do have error messages, just very well hidden - likely due to the concurrent test runs. Would still be nice if go at least indicated the failing tests at the end of the output.

Windows one is in webauth stuff and I don't understand it at first glance:

2024-04-07T20:54:39.3038505Z === CONT TestWebauthnConfigChanges/Cannot_edit_WebAuthnCredential.LastUseTime
2024-04-07T20:54:39.3042093Z api_test.go:2983: Put "http://127.0.0.1:62358/rest/config/gui": EOF []
2024-04-07T20:54:39.3043519Z --- FAIL: TestWebauthnConfigChanges/Can_edit_WebauthnCredential.RequireUv (0.23s)
2024-04-07T20:54:39.3046995Z === CONT TestWebauthnConfigChanges/Cannot_edit_WebAuthnCredential.CreateTime
2024-04-07T20:54:39.3050086Z --- FAIL: TestWebauthnConfigChanges/Can_edit_WebauthnCredential.Nickname (0.20s)

macos looks like a plain boring timeout, probably due to concurrent execution and unrelated to your PR:

2024-04-07T20:57:19.0309640Z === RUN TestConcurrentIndexID
2024-04-07T20:57:19.0363030Z api_test.go:560: Get "http://127.0.0.1:49223/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

2024-04-07T20:57:19.0451400Z --- FAIL: TestHTTPLogin/200_path/incorrect_password_is_rejected (5.07s)
2024-04-07T20:57:19.0549970Z --- FAIL: TestHTTPLogin/200_path (16.08s)
2024-04-07T20:57:19.0651810Z --- FAIL: TestHTTPLogin (16.09s)
2024-04-07T20:57:19.0753440Z panic: test executed panic(nil) or runtime.Goexit
2024-04-07T20:57:19.0853460Z
2024-04-07T20:57:19.0955320Z goroutine 454 [running]:
2024-04-07T20:57:19.1055660Z testing.tRunner.func1.2({0x1021bd900, 0x102ac1e10})
2024-04-07T20:57:19.1157440Z /Users/runner/hostedtoolcache/go/1.22.2/x64/src/testing/testing.go:1631 +0x3f7
2024-04-07T20:57:19.1257590Z testing.tRunner.func1()
2024-04-07T20:57:19.1359390Z /Users/runner/hostedtoolcache/go/1.22.2/x64/src/testing/testing.go:1634 +0x6b6
2024-04-07T20:57:19.1459750Z runtime.Goexit()
2024-04-07T20:57:19.1561600Z /Users/runner/hostedtoolcache/go/1.22.2/x64/src/runtime/panic.go:626 +0x5e
2024-04-07T20:57:19.1661690Z testing.(*common).FailNow(0xc0000a9040)
2024-04-07T20:57:19.1763420Z /Users/runner/hostedtoolcache/go/1.22.2/x64/src/testing/testing.go:1005 +0x85
2024-04-07T20:57:19.1864580Z testing.(*common).Fatal(0xc0000a9040, {0xc000c91d00, 0x1, 0x1})
2024-04-07T20:57:19.1966720Z /Users/runner/hostedtoolcache/go/1.22.2/x64/src/testing/testing.go:1082 +0x88
2024-04-07T20:57:19.2067200Z github.com/syncthing/syncthing/lib/api.httpRequest({0x101cc6859, 0x3}, {0xc00003cd68, 0x17}, {0x0, 0x0}, {0x101cc742d, 0x5}, {0x101cc85d1, 0x7}, ...)
2024-04-07T20:57:19.2168930Z /Users/runner/work/syncthing/syncthing/lib/api/api_test.go:560 +0x829
2024-04-07T20:57:19.2269570Z github.com/syncthing/syncthing/lib/api.httpGet(...)
2024-04-07T20:57:19.2371660Z /Users/runner/work/syncthing/syncthing/lib/api/api_test.go:567
2024-04-07T20:57:19.2473260Z github.com/syncthing/syncthing/lib/api.TestHTTPLogin.func1({0xc00003cd68, 0x17}, {0x101cc742d, 0x5}, {0x101cc85d1, 0x7})
2024-04-07T20:57:19.2574140Z /Users/runner/work/syncthing/syncthing/lib/api/api_test.go:590 +0xe5
2024-04-07T20:57:19.2675800Z github.com/syncthing/syncthing/lib/api.TestHTTPLogin.func4.1.2(0xc00192a680)
2024-04-07T20:57:19.2776540Z /Users/runner/work/syncthing/syncthing/lib/api/api_test.go:631 +0x96
2024-04-07T20:57:19.2900250Z testing.tRunner(0xc00192a680, 0xc003b3b740)
2024-04-07T20:57:19.3002070Z /Users/runner/hostedtoolcache/go/1.22.2/x64/src/testing/testing.go:1689 +0x21f
2024-04-07T20:57:19.3102190Z created by testing.(*T).Run in goroutine 452
2024-04-07T20:57:19.3200060Z /Users/runner/hostedtoolcache/go/1.22.2/x64/src/testing/testing.go:1742 +0x826

As for the linter: Clearly not an issue here given the state of GUIConfiguration. I am not quite sure why we deal with values (value receivers) for config structs in most places, given those are large structs/proto messages with existing/generated pointer receiver methods. And I'd tend to agree with the linter that always using pointer or value receivers for a single type is a good idea, just to remove one possible variation and thus potential complexity/source of confusion.

@emlun
Copy link
Member Author

emlun commented Apr 8, 2024

Wouldn't it be preferable to use "Web Authentication" in full, at least in the user-facing HTML? This is because the form "WebAuthn" doesn't really mean anything to someone who doesn't already know what it is, especially if they don't speak English.

I somewhat agree, but also therefore disagree. 🙂 "Web Authentication" is indeed the title of the spec, but it does a pretty bad job as a name. "Web Authentication" could mean any form of authentication on the web, but "WebAuthn" is now a pretty well-established name for this standard specifically. I would keep the "WebAuthn" term to distinguish from other forms of "authentication on the web", such as passwords and OTP codes.

would you be able to provide a few examples of wording some other major services or websites use when they offer WebAuthn login options?

Few or none use the term "WebAuthn" in user-facing UI, instead most use the terms "passkeys" and/or "security keys". GitHub, for example, uses both. I've considered changing to "passkeys" in the UI, but I hesitate because that's not quite technically accurate to how we use WebAuthn.

A "passkey" is usually defined as a WebAuthn credential created with residentKey: "required" to have it stored on the client-side instead of encrypted into the credential ID stored on the server - this way you can use it as the first authentication step to both authenticate and identify yourself simultaneously, without first having to identify yourself to the server so the server can return your credential IDs. But because a Syncthing server only has a single user we don't need to identify the user first, so we can emulate the passkey user experience even though the credentials don't technically need to be "real" passkeys.

Another thing that somewhat sets Syncthing apart is that the end-user is also the service operator. This should hopefully not matter much with the default settings, but it does matter as soon as you want to host the web UI on some other address than https://localhost:8384 - then you need to understand the WebAuthn concepts of RP ID and credential origin. I think it's worth using the term "WebAuthn" at least in documentation and advanced settings that relate to this.

@emlun
Copy link
Member Author

emlun commented Apr 8, 2024

Oh, I see, it does show up in the raw log view. All I saw in the job view was this, so I assumed the whole build was borked:

image

I'll take a look at the errors, then. I've seen "EOF" errors like these caused by the server crashing, so that might be what's happening in these PUT /rest/config/gui calls:

2024-04-07T20:54:39.3042093Z     api_test.go:2983: Put "http://127.0.0.1:62358/rest/config/gui": EOF []
2024-04-07T20:54:39.3043519Z --- FAIL: TestWebauthnConfigChanges/Can_edit_WebauthnCredential.RequireUv (0.23s)

I am not quite sure why we deal with values (value receivers) for config structs in most places, given those are large structs/proto messages with existing/generated pointer receiver methods. And I'd tend to agree with the linter that always using pointer or value receivers for a single type is a good idea, just to remove one possible variation and thus potential complexity/source of confusion.

Alright, if you prefer only pointer receivers I can fix that for the WebauthnCredential methods, and the GUIConfiguration methods added by this PR. To me it seems like a bad idea to grant write permissions to a method that doesn't need it, but if that is idiomatic in Go I can do it (and I also realize that this concern is entangled with the unnecessary copying concern because Go doesn't have read-only references. Sorry if my frustation is showing - the frustration lies with Golang, not with you.).

@acolomb
Copy link
Member

acolomb commented Apr 8, 2024

To me it seems like a bad idea to grant write permissions to a method that doesn't need it, but if that is idiomatic in Go I can do it (and I also realize that this concern is entangled with the unnecessary copying concern because Go doesn't have read-only references. Sorry if my frustation is showing - the frustration lies with Golang, not with you.).

Just a thought: A sensible way to handle this on language / compiler level would be to specify a non-pointer receiver argument, but skip the copying code if the compiler knows for sure that the method does not do any write access. Of course that would have to be conveyed through an attribute on each compiled function, in case the receiver is passed down to a nested method call.

It would convey the inferred "this is not a pointer receiver, thus immutable / read-only" semantics, but still save the actual copying into a new object if not needed. Maybe the Golang compiler is already smart enough to do that?

@imsodin
Copy link
Member

imsodin commented Apr 8, 2024

Alright, if you prefer only pointer receivers I can fix that for the WebauthnCredential methods, and the GUIConfiguration methods added by this PR. To me it seems like a bad idea to grant write permissions to a method that doesn't need it, but if that is idiomatic in Go I can do it (and I also realize that this concern is entangled with the unnecessary copying concern because Go doesn't have read-only references. Sorry if my frustation is showing - the frustration lies with Golang, not with you.).

No need at all to change here. I just couldn't stop myself from commenting. As I said in the beginning, the fact that we already do value receivers for gui config (and many other) make it perfectly fine to do the same here. And I am not even certain on my point. As you correctly identified, with go it's quite often tradeoffs without a clear correct way. I totally sympathise with the idea of using semantic types (values if immutable), that definitely has value (badumm) too.

As to what the compiler does and any further interesting digressions on value vs pointer (sorry for starting that) should probably go somehwere else/the forum. This PR is long enough without it :)

@emlun
Copy link
Member Author

emlun commented Apr 14, 2024

I'll need some help troubleshooting the Windows/Mac failures. It is consistently the TestWebauthnConfigChanges/Can_edit_* tests that fail, most likely because they make config changes and therefore cause the test server to restart, but it's not consistent at all which of those tests are the ones that fail.

They do consistently print logs like this:

2024-04-14T17:44:08.2322532Z     api_test.go:2993: TestWebauthnConfigChanges/Can_edit_WebauthnCredential.RequireUv Put "http://127.0.0.1:65389/rest/config/gui": EOF []
2024-04-14T17:44:08.2326137Z --- FAIL: TestWebauthnConfigChanges/Can_edit_WebauthnCredential.RequireUv (0.26s)

where the EOF piece usually indicates that the server panicked (or for some other reason didn't return a valid HTTP response?), but I'm at a loss trying to figure out why. Sometimes it fails, and sometimes it doesn't. I tried adding some time.Sleeps which didn't seem to make a difference, and also tried adding some debug output which didn't help me much either - the raw logs contain the same number of occurrences of both adjustGui start and adjustGui finish after finish, so it doesn't seem like the adjustGui handler is panicking. It does seem to help somewhat to not run the tests in parallel, but I'm not sure that's reliable either.

All this points to some kind of race conditions or timing issues, but I can't figure out what. I tried to model TestWebauthnConfigChanges after TestConfigChanges which passes reliably, and I fail to see why the former is so much less reliable. As far as I understand, each of the calls to its internal initTest and startHttpServer functions should set up its own server instance on a unique port, so I don't see why these tests should be interfering with each other, if that is the issue.

Might anyone be able to help troubleshoot this?

emlun added 6 commits May 12, 2024 17:57
TestWebauthnConfigChanges seems to need a longer shutdown timeout when running
on GitHub Actions. The problem manifests in errors like this:

```
2024-05-05T17:00:45.0503602Z     api_test.go:2919: TestWebauthnConfigChanges/Can_edit_GUIConfiguration.WebauthnUserId Put "http://127.0.0.1:37585/rest/config/gui": EOF []
2024-05-05T17:00:45.0566336Z --- FAIL: TestWebauthnConfigChanges/Can_edit_GUIConfiguration.WebauthnUserId (0.52s)
```

indicating that the server was forcefully shut down (or panicked, but that was
ruled out in these cases) before the request was fully written.
@emlun
Copy link
Member Author

emlun commented May 12, 2024

Alright, I think I found out why the tests were failing: the server shutdown grace period on config changes was a bit too aggressive at 100 ms. 🙂 I made that grace period configurable in tests (7ed2db8) and increased it to 1 second for TestWebauthnConfigChanges specifically (961720f) (probably needed because it has a lot of parallel sub-tests). That seems to have solved it - now the tests are (as far as I can tell) consistently passing on GitHub Actions! 🎉

As mentioned above, most DeepSource warnings do not originate in this PR, and I was told to ignore those that do (#9175 (comment)).

All that's left now is a PR to the docs repo, which I'm currently working on. This PR is ready for review to merge. 😄

@emlun
Copy link
Member Author

emlun commented May 12, 2024

Hold that thought - the WebAuthn login flow seems to be broken for some reason 🤦 even though it does successfully return a session cookie which is sent in subsequent requests. Might be caused by PR #9425. Investigating...

@emlun
Copy link
Member Author

emlun commented May 12, 2024

Yeah, the WebAuthn credential state update is causing a server restart because it's changing config, which causes the tokenManagers to diverge and lose the newly created token (but curiously the token does apparently get successfully written to disk, because it starts working again after another hard server restart).

I think someone mentioned at some point before that these state variables really should be stored in DB rather than config - this settles it.

@emlun emlun marked this pull request as draft May 12, 2024 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WebAuthn support in GUI
5 participants