RFC - Add XChaCha20-Poly1305 barrier encryption support #235

Open
cipherboy opened this issue Mar 27, 2024 · 4 comments

@cipherboy

Summary

Adding XChaCha20-Poly1305 as a barrier encryption algorithm preserves PQC resistance while removing the nonce-reuse hazard of AES-GCM. This RFC proposes adding it as an alternative barrier encryption algorithm for operators who prefer to avoid this category of problem.

Problem Statement

AES-GCM is widely deployed because it is one of the more prominent AEAD cipher modes, is implemented in hardware, and is approved under NIST's FIPS program. However, as noted in NIST SP 800-38D, it has a flaw: any reuse of the same key and nonce leaks the authentication key, allowing message forgery (and potentially leading to plaintext recovery). Thus, NIST requires no more than 2^32 encryptions with a given key:

The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key.

Using AES-GCM-SIV is not sufficient to prevent this problem within OpenBao (though it does protect against accidental nonce reuse), due to the potentially large number of encryptions performed with a single key. Additionally, AES-GCM-SIV is not a FIPS-approved algorithm despite requests to standardize it. To mitigate the nonce-reuse risk, OpenBao includes automatic counter-based and temporal rotation of the barrier encryption key.

No FIPS-approved AEAD algorithm avoids this nonce-reuse attack.

However, the XChaCha20-Poly1305 construction avoids this problem by using a 24-byte nonce, extending the 12-byte nonce of ChaCha20-Poly1305. The latter has seen widespread adoption, and many libraries (including Go's) have both ChaCha20-Poly1305 and XChaCha20-Poly1305 implementations. While slightly slower than AES-GCM when processor intrinsics support the latter, much effort has gone into optimizing ChaCha20-Poly1305 for TLS. With a 24-byte nonce, random nonce collisions become negligible, which avoids the need to rely on temporal or counter-based key rotation mechanisms.
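To make the nonce-size argument concrete, here is a rough sketch of the birthday-bound arithmetic, assuming nonces are drawn uniformly at random. The function name `collisionLog2` is made up for illustration, and the approximation p ~ n^2 / 2^(bits+1) only holds while p is small.

```go
package main

import (
	"fmt"
	"math"
)

// collisionLog2 returns log2 of the approximate birthday-bound probability
// that at least two of 2^log2Messages uniformly random nonces of nonceBits
// bits collide, using p ~ n^2 / 2^(nonceBits+1). The result is clamped at 0
// (p = 1) because the approximation is meaningless beyond that point.
func collisionLog2(log2Messages, nonceBits float64) float64 {
	return math.Min(0, 2*log2Messages-(nonceBits+1))
}

func main() {
	for _, n := range []float64{32, 40, 48} {
		fmt.Printf("2^%.0f messages: 96-bit nonce p ~ 2^%.0f, 192-bit nonce p ~ 2^%.0f\n",
			n, collisionLog2(n, 96), collisionLog2(n, 192))
	}
}
```

Under this approximation, 2^32 encryptions with a 96-bit nonce land near the collision margin that the NIST limit is designed around, while a 192-bit nonce stays many orders of magnitude below it even at far larger message counts.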

Lastly, XChaCha20-Poly1305 support was recently added to Transit, though this has no bearing on its adoption here.

User-facing description

OpenBao operators will have an additional storage configuration option, barrier_algorithm, to configure the encryption algorithm. Leaving this off of a configuration, using the empty string, or using aes-gcm will use the present default of AES-GCM encryption. Setting this to xchacha20-poly1305 will use the new XChaCha20-Poly1305 construct.
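The option handling described above could look roughly like the following. This is only a sketch: the type `BarrierAlgorithm`, its constants, and `parseBarrierAlgorithm` are hypothetical names, not taken from the OpenBao code base or the proof of concept.

```go
package main

import "fmt"

// BarrierAlgorithm is a hypothetical enum for the configured algorithm.
type BarrierAlgorithm int

const (
	AlgAESGCM BarrierAlgorithm = iota
	AlgXChaCha20Poly1305
)

// parseBarrierAlgorithm maps the barrier_algorithm config value to an
// algorithm. A missing or empty value keeps the present AES-GCM default.
func parseBarrierAlgorithm(value string) (BarrierAlgorithm, error) {
	switch value {
	case "", "aes-gcm":
		return AlgAESGCM, nil
	case "xchacha20-poly1305":
		return AlgXChaCha20Poly1305, nil
	default:
		return 0, fmt.Errorf("unknown barrier_algorithm %q", value)
	}
}

func main() {
	for _, v := range []string{"", "aes-gcm", "xchacha20-poly1305", "rot13"} {
		alg, err := parseBarrierAlgorithm(v)
		fmt.Printf("%q -> alg=%d err=%v\n", v, alg, err)
	}
}
```

Rejecting unknown values outright, rather than falling back to the default, would keep a typo from silently selecting AES-GCM.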

No other change is expected to be visible.

Technical Description

XChaCha20-Poly1305 is an AEAD encryption algorithm, like AES-GCM, so the implementation is fairly easy.

An interface for barrier encryption, SecurityBarrier, already exists. Adding new type constants, config members, and a dynamic dispatcher (to switch between AES-GCM and XChaCha20-Poly1305) should be easy. barrier_aes_gcm.go can then be duplicated to add an XChaCha20-Poly1305 implementation. We'll only support the new keyring format and extend that with algorithm indicators. On upgrade we'll write the algorithm into the keyring. For a future in-place migration, we may want to write an algorithm identifier into the keys too, but for now we'll assume a homogeneous keyring of a single type.
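Because both algorithms are AEADs, the dispatcher can in principle be a lookup from algorithm name to a `cipher.AEAD` constructor. The sketch below uses only the standard library, so only AES-GCM is registered; the XChaCha20-Poly1305 entry is shown as a comment and would come from `chacha20poly1305.NewX` in `golang.org/x/crypto`. The names `constructors`, `aeadFor`, `seal`, and `open` are illustrative, and this is not how the existing SecurityBarrier code is organized.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// aeadConstructor builds an AEAD from raw key bytes.
type aeadConstructor func(key []byte) (cipher.AEAD, error)

func newAESGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// constructors is the dispatch table, keyed by configured algorithm name.
var constructors = map[string]aeadConstructor{
	"aes-gcm": newAESGCM,
	// "xchacha20-poly1305": chacha20poly1305.NewX, // golang.org/x/crypto
}

func aeadFor(algorithm string, key []byte) (cipher.AEAD, error) {
	ctor, ok := constructors[algorithm]
	if !ok {
		return nil, fmt.Errorf("unsupported barrier algorithm %q", algorithm)
	}
	return ctor(key)
}

// seal prepends a fresh random nonce of the AEAD's own size, so the same
// code handles 12-byte (AES-GCM) and 24-byte (XChaCha20) nonces.
func seal(a cipher.AEAD, plaintext, aad []byte) ([]byte, error) {
	nonce := make([]byte, a.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return a.Seal(nonce, nonce, plaintext, aad), nil
}

func open(a cipher.AEAD, sealed, aad []byte) ([]byte, error) {
	if len(sealed) < a.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:a.NonceSize()], sealed[a.NonceSize():]
	return a.Open(nil, nonce, ct, aad)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	a, err := aeadFor("aes-gcm", key)
	if err != nil {
		panic(err)
	}
	sealed, err := seal(a, []byte("secret"), []byte("sys/example"))
	if err != nil {
		panic(err)
	}
	plain, err := open(a, sealed, []byte("sys/example"))
	fmt.Println(string(plain), err)
}
```

Since the seal and open helpers only depend on `NonceSize()`, adding a second algorithm would mostly be a matter of registering its constructor.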

Because the key version embedded in the ciphertext is checked prior to doing the decryption (to determine how to unseal the storage entry), we need to choose unique constants across all algorithms. In order to leave space for other algorithms and future updates to existing schemes, the next version identifier 0x11 is used for XChaCha20-Poly1305. A version identifier that does not belong to the configured algorithm can thus be used to tell the caller that the wrong algorithm was chosen in the server configuration.
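A sketch of that version check follows. It assumes a layout of a 4-byte key term followed by a 1-byte version, and assumes 0x01 and 0x02 are the existing AES-GCM versions; both are assumptions for illustration, and only 0x11 comes from this RFC. The function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	termSize = 4 // assumed: bytes of key term preceding the version byte

	versionAESGCM1           byte = 0x01 // assumed existing AES-GCM version
	versionAESGCM2           byte = 0x02 // assumed existing AES-GCM version
	versionXChaCha20Poly1305 byte = 0x11 // identifier proposed in this RFC
)

// algorithmForVersion maps a ciphertext version byte to an algorithm name.
func algorithmForVersion(v byte) (string, error) {
	switch v {
	case versionAESGCM1, versionAESGCM2:
		return "aes-gcm", nil
	case versionXChaCha20Poly1305:
		return "xchacha20-poly1305", nil
	default:
		return "", fmt.Errorf("unknown barrier version 0x%02x", v)
	}
}

// checkEntry inspects the version byte before any decryption is attempted
// and reports a configuration mismatch instead of a decryption failure.
func checkEntry(configured string, entry []byte) error {
	if len(entry) <= termSize {
		return errors.New("ciphertext too short to contain a version byte")
	}
	alg, err := algorithmForVersion(entry[termSize])
	if err != nil {
		return err
	}
	if alg != configured {
		return fmt.Errorf("entry written with %s but barrier_algorithm is %s", alg, configured)
	}
	return nil
}

func main() {
	entry := []byte{0, 0, 0, 1, versionXChaCha20Poly1305, 0xaa, 0xbb}
	fmt.Println(checkEntry("xchacha20-poly1305", entry))
	fmt.Println(checkEntry("aes-gcm", entry))
}
```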

Rationale and alternatives

A true alternative to AES and ChaCha20 is unlikely to appear in the short term. Besides the PQC competition, which focuses on asymmetric cryptography (both signing and key establishment), NIST is also running a lightweight cryptography competition. Such lightweight algorithms are likely unsuitable for protecting the backing storage, as many compromise on security margins to lower power consumption. Further, both AES and ChaCha20 are PQC secure and thus are unlikely to require updates in the near term.

The previous hash competition that led to the standardization of SHA-3 produced the Threefish block cipher (used in the Skein candidate), but this has not seen widespread adoption.

Additionally, while no issues currently exist with AES-GCM (outside of nonce reuse), if there are stronger or alternative symmetric cipher options in the future, this starts the process of being able to introduce and use them.

Future standardization could bless Meta's haberdashery construct, which proposes an alternative AES-GCM key schedule that gives longer nonces, removing the underlying assumption. This would warrant a separate algorithm name and thus would not conflict with the work proposed here. If that mechanism achieves standardization, it could be added in parallel.

Downsides

This introduces additional code in the barrier subsystem with minimal refactoring. However, the original subsystem was introduced with extensibility in mind, so this change is fairly easy to implement, minus migration between encryption algorithms which remains unimplemented.

This does open us up to fielding questions about choosing between the two, or to requests to add additional algorithms.

Security Implications

This has a net-positive benefit for security. AES-GCM is a well-known, conservative approach that can be suggested by default, but XChaCha20-Poly1305 is more forgiving and overall safer. Issues with counting encryption requests under a particular key, or with temporal rotations, will not negatively impact security as they could with AES-GCM. Adding another barrier encryption mechanism should be done carefully, though.

User/Developer Experience

No behavior change is expected as a result of this change.

Unresolved Questions

No migration approach has been implemented. This means the only way to move between AES-GCM and XChaCha20-Poly1305 backed instances is to rebuild the instance from scratch. Long-term, a migration approach could be considered, though this needs careful thought about how hybrid keyrings would be handled. It likely requires some architecture changes, to allow decryption of entries written under a barrier algorithm other than the configured one and to add an indication that migration should be performed (versus simply failing).

Related Issues

n/a

Proof of Concept

See the https://github.com/cipherboy/openbao/pull/new/add-xchacha20-barrier branch for a proof of concept.

To use the new algorithm, specify barrier_algorithm = "xchacha20" in the server configuration file prior to initialization.

@cipherboy cipherboy added the rfc label Mar 27, 2024
@DanGhita

DanGhita commented Apr 4, 2024

Hi @cipherboy,

IMHO, this kind of evolution should be addressed this way:

  • define a default barrier encryption algorithm (xchacha20-poly1305, for example)
  • allow algorithm specification in the server configuration (for overriding the default)
  • in the persistent, encrypted data (protected by the barrier), add specific metadata (a global header for the whole storage, or per entry?) signaling (at least) the algorithm that was used to encrypt the data.

That would probably break storage compatibility, but it would allow multiple algorithms (and maybe an easier migration from one algorithm to another).

@cipherboy

Thanks @DanGhita!

  1. Default would remain AES-GCM to retain backwards compatibility, IMO.
  2. Yep, we've got that in this PoC, that's the barrier_algorithm field.
  3. We have this at a keyring level. Because this is a JSON blob with a new field, we'll be able to explicitly upgrade older configurations to the at-the-time default without issues. This would give us (theoretically, in the future with a few more code changes) a homogeneous rotation profile: load the existing keyring verbatim, then perform an explicit rotation of all data to the new encryption algorithm with new keys, swap out the old keyring with the new keyring, and finish initializing.

Concretely, I think, but haven't proven to myself, that this upgrade can be made safe by:

  1. Adding key type annotations to each key in the keyring. This can be done later, and loading this would likely require reorganizing the barrier encryption code a little.
  2. Updating the keyring type to the new algorithm, creating new keys with this algorithm going forward.
  3. Keeping the existing keys, and rotating data encrypted under old keys to be encrypted under new keys.
  4. Removing each old key once all data encrypted under it has been rewritten, at which point removal should be safe.

This could even be done lazily (unless otherwise prompted to be done greedily), allowing a rolling upgrade to be done naturally over time (e.g., when accessing data). In this way, it should be interruptible and gracefully handle multiple transitions between key types.

Care would need to be taken to ensure the keyring does not incorrectly get pruned of keys if data encrypted with that key still exists.
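The lazy rotation and the pruning rule can be modeled with plain bookkeeping. The sketch below does no actual encryption: the `barrier` and `entry` types and their fields are hypothetical, and each entry simply records which key term it was last written under.

```go
package main

import (
	"fmt"
	"sort"
)

// entry records which key term protects a stored value (no real crypto here).
type entry struct {
	term  uint32
	value string
}

// barrier is a toy model: a keyring of term -> algorithm plus stored entries.
type barrier struct {
	keys    map[uint32]string
	active  uint32
	entries map[string]entry
}

// get returns the value and lazily rewrites it under the active key term
// if it was last written under an older one.
func (b *barrier) get(path string) (string, bool) {
	e, ok := b.entries[path]
	if !ok {
		return "", false
	}
	if e.term != b.active {
		e.term = b.active
		b.entries[path] = e
	}
	return e.value, true
}

// prune removes only keys that no stored entry references, and never the
// active key. It returns the removed terms in ascending order.
func (b *barrier) prune() []uint32 {
	inUse := map[uint32]bool{b.active: true}
	for _, e := range b.entries {
		inUse[e.term] = true
	}
	var removed []uint32
	for term := range b.keys {
		if !inUse[term] {
			delete(b.keys, term)
			removed = append(removed, term)
		}
	}
	sort.Slice(removed, func(i, j int) bool { return removed[i] < removed[j] })
	return removed
}

func main() {
	b := &barrier{
		keys:    map[uint32]string{1: "aes-gcm", 2: "xchacha20-poly1305"},
		active:  2,
		entries: map[string]entry{"a": {1, "x"}, "b": {1, "y"}},
	}
	b.get("a")
	fmt.Println("removed after reading a:", b.prune())
	b.get("b")
	fmt.Println("removed after reading b:", b.prune())
}
```

A real implementation would need a reliable way to know that no entry still references a key, which is the hard part this model glosses over by scanning an in-memory map.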

Right now though, I'm not inclined to do this migration. If/when AES-GCM-DNDK is standardized, this would give us enough justification to allow doing this migration in-place (as it retains the FIPS properties of AES-GCM and gives us the nice nonce-misuse resistant properties of XChaCha20).

@DanGhita

DanGhita commented Apr 4, 2024

I'm completely aligned with your thoughts, @cipherboy.
And indeed, having explicit information about the key type might be necessary.

One last thought: in the old-school languages/approaches, we used to define a versioning byte as the first field of a structure/blob. This versioning information is useful for implementing a specific behavior for a specific version or when updating from an older version. But I'm not sure if this kind of approach is still relevant in Go/JSON world.

@cipherboy

> One last thought: in the old-school languages/approaches, we used to define a versioning byte as the first field of a structure/blob. This versioning information is useful for implementing a specific behavior for a specific version or when updating from an older version. But I'm not sure if this kind of approach is still relevant in Go/JSON world.

JSON blobs in Go don't need to be explicitly versioned; there's enough dynamic parsing/serialization that you can freely add new fields or fall back to incremental parsing when you want to change types of fields. :-) We also only (generally) ensure one-way compatibility (old->new) with upgrades. There are no messages being sent between different clients, so we don't really need an explicit versioned field here either.
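As a small illustration of the point about adding fields, here is a sketch using `encoding/json`. The `keyring` struct and its JSON field names are invented for the example and do not reflect the actual keyring schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyring is a hypothetical, trimmed-down keyring blob. Algorithm is the
// newly added field; older blobs simply do not contain it.
type keyring struct {
	ActiveTerm uint32 `json:"active_term"`
	Algorithm  string `json:"algorithm,omitempty"`
}

// loadKeyring parses a blob and upgrades older ones by filling in the
// default algorithm when the field is absent.
func loadKeyring(raw []byte) (keyring, error) {
	var k keyring
	if err := json.Unmarshal(raw, &k); err != nil {
		return keyring{}, err
	}
	if k.Algorithm == "" {
		k.Algorithm = "aes-gcm"
	}
	return k, nil
}

func main() {
	old, _ := loadKeyring([]byte(`{"active_term":3}`))
	cur, _ := loadKeyring([]byte(`{"active_term":4,"algorithm":"xchacha20-poly1305"}`))
	fmt.Printf("%+v\n%+v\n", old, cur)
}
```

An old blob parses without error and picks up the default, which is the one-way (old->new) compatibility described above.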
