Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Vault autounseal #365

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

feat: Vault autounseal #365

wants to merge 12 commits into from

Conversation

DanielArndt
Copy link
Member

@DanielArndt DanielArndt commented May 10, 2024

Adds the vault-autounseal interface to the vault-k8s charm which allows any vault-k8s app to integrate with any other app that supports the vault-autounseal interface (just vault-k8s, for now). This enables autounseal for the requirer side of the relation.

Caveats / Consequences

The biggest thing to note here, and we can discuss further if necessary, is that there is no way to migrate from autounseal back to manual unseal in this implementation.

  • Manual unseal -> Auto unseal works as expected
  • Auto unseal -> Manual unseal works only if you didn't run the command to migrate to auto-unseal. i.e. you integrated the two apps, but maybe you made a mistake so you remove the relation without running vault unseal -migrate ...

The reason this wasn't straightforward to implement is because to remove auto-unseal, you need to

  1. Add a disabled = true key-value to the transit stanza
  2. Reboot vault
  3. Run the vault unseal -migrate commands
  4. (optional): remove the transit stanza

In order to implement this we would need to create a new action that stores some state in the charm so that it knows to keep the disabled = true key-value in the config. We can't do this after the relation is removed, and we can't do this while the relation is being removed (because we would need to stop and wait for the user to do something).

For now, we keep the auto-unseal key so that it could theoretically be restored, although there is no process to do this yet.

Usage / Testing

Deploy two vaults:

juju deploy ./vault-k8s_ubuntu-22.04-amd64.charm vault-a --trust -n 1 --resource=vault-image=ghcr.io/canonical/vault:1.15.6
juju deploy ./vault-k8s_ubuntu-22.04-amd64.charm vault-b --trust -n 1 --resource=vault-image=ghcr.io/canonical/vault:1.15.6

Initialize and unseal one of the vaults

vault_app_ip=...
VAULT_ADDR="https://${vault_app_ip}:8200"
export VAULT_ADDR
vault operator init | tee vault-a-init.txt
export VAULT_TOKEN=$(awk '/Root Token/ { print $4}' vault-a-init.txt)
head vault-a-init.txt -n3 | cut -d' ' -f 4 | xargs -L1 vault operator unseal

Authorize the Vault charm

juju run vault-a/leader authorize-charm token="$(awk '/Root Token/ { print $4}' vault-a-init.txt)"

Integrate the two vaults

juju integrate vault-a:vault-autounseal-provides vault-b:vault-autounseal-requires

Initialize vault-b

vault_app_ip=...
VAULT_ADDR="https://${vault_app_ip}:8200"
export VAULT_ADDR
vault operator init | tee vault-b-init.txt

Scale vault-b and watch it autounseal

juju scale-application vault-b 2

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that validate the behaviour of the software
  • I validated that new and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@DanielArndt DanielArndt changed the title Vault autounseal feat: Vault autounseal May 10, 2024
This is how the other parts are written. Ideally, this could be run independently, but that would require a refactor.
renamed fixtures to represent the state, not the action. This freed up the `initialize_vault_leader` name for the action.
@DanielArndt DanielArndt marked this pull request as ready for review May 13, 2024 10:21
@DanielArndt DanielArndt requested a review from a team as a code owner May 13, 2024 10:21
@gruyaume

This comment was marked as resolved.

Copy link
Collaborator

@gruyaume gruyaume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a preliminary review. Overall I think this looks good, I got some questions/concerns around the token renewal and expiry process, the fact that we ask users to unseal their Vault B plus a couple of nits around typing and docstrings.

charmcraft.yaml Outdated Show resolved Hide resolved
lib/charms/vault_k8s/v0/vault_autounseal.py Outdated Show resolved Hide resolved
lib/charms/vault_k8s/v0/vault_client.py Show resolved Hide resolved
lib/charms/vault_k8s/v0/vault_tls.py Outdated Show resolved Hide resolved
lib/charms/vault_k8s/v0/vault_client.py Outdated Show resolved Hide resolved
src/charm.py Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
lib/charms/vault_k8s/v0/vault_client.py Outdated Show resolved Hide resolved
lib/charms/vault_k8s/v0/vault_client.py Outdated Show resolved Hide resolved
Comment on lines +451 to +457
def _get_autounseal_policy_name(self, relation_id: int) -> str:
"""Return the policy name for the given relation id."""
return f"charm-autounseal-{relation_id}"

def _get_autounseal_approle_name(self, relation_id: int) -> str:
"""Return the approle name for the given relation id."""
return f"charm-autounseal-{relation_id}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do those helpers belong in this lib?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the alternative would be to make the charm pass in the name of each of them (key name, policy name, approle name)? I felt like this kept the interface a little cleaner, and since it is an internal lib, we're not beholden to it.

Do you think it makes more sense to pass each one of these in as a parameter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on this, like you said, this is only be used by the charms so it's not so bad if it's charm specific.

Copy link
Contributor

@kayra1 kayra1 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this belongs in the autounseal library. KV also has its own format for approles, and it doesn't make sense to have both of them here when it's just KV and autounseal that will be using these functions.

We should put functions that are expected to be used by more than 1 dependent in this lib IMO.

@gruyaume gruyaume requested a review from a team May 13, 2024 11:42
@DanielArndt

This comment was marked as resolved.

Comment on lines +451 to +457
def _get_autounseal_policy_name(self, relation_id: int) -> str:
"""Return the policy name for the given relation id."""
return f"charm-autounseal-{relation_id}"

def _get_autounseal_approle_name(self, relation_id: int) -> str:
"""Return the approle name for the given relation id."""
return f"charm-autounseal-{relation_id}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on this, like you said, this is only be used by the charms so it's not so bad if it's charm specific.

Comment on lines +461 to +469
mount_policy = textwrap.dedent(f"""
path "{mount}/encrypt/{key_name}" {{
capabilities = ["update"]
}}

path "{mount}/decrypt/{key_name}" {{
capabilities = ["update"]
}}
""")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't we using a file/template like we do for the other hcl content?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, somehow I got in my head this is how we dealt with the dynamic policies and I was just trying to be consistent. I can see now that isn't the case. Weird. Brain glitch. I'll fix.

@@ -948,13 +1080,59 @@ def _generate_vault_config_file(self) -> None:
node_id=self._node_id,
retry_joins=retry_joins,
)
content = self._add_transit_stanza_if_needed(content)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have this outside of the render_vault_config_file? If we do this for the transit stanza, why not for the other stanzas? Wouldn't it be cleaner to have 1 approach to managing this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it probably does belong there.

key_name = "{{ key_name }}"
mount_path = "{{ mount_path }}"
token = "{{ vault_token }}"
{% if tls_ca_cert %}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there cases where we won't have a ca cert for this?

)
return config_content

def _get_transit_requirer_relation(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return typing is missing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VaultAutounsealDestroy is probably not a great event name, what happened is not that vault autounseal was destroyed, what happened is that the relation to the vault autounseal requirer was broken.

Now what the Vault autounseal provider does when this event occurs is someting else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this as a guide to naming these things: https://juju.is/docs/juju/removing-things#heading--removal-terms

)

def _on_vault_autounseal_destroy(self, event: VaultAutounsealDestroy):
"""Handle the vault-autounseal-destroy event."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the conversation about docstrings, there are likely a couple of them that could be improved and others that could be removed.

Copy link
Contributor

@kayra1 kayra1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fully onboard with the way these changes are structured, and I've made the comments below, but I also think this is a very comprehensive change that is hard to get perfect so I'm ok with not requesting changes and iteratively improving it in the future.

Comment on lines +477 to +479
def _get_autounseal_key_name(self, relation_id: int) -> str:
"""Return the key name for the given relation id."""
return str(relation_id)
Copy link
Contributor

@kayra1 kayra1 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

functions like this don't really make sense to have in a client library imo. It could make sense in the context of autounseal, but for someone reading this library, it just looks like extra words for str()

"""Return the approle name for the given relation id."""
return f"charm-autounseal-{relation_id}"

def _configure_autounseal_policy(self, mount: str, relation_id: int, key_name: str) -> str:
Copy link
Contributor

@kayra1 kayra1 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have both _configure_autounseal_policy and _configure_policy.
We should either have a generic function called _configure_policy that every consumer customizes for itself in its own file, or 3 functions called:

  • _configure_autounseal_policy,
  • _configure_charm_policy
  • _configure_kv_policy,
    but not both.

"""Return the key name for the given relation id."""
return str(relation_id)

def _create_autounseal_key(self, mount_point: str, relation_id: int) -> str:
Copy link
Contributor

@kayra1 kayra1 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

functions that do something to the client make sense to have in this file, but the naming doesn't make sense to me.

The function name is create_autounseal_key, but the function itself is creating a transit key. As far as vault is concerned, the autounseal function isn't even mentioned in the transit secrets engine page.

To me, this is not good separation of concerns.

What I would prefer is to name this function create_transit_key, and in the autounseal library a function called create_unseal_key that gets the autounseal key name and calls create_transit_key.

"""Destroy the autounseal key."""
self._client.secrets.transit.delete_key(mount_point=mount_point, name=key_name)

def destroy_autounseal_credentials(self, relation_id: int, mount: str) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the changes I suggested above, this function could be moved to the autounseal library, and each of the direct _client calls here should be a public function from vault_client.

This applies to most functions with autounseal in their names.

if self._vault_service_is_running():
self._container.restart(self._service_name)

def _add_transit_stanza_if_needed(self, config_content: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this function is doing a bit more than just adding new config options to the config content it receives, it basically is doing all the configuration that is needed for the autounseal requirer, for example pushing the CA cert and authenticating agains the provider vault.

existing_content = ""
if self._container.exists(path=VAULT_CONFIG_FILE_PATH):
existing_content_stringio = self._container.pull(path=VAULT_CONFIG_FILE_PATH)
existing_content = existing_content_stringio.read()

if not config_file_content_matches(existing_content=existing_content, new_content=content):
self._push_config_file_to_workload(content=content)
if _contains_transit_stanza(existing_content) != _contains_transit_stanza(content):
if self._vault_service_is_running():
self._container.restart(self._service_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the restart needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean why this does, but a change in the config file not related to the transit stanza doesn't?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants