
[bitnami/influxdb] Allow separate persistence cfg for backups #24586

Open · wants to merge 2 commits into main from feature/influxdb-separate-persistence-options

Conversation

kevin-delisle-peakpower

I'm back! I missed the approval of the PR last time (I was bogged down in other work and it slipped my mind), so I've rebased it and tidied it up a bit.

Description of the changes

  • Add an optional configuration section under backup that allows distinct persistence values to be specified for the backup volume. If this section is not provided, the template defaults to rendering with the existing top-level persistence configuration to maintain compatibility.

Benefits

Users can now define volumes for the Deployment and their backups separately without needing Kustomize or manual post-deployment hacks.

Example usage could include:

  • using different StorageClasses for backup volumes vs. active database volumes
  • using a larger volume for backups than for the main database volume, to accommodate multiple full backups (sketched below)
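
A rough sketch of such a values.yaml (the StorageClass names here are hypothetical, and the keys under backup.persistence are assumed to mirror the top-level persistence section):

persistence:
  enabled: true
  storageClass: premium-ssd    # hypothetical fast class for the live database
  size: 8Gi

backup:
  enabled: true
  persistence:
    enabled: true
    storageClass: standard-hdd # hypothetical cheaper class for backups
    size: 32Gi                 # room for several full backups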

Possible drawbacks

There should not be any drawbacks to this PR; I have attempted to maintain backwards compatibility with existing use cases in the event that someone provides a values.yaml file that does not include the new section.

Additional information

I've dumped the template out using permutations of both persistence modes to confirm that the backup.persistence section will take precedence over persistence for the backup volume, as well as to confirm that the templates still correctly render in the complete absence of that configuration section.

Checklist

  • Chart version bumped in Chart.yaml according to semver. This is not necessary when the changes only affect README.md files.
  • Variables are documented in the values.yaml and added to the README.md using readme-generator-for-helm
  • Title of the pull request follows this pattern [bitnami/<name_of_the_chart>] Descriptive title
  • All commits signed off and in agreement of Developer Certificate of Origin (DCO)

@github-actions github-actions bot added influxdb triage Triage is needed labels Mar 20, 2024
@github-actions github-actions bot requested a review from carrodher March 20, 2024 21:02
@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch from e10dcaf to f55bf93 Compare March 20, 2024 21:03
@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch from f55bf93 to b627d37 Compare March 20, 2024 21:13
carrodher (Member) commented Mar 21, 2024

Hi, thanks for your contribution. Could you please grant bitnami-bot access to your fork? It is needed by some automation that automatically updates the README and CRDs when needed:

remote: Permission to peakpowerenergy/bitnami-charts.git denied to bitnami-bot.

@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch from b627d37 to c75d090 Compare March 21, 2024 15:03
kevin-delisle-peakpower (Author):

@carrodher I've sent the invite with Write access to bitnami-bot.

@bitnami-bot bitnami-bot added verify Execute verification workflow for these changes in-progress labels Mar 22, 2024
@carrodher carrodher added in-progress and removed in-progress triage Triage is needed labels Mar 25, 2024
@github-actions github-actions bot removed the request for review from carrodher March 25, 2024 07:34
kevin-delisle-peakpower (Author):

@andresbono Any word on whether or not we can merge this? I don't want it to get closed due to inactivity like last time 😄

andresbono (Member) left a comment:

Thank you for your contribution! Could you please check my comments?

bitnami/influxdb/templates/pvc-backup.yaml (outdated; resolved)
bitnami/influxdb/templates/_helpers.tpl (outdated; resolved)
@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch 2 times, most recently from a3b09b2 to e07ab8b Compare April 11, 2024 16:37
kevin-delisle-peakpower (Author) commented Apr 11, 2024

Coalesce seems to do the trick; I've refactored the PR to clean up the mess I made in _helpers.tpl 😄

I ran through a few scenarios to evaluate the output of the chart:

# Should create no PVCs, use the same volume claim for both (still not sure if this makes sense)
helm template test . -f values.yaml --set backup.enabled=true --set backup.persistence.enabled=false --set persistence.existingClaim=foo

# Should create one PVC for the backup
helm template test . -f values.yaml --set backup.enabled=true --set backup.persistence.enabled=true --set persistence.existingClaim=foo

# Should create one PVC for the database
helm template test . -f values.yaml --set backup.enabled=true --set backup.persistence.enabled=true --set backup.persistence.existingClaim=foo --set persistence.size=16Gi

# Should create one PVC for the backup with size=16Gi
helm template test . -f values.yaml --set backup.enabled=true --set backup.persistence.enabled=true --set persistence.enabled=true --set persistence.existingClaim=foo --set backup.persistence.size=16Gi

# Should create no PVCs, reference existing claims for both
helm template test . -f values.yaml --set backup.enabled=true --set backup.persistence.enabled=true --set persistence.existingClaim=foo --set backup.persistence.existingClaim=bar

# Should only use foo and ignore bar (since backup.persistence.enabled=false)
helm template test . -f values.yaml --set backup.enabled=true --set backup.persistence.enabled=false --set persistence.existingClaim=foo --set backup.persistence.existingClaim=bar

I am still worried that the default behaviour for many (most?) consumers using this chart would be to have the database's PVC set to ReadWriteOnce mode, which would prevent the CronJob from ever launching.

Ultimately, if the Bitnami team thinks it's okay to swap the CronJob to using a separate persistent volume as the default for backups, I can change the default value of backup.persistence.enabled back to true. You could argue that it's a breaking change, though, since anyone who's been consuming this chart might find that any additional scripts they've written now need to run against a different target.
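
(Purely illustrative, assuming backup.persistence mirrors the top-level accessModes parameter: giving the backup CronJob its own claim avoids contending for a ReadWriteOnce database volume.)

persistence:
  enabled: true
  accessModes:
    - ReadWriteOnce      # database claim stays RWO, bound to the Deployment
backup:
  enabled: true
  persistence:
    enabled: true        # dedicated claim for the backup CronJob
    accessModes:
      - ReadWriteOnce    # safe here, since it is no longer shared with the Deployment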

@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch 2 times, most recently from 0448d6d to c604d8b Compare April 15, 2024 17:46
andresbono (Member) left a comment:

This looks great now! Thank you so much. Could you check my suggestions and run your helm template scenarios again?

The main point of my suggestions is to introduce a new, different parameter, backup.persistence.ownConfig, to control which set of persistence parameters is applied. I consider it important to keep this separate from backup.persistence.enabled to cover the case in which you want persistence for the database but not for the backup, for example.

As we are dealing with application persisted data here, we should be extra safe :)
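
A sketch of that case, assuming the proposed ownConfig key:

persistence:
  enabled: true       # keep a PVC for the database
backup:
  enabled: true
  persistence:
    ownConfig: true   # apply this block instead of the top-level persistence settings
    enabled: false    # but no PVC for the backup volume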

Comment on lines 21 to 25

  - name: common
    repository: oci://registry-1.docker.io/bitnamicharts
    tags:
      - bitnami-common
    version: 2.x.x

Suggested change:

- name: common
  repository: oci://registry-1.docker.io/bitnamicharts
  tags:
  - bitnami-common
  version: 2.x.x

Comment on lines 30 to 33

  - influxdb
  - tick
  - database
  - timeseries

Suggested change:

- influxdb
- tick
- database
- timeseries

Comment on lines 35 to 36

  - name: VMware, Inc.
    url: https://github.com/bitnami/charts

Suggested change:

- name: VMware, Inc.
  url: https://github.com/bitnami/charts

name: influxdb
sources:
  - https://github.com/bitnami/charts/tree/main/bitnami/influxdb
version: 6.0.6

Suggested change:

- https://github.com/bitnami/charts/tree/main/bitnami/influxdb

The YAML is still valid with both levels of indentation, but it will be reverted in the next automatic update, causing unnecessary commit changes.

kevin-delisle-peakpower (Author):

Yeah, my YAML formatting plugin has been aggressively asserting itself. I'll change them back.

@@ -129,6 +129,8 @@ As an alternative, you can use of the preset configurations for pod affinity, po
The data is persisted by default using PVC(s). You can disable the persistence setting the `persistence.enabled` parameter to `false`.
A default `StorageClass` is needed in the Kubernetes cluster to dynamically provision the volumes. Specify another StorageClass in the `persistence.storageClass` or set `persistence.existingClaim` if you have already existing persistent volumes to use.

If you would like to define persistence settings for a backup volume that differ from the persistence settings for the database volume, you may do so under the `backup.persistence` section of the configuration. If this section is undefined, but `backup.enabled` is set to true, the backup volume will be defined using the `persistence` parameter section.

Suggested change:

If you would like to define persistence settings for a backup volume that differ from the persistence settings for the database volume, you may do so under the `backup.persistence` section of the configuration by setting `backup.persistence.ownConfig` to `true`. The backup volume will otherwise be defined using the `persistence` parameter section.

kevin-delisle-peakpower (Author):

Just to clarify: if someone doesn't set backup.persistence.ownConfig: true, and they have persistence.existingClaim set to some volume, should the effective value of backup.persistence.existingClaim be the same? That would lead to binding of the same target volume, which is the RWO problem I mentioned.

andresbono (Member):

If someone has backup.persistence.ownConfig: false (the proposed default, at least in this first version), the chart will have the very same behavior as the current version, and any backup.persistence.* parameter will be ignored.

Any bugs present in the current version will still be there unless you explicitly set backup.persistence.ownConfig: true. Then, you have all the flexibility you want, right?
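
In other words (illustrative, using the proposed key), opting in would look like:

backup:
  enabled: true
  persistence:
    ownConfig: true   # opt in: the parameters below now take effect
    enabled: true
    size: 32Gi        # hypothetical size, independent of the database volume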

Comment on lines +5 to +12

{{- /*
Prefer .Values.backup.persistence, but fall back to .Values.persistence if not present.
*/ -}}
{{- if .Values.backup.enabled }}
{{ $persistence := coalesce .Values.backup.persistence .Values.persistence }}
{{ if and $persistence.enabled (not $persistence.existingClaim) }}

Suggested change:

{{- $persistence := ternary .Values.backup.persistence .Values.persistence .Values.backup.persistence.ownConfig }}
{{- if and .Values.backup.enabled $persistence.enabled (not $persistence.existingClaim) }}

Comment on lines +16 to +18

name: {{ include "common.names.fullname" $ }}-backups
namespace: {{ include "common.names.namespace" $ | quote }}
labels: {{- include "common.labels.standard" ( dict "customLabels" $.Values.commonLabels "context" $ ) | nindent 4 }}

I would keep the original contexts.

Suggested change:

name: {{ include "common.names.fullname" . }}-backups
namespace: {{ include "common.names.namespace" . | quote }}
labels: {{- include "common.labels.standard" ( dict "customLabels" .Values.commonLabels "context" $ ) | nindent 4 }}

Comment on lines +20 to +21

{{- if or $persistence.annotations $.Values.commonAnnotations }}
{{- $annotations := include "common.tplvalues.merge" ( dict "values" ( list $persistence.annotations $.Values.commonAnnotations ) "context" $ ) }}

Suggested change:

{{- if or $persistence.annotations .Values.commonAnnotations }}
{{- $annotations := include "common.tplvalues.merge" ( dict "values" ( list $persistence.annotations .Values.commonAnnotations ) "context" . ) }}

persistence:
  ## @param backup.persistence.enabled Enable data persistence for backup volume
  ##
  enabled: false

Suggested change:

  enabled: true


## Persistence parameters
##
persistence:

Suggested change:

persistence:
  ## @param backup.persistence.ownConfig Prefer independent own persistence parameters to configure the backup volume
  ## When set to `false` (for backwards compatibility), the rest of the persistence parameters below will be ignored.
  ## This parameter will be set to `true` and removed in a future release.
  ##
  ownConfig: false

kevin-delisle-peakpower (Author) commented Apr 17, 2024

The main point of my suggestions is to introduce a new, different parameter, backup.persistence.ownConfig, to control which set of persistence parameters is applied. I consider it important to keep this separate from backup.persistence.enabled to cover the case in which you want persistence for the database but not for the backup, for example.

As we are dealing with application persisted data here, we should be extra safe :)

I'm a bit confused; if someone wants persistence for the database and not the backup, they would set persistence.enabled: true (which is the default) and backup.persistence.enabled: false. If someone sets backup.persistence.enabled: true, the templates for the backup use the values (or defaults) under the backup.persistence section, not the persistence section (or at least, that's what I was aiming for).

If I'm missing the point, could you give me some contrived values.yaml examples to help me see the issue?

EDIT: Scratch that, I think I get what you're going for here. This is mostly to avoid duplicating the default arguments if someone just wants two volumes that fit an identical set of values (or the defaults).

I know that with Helm charts you want sane defaults that leave the deployment in a stable, working state, but that's sort of what I'm hung up on. I do agree that something like this would help prevent scenarios like upgrading an installation that uses a 1Ti volume with a separate backup volume set to 8Gi (and similar issues), but it would also create issues for anyone using persistence.existingClaim with ReadWriteOnce volumes by default.

As far as I can tell, avoiding either one of these pitfalls requires providing unique configuration for the backup volume in the values.yaml used for the chart on deployment (either providing backup.persistence.existingClaim to avoid collision with the database's claim, or providing configuration underneath the backup.persistence section to ensure it meets your needs in terms of size, storage class, etc.).

I think having the backups as a separate volume with separate, non-identical configuration is a more reasonable default that reflects the realistic expectation that multiple copies of the target database may end up needing more than that database volume's storage capacity. If someone wants to use the same volume for both on a new deployment, they can provide identical targets for persistence.existingClaim and backup.persistence.existingClaim. Alternatively, if they're upgrading a deployment and wish to use the same volume, they can specify the generated name of the existing database volume in backup.persistence.existingClaim to do so.
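
(Illustrative, with a hypothetical claim name:) reusing one volume for both would then look like:

persistence:
  existingClaim: influxdb-data     # hypothetical pre-provisioned claim
backup:
  enabled: true
  persistence:
    ownConfig: true
    existingClaim: influxdb-data   # same claim, so backups share the database volume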

Regardless, I think this makes a strong argument for a major version bump of the chart, and a step in an upgrade guide to avoid whichever pitfall is created by the approach we settle on.

Side-note: I do think that if we were to keep distinct values for the backup.persistence section, it would also make sense to get rid of backup.persistence.enabled, on the assumption that backup.enabled: true implies the existence of a separate volume (and backup.persistence.existingClaim would be your way of changing that behaviour).

@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch from 93d634f to 5b7b7b5 Compare April 17, 2024 19:29
andresbono (Member):

but it would also create issues for anyone using persistence.existingClaim with ReadWriteOnce volumes by default.

I feel like I still don't fully understand the concern here. Could you give me an example of a values.yaml that could cause trouble?

Regardless, I think this makes a strong argument for a major version bump of the chart, and a step in an upgrade guide to avoid whichever pitfall is created by the approach we settle on.

I would do the change incrementally. We can go ahead with the PR and backup.persistence.ownConfig: false by default to be safe, and then in the next major version (maybe grouped with other changes) we get rid of the backup.persistence.ownConfig flag and change the default behavior. ➕ to the upgrade guide. We usually do that as part of the major version upgrade notes in the README.

Does it make sense to you?

kevin-delisle-peakpower (Author) commented Apr 26, 2024

@andresbono Sorry, I've been really busy all of this week with work, but I haven't forgotten about this PR.

My concerns about ReadWriteOnce are probably misplaced. Technically, as long as the node the volume is bound to has compute and memory available for the CronJob (and has the right tolerations), this should still work.

I was mostly worried because I've previously seen behaviour in which certain drivers treat ReadWriteOnce in the way the newer ReadWriteOncePod feature is intended to work: bottlerocket-os/bottlerocket#2417 (comment)

When I was testing this out myself (EKS v1.23 with aws-ebs-csi for storage), the cluster was throwing VolumeAttachment errors. I suspect it wasn't scheduling the CronJob on the same node as the Deployment, but that's an issue outside the scope of this chart.

@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch from 89571cd to 29062a5 Compare April 26, 2024 20:05
andresbono (Member):

@andresbono Sorry, I've been really busy all of this week with work, but I haven't forgotten about this PR.

No problem at all! We totally understand.

github-actions (bot):

This Pull Request has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thank you for your contribution.

@github-actions github-actions bot added the stale 15 days without activity label May 15, 2024
Add an optional configuration section under backup that allows
the specification of distinct values for persistence.

If this configuration section is not provided, the template will
default to rendering using the existing top-level persistence
configuration section to maintain compatibility.

Signed-off-by: Kevin Delisle <kjdelisle@gmail.com>
@kevin-delisle-peakpower kevin-delisle-peakpower force-pushed the feature/influxdb-separate-persistence-options branch from 88b2783 to b05670b Compare May 15, 2024 15:04
Signed-off-by: Bitnami Containers <bitnami-bot@vmware.com>
andresbono (Member):

Hi @kevin-delisle-peakpower, if you prefer, you can grant me access to your fork and I can help push the proposed suggestions. Let me know. Thank you!

kevin-delisle-peakpower (Author) commented May 16, 2024

Honestly, I would love that. I know this has been taking too long and I've been swamped with a bunch of OIDC and AWS IAM refactoring work. I've granted you write access to the repo.

Thank you 🙏

github-actions (bot):

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Pull Request. Do not hesitate to reopen it later if necessary.

Labels: in-progress · influxdb · solved · stale (15 days without activity) · verify (Execute verification workflow for these changes)

5 participants