Ensure that binary logs for PITR are in a shared directory #541

Open
wants to merge 24 commits into
base: main
from

Conversation

mattlord
Contributor

@mattlord mattlord commented Mar 12, 2024

When executing the vtctldclient RestoreFromBackup --restore-to-pos <value> command, the vttablet process (running in the vttablet container within the vttablet pod) handles the RestoreFromBackup tabletmanager RPC. Using the configured backup engine (e.g. xtrabackup), it restores the full backup within the VTDATAROOT (specifically /vt/vtdataroot/vt_<tabletUID>/ for the mysql data), which is shared by all containers within the pod. It orchestrates this in conjunction with the mysqlctld process that's running inside the mysqld container within the same vttablet pod. In the end there is a running mysqld instance inside the mysqld container that is from the restored full backup.

Once the full backup is in place and the mysqld process is running, the vttablet process uses the builtinbackupengine to restore the binary logs from the backup into the OS tmp dir, /tmp, for subsequent application. However, /tmp is not a shared mount point within the pod so when mysqlbinlog subsequently tries to read them from within the mysqld container it cannot find them in its container's /tmp directory and it fails with an error.

vtctldclient

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/cmd/vtctldclient/command/backups.go#L227-L263

vtctld[server]

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/vtctl/grpcvtctldserver/server.go#L3260-L3286

vttablet

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/vttablet/tabletmanager/rpc_backup.go#L173-L193

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/vttablet/tabletmanager/restore.go#L191-L273

mysqlctld (rather than mysqlctl, and which runs in the mysql container)

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/mysqlctl/backup.go#L364-L487

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/mysqlctl/builtinbackupengine.go#L995-L1060

vttablet builtinbackupengine

https://github.com/vitessio/vitess/blob/3ae5cf7e690e560dd5630119215bcc3f5ecf31c8/go/vt/mysqlctl/builtinbackupengine.go#L995-L1060

Related issues and PRs:

@mattlord mattlord force-pushed the point-in-time-recovery branch 5 times, most recently from 1824f91 to 0562d49 Compare March 13, 2024 01:15
@mattlord mattlord requested review from shlomi-noach, frouioui and GuptaManan100 and removed request for GuptaManan100 March 13, 2024 01:15
@mattlord mattlord changed the title Ensure that binary logs for PITR are restored to a shared location Ensure that binary logs for PITR are use a shared location Mar 13, 2024
@mattlord mattlord changed the title Ensure that binary logs for PITR are use a shared location Ensure that binary logs for PITR are use a shared directory Mar 13, 2024
@mattlord mattlord changed the title Ensure that binary logs for PITR are use a shared directory Ensure that binary logs for PITR are in a shared directory Mar 13, 2024
Signed-off-by: Matt Lord <mattalord@gmail.com>
Collaborator

@shlomi-noach shlomi-noach left a comment

Nice! Should we also provide the flag in yaml files?

Signed-off-by: Matt Lord <mattalord@gmail.com>
@mattlord
Contributor Author

Nice! Should we also provide the flag in yaml files?

Yeah. I think this does it. e2e5e8b

@shlomi-noach
Collaborator

shlomi-noach commented Mar 17, 2024

Yeah. I think this does it.

How is the value being set, and to what specific value?

@mattlord
Contributor Author

How is the value being set, and to what specific value?

The user would specify the flag and value in their cluster yaml definition using the extraFlags parameter, just as they do for mysqld flags, e.g. If they don't specify a value then we enforce the default within the operator.
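
A rough Go sketch of that "user value wins, otherwise the operator default applies" behavior follows. The flag name comes from this PR; the withDefaultRestorePath helper and the /mnt/shared value are hypothetical and only illustrate the idea.

```go
package main

import "fmt"

// Flag name as used in this PR.
const incrementalRestorePathFlag = "builtinbackup-incremental-restore-path"

// withDefaultRestorePath is a hypothetical helper. It copies the user-supplied
// extra flags and sets the restore path to sharedDir only when the user did
// not provide a value for that flag.
func withDefaultRestorePath(extraFlags map[string]string, sharedDir string) map[string]string {
	out := make(map[string]string, len(extraFlags)+1)
	for name, value := range extraFlags {
		out[name] = value
	}
	if _, ok := out[incrementalRestorePathFlag]; !ok {
		out[incrementalRestorePathFlag] = sharedDir
	}
	return out
}

func main() {
	// No user value: the default is enforced.
	fmt.Println(withDefaultRestorePath(nil, "/vt/vtdataroot")[incrementalRestorePathFlag])
	// prints "/vt/vtdataroot"

	// User value present (hypothetical path): it is left untouched.
	user := map[string]string{incrementalRestorePathFlag: "/mnt/shared"}
	fmt.Println(withDefaultRestorePath(user, "/vt/vtdataroot")[incrementalRestorePathFlag])
	// prints "/mnt/shared"
}
```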

Collaborator

@shlomi-noach shlomi-noach left a comment

Seems to me like it's feature complete and can be taken out of Draft?

@mattlord mattlord marked this pull request as ready for review March 18, 2024 12:18
@shlomi-noach shlomi-noach requested a review from a team March 20, 2024 06:28
@mattlord
Contributor Author

How is the value being set, and to what specific value?

The user would specify the flag and value in their cluster yaml definition using the extraFlags parameter, just as they do for mysqld flags, e.g. If they don't specify a value then we enforce the default within the operator.

The flag ended up being for vttablet and vtbackup, not mysqlctld (although vtbackup is a modified mysqlctld). I will leave the mysqlctld extra flags support though as that may come to be useful.

Signed-off-by: Matt Lord <mattalord@gmail.com>
Comment on lines 90 to 94
	// Ensure that binary logs are restored to/from a location that all containers
	// in the pod can access if no location was explicitly provided.
	if _, ok := vttabletAllFlags["builtinbackup-incremental-restore-path"]; !ok {
		vttabletAllFlags["builtinbackup-incremental-restore-path"] = vtDataRootPath
	}
Member

What would happen if the path specified in --builtinbackup-incremental-restore-path is not accessible to all containers in the pod?

Member

I am guessing it is up to the user to set the same value on all components too (mysqlctl, vttablet and vtbackup)?

Contributor Author

@mattlord mattlord Mar 27, 2024

The same thing that happens now to every user. It doesn't work. PITR does not generally work in the operator today.

@@ -29,6 +29,7 @@ const (
vtRootInitScript = `set -ex
mkdir -p /mnt/vt/bin
cp --no-clobber /vt/bin/mysqlctld /mnt/vt/bin/
cp --no-clobber $(command -v mysqlbinlog) /mnt/vt/bin/ || true
Member

The directory /mnt/.../ is shared across all the containers in the pod I am assuming? In which case, this line would resolve what you wrote in the PR's description:

/tmp is not a shared mount point within the pod so when mysqlbinlog subsequently tries to read them from within the mysqld container it cannot find them in its container's /tmp directory and it fails with an error

Contributor Author

@mattlord mattlord Mar 27, 2024

This is simply about copying the mysqlbinlog binary from the vitess/lite container image to the mysqlctld/vtbackup container (if it's not already there). It looks like we'll need to keep that binary in the lite image, because the MySQL images do not contain it and it's needed for PITR.

Comment on lines +318 to +329
// MysqlctldSpec configures the local mysqlctld gRPC server within a tablet.
type MysqlctldSpec struct {
	// Resources specify the compute resources to allocate for just the MySQL Control Daemon.
	Resources corev1.ResourceRequirements `json:"resources"`

	// ExtraFlags can optionally be used to override default flags set by the
	// operator, or pass additional flags to mysqlctld. All entries must be
	// key-value string pairs of the form "flag": "value". The flag name should
	// not have any prefix (just "flag", not "-flag"). To set a boolean flag,
	// set the string value to either "true" or "false".
	ExtraFlags map[string]string `json:"extraFlags,omitempty"`
}
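
For illustration, a map in the "flag": "value" form described in that comment could be turned into command-line arguments roughly as below. This is a sketch under assumptions, not the operator's code: the flagArgs helper, the "--name=value" rendering, and the example flag names are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// flagArgs is a hypothetical helper that renders a flag map as
// "--name=value" arguments. Names are sorted so the result is deterministic,
// since Go map iteration order is not.
func flagArgs(flags map[string]string) []string {
	names := make([]string, 0, len(flags))
	for name := range flags {
		names = append(names, name)
	}
	sort.Strings(names)

	args := make([]string, 0, len(names))
	for _, name := range names {
		args = append(args, "--"+name+"="+flags[name])
	}
	return args
}

func main() {
	// Hypothetical flag names; keys carry no "-" prefix and booleans are
	// passed as the strings "true" or "false".
	extraFlags := map[string]string{
		"example-flag": "value",
		"example-bool": "true",
	}
	fmt.Println(flagArgs(extraFlags))
	// prints "[--example-bool=true --example-flag=value]"
}
```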
Member

Unless I am missing something, this does not seem to be used anywhere. I think we should remove it to avoid unnecessary changes.

Contributor Author

I'll remove it if it indeed is not needed after I do some testing to confirm that things are now working.

Member

Coming back to what I said: I think it would be nice to keep that around so that people can use custom mysqlctld flags.

Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
@mattlord mattlord force-pushed the point-in-time-recovery branch 4 times, most recently from 4abb9ce to a2d80d2 Compare March 27, 2024 21:45
Signed-off-by: Matt Lord <mattalord@gmail.com>
mattlord and others added 2 commits March 27, 2024 23:26
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
shlomi-noach and others added 7 commits March 28, 2024 08:26
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
This includes mysqld (of course) and mysqlbinlog

But it does NOT include xtrabackup

Signed-off-by: Matt Lord <mattalord@gmail.com>
3 participants