TiDB Cloud Docs: Add documentation for new Recovery Group feature #17425

Open
wants to merge 12 commits into
base: master
from

Conversation

benmeadowcroft
Collaborator

@benmeadowcroft benmeadowcroft commented May 8, 2024

What is changed, added or deleted? (Required)

Recovery Groups are a new feature for TiDB Cloud. This PR contains the initial documentation for this feature, for review by the docs team.

Which TiDB version(s) do your changes apply to? (Required)

  • master (the latest development version)
  • v8.1 (TiDB 8.1 versions)
  • v8.0 (TiDB 8.0 versions)
  • v7.6 (TiDB 7.6 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)
  • v5.3 (TiDB 5.3 versions)
  • v5.2 (TiDB 5.2 versions)
  • v5.1 (TiDB 5.1 versions)

What is the related PR or file link(s)?

  • This PR is translated from:
  • Other reference link(s):

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

benmeadowcroft and others added 2 commits April 23, 2024 17:28
This is a new feature that requires documentation to explain the core concepts to users.

Signed-off-by: Ben Meadowcroft <ben.meadowcroft@pingcap.com>
@ti-chi-bot ti-chi-bot bot added contribution Indicates that the PR was contributed by an external member. missing-translation-status This PR does not have translation status info. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 8, 2024

- Recovery Group: a group of databases that are replicated between two clusters
- Primary Cluster: the cluster where the database is actively written by the application
- Secondary Cluster: the cluster where replicas of the database are located
Contributor

I think we should remind users that we do not guarantee that replica databases on the secondary cluster are read-only, so users need to ensure that they do not write to them.

Collaborator Author

Addressed in da21f75


7. Select which databases you wish to replicate as part of this recovery group.

> **Note**
Contributor

@grovecai grovecai May 8, 2024

We should also remind users here that:

  1. We do not support replicating system tables to the secondary cluster, even if users select All.
  2. During creation, a large amount of data is exported from the primary cluster and imported into the secondary cluster in the backend, which might impact online queries on both clusters. Remind users to create the recovery group when traffic is low.
  3. All the selected databases on the secondary cluster are cleaned up before replication is established, to ensure data consistency. If that data matters, users should back it up first.
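
To make the third point concrete, here is a minimal Python sketch of a pre-flight check a user could run before creating the recovery group. It only compares two lists of names that the user supplies; the function name and the case-insensitive comparison are assumptions for illustration, not part of TiDB Cloud.

```python
def databases_to_back_up(selected, existing_on_secondary):
    """Return the selected databases that already exist on the secondary cluster.

    These are the databases whose current contents on the secondary would be
    cleaned up before replication is established, so they are the candidates
    for a backup. Names are compared case-insensitively (an assumption).
    """
    existing = {name.casefold() for name in existing_on_secondary}
    # De-duplicate the selection, then keep only names present on the secondary.
    return sorted(name for name in set(selected) if name.casefold() in existing)


if __name__ == "__main__":
    # Hypothetical database names, for illustration only.
    selected = ["app", "Orders", "logs"]
    on_secondary = ["orders", "metrics"]
    print(databases_to_back_up(selected, on_secondary))
```

An empty result would suggest that nothing on the secondary cluster is overwritten by the selection.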

Collaborator Author

Addressed in da21f75


# Failover and Reprotect Databases

Databases that are part of a replication group are replicated from one cluster to another (typucally in a different region of the cloud service provider).
Contributor

typo: typucally

Collaborator Author

Addressed in da21f75


If the cluster that was impacted by the disaster is able to be brought online again, a replication relationship from the recovery region back to the original region can be established. This is performed using the **Reprotect** action.

![Unprotected Recovery Group](/media/tidb-cloud/recovery-group/recovery-group-unprotected.png)
Contributor

Unprotected => Reprotected

Collaborator Author

This is the diagram from before the reprotect operation is performed (so the databases at this point are unprotected). The diagram later in this section is the diagram from after the reprotect operation is performed.


3. On the recovery group page, click the name of the recovery group that you wish to reprotect.

> **Note**
Contributor

@grovecai grovecai May 8, 2024

We should also remind users that:

  1. During reprotecting, a large amount of data is exported from the primary cluster and imported into the secondary cluster in the backend, which might impact online queries on both clusters. Remind users to perform the operation when traffic is low.
  2. All the selected databases on the secondary cluster are cleaned up before replication is established, to ensure data consistency. If that data matters, users should back it up first.

Collaborator Author

Addressed in da21f75


This document describes how to create a Recovery Group to protect your databases running on TiDB Cloud Dedicated Clusters using the TiDB Cloud user interface. It also shows how to view details of the recovery group.

## Prerequisites
Contributor

After a recovery group is created, we create an account (named in the pattern cloud-rg-*) in the secondary cluster, which is used to replicate data into the secondary cluster. We should remind users not to touch this account: if it is deleted by mistake, replication will be interrupted.
I am not sure where to put this reminder, so I am just commenting here.
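
As an illustration of the reminder, here is a minimal Python sketch of a guard that a user's account-cleanup script could apply. The only detail taken from this thread is the `cloud-rg-*` name pattern; the function names are hypothetical, and the sketch does not talk to a cluster.

```python
from fnmatch import fnmatchcase

# Name pattern of the replication accounts, as described in the comment above.
RESERVED_ACCOUNT_PATTERN = "cloud-rg-*"


def is_recovery_group_account(user_name):
    """Return True if the account name matches the reserved pattern."""
    return fnmatchcase(user_name, RESERVED_ACCOUNT_PATTERN)


def accounts_safe_to_drop(user_names):
    """Filter out reserved accounts from a list of drop candidates."""
    return [name for name in user_names if not is_recovery_group_account(name)]


if __name__ == "__main__":
    # Hypothetical account names, for illustration only.
    candidates = ["cloud-rg-example", "etl_job", "old_reporting"]
    print(accounts_safe_to_drop(candidates))
```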

Collaborator Author

Addressed in da21f75


> **Note**
>
> Currently only one resiliency level is supported.
Contributor

Can you add an explanation of what resiliency level means for users? What is the expected behavior for the only supported resiliency level?

Signed-off-by: Ben Meadowcroft <ben.meadowcroft@pingcap.com>
@hfxsd hfxsd self-assigned this May 15, 2024
@hfxsd hfxsd added the translation/no-need No need to translate this PR. label May 15, 2024
@ti-chi-bot ti-chi-bot bot removed the missing-translation-status This PR does not have translation status info. label May 15, 2024
@hfxsd hfxsd added the needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. label May 15, 2024
@hfxsd hfxsd self-requested a review May 15, 2024 08:52
Signed-off-by: Ben Meadowcroft <ben.meadowcroft@pingcap.com>

ti-chi-bot bot commented May 20, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from hfxsd, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hfxsd hfxsd requested a review from qiancai May 22, 2024 08:31
@hfxsd hfxsd removed the contribution Indicates that the PR was contributed by an external member. label May 22, 2024

ti-chi-bot bot commented May 23, 2024

@grovecai: adding LGTM is restricted to approvers and reviewers in OWNERS files.


Comment on lines 73 to 76
After creating the recovery group, you might want to familiarize yourself with the failover and reprotect operations. These operations are used to **Failover** the primary cluster for the replicated databases from one cluster to the other, and then to later reestablish replication in the opposite direction to **Reprotect** the failed over databases.

- [Failover Databases](/tidb-cloud/recovery-group-failover.md)
- [Failover and Reprotect Databases](/tidb-cloud/recovery-group-failover.md)
Collaborator

@qiancai qiancai May 23, 2024

As this doc is about Failover and Reprotect, do we need to remove "What's next" from this doc?

@qiancai qiancai added the area/tidb-cloud This PR relates to the area of TiDB Cloud. label May 23, 2024
Co-authored-by: Grace Cai <qqzczy@126.com>
Collaborator

@qiancai qiancai left a comment

Rest LGTM


ti-chi-bot bot commented May 23, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-05-23 07:09:17.006398935 +0000 UTC m=+2328310.763534506: ☑️ agreed by hfxsd.
  • 2024-05-23 07:31:04.640999069 +0000 UTC m=+2329618.398134641: ☑️ agreed by qiancai.

Co-authored-by: Grace Cai <qqzczy@126.com>

# Recovery Group Billing

TiDB Cloud bills for recovery groups based on the deployed size of your TiKV nodes in the primary cluster of the recovery group. When you [create a recovery group](/tidb-cloud/recovery-group-get-started.md) for a cluster, you can select the primary cluster for the recovery group. The larger the TiKV configuration, the higher the cost for recovery group protection.
Contributor

Please add the data processing part.

Collaborator Author

Added in 4eb89d9

Contributor

I think we should not tell customers how we price data processing (“This data processing cost includes the cross-region and cross-AZ traffic charges”). Telling them that there is a data processing cost charged per GB is enough.

Collaborator Author

Okay, updated in 6bd256f
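
The two billing components discussed in this thread (a charge that grows with the deployed TiKV size of the primary cluster, plus a data processing charge per GB) can be sketched as a simple sum. This is a rough Python illustration only: the function name, the size unit, and both rates are placeholders supplied by the caller, and real prices and billing periods are not stated in this PR.

```python
from decimal import Decimal


def estimate_recovery_group_cost(tikv_size_units, size_rate, processed_gb, per_gb_rate):
    """Estimate a recovery group charge from two caller-supplied components.

    tikv_size_units and size_rate are placeholders for however TiKV size is
    measured and priced; processed_gb and per_gb_rate model the per-GB data
    processing charge. None of these values come from TiDB Cloud pricing.
    """
    # Convert through str so that float inputs do not carry binary rounding noise.
    protection = Decimal(str(tikv_size_units)) * Decimal(str(size_rate))
    processing = Decimal(str(processed_gb)) * Decimal(str(per_gb_rate))
    return protection + processing


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(estimate_recovery_group_cost(4, "0.5", 100, "0.01"))
```

Keeping the rates as parameters avoids baking assumed prices into the sketch.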

Signed-off-by: Ben Meadowcroft <ben.meadowcroft@pingcap.com>
Signed-off-by: Ben Meadowcroft <ben.meadowcroft@pingcap.com>
Labels
area/tidb-cloud This PR relates to the area of TiDB Cloud. lgtm needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. translation/no-need No need to translate this PR.

4 participants