Add a DR overview focused on resiliency with comparison for HA & DR #18490

Merged: kathancox merged 4 commits into main from dr-resiliency-comp-overview on May 28, 2024

Conversation

kathancox
Contributor

@kathancox kathancox commented Apr 18, 2024

Fixes DOC-9928, DOC-9929

This PR (in draft) adds a DR overview page to direct users toward establishing resiliency in their deployments. It is currently included as an overview page for the DR page, but there are other options.

Rendered preview

github-actions bot commented Apr 18, 2024

Files changed:

netlify bot commented Apr 18, 2024

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit b562667
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-interactivetutorials-docs/deploys/665629b5efb16b0007bf550c

netlify bot commented Apr 18, 2024

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit b562667
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-api-docs/deploys/665629b62bd57b0008cdd01c

netlify bot commented Apr 18, 2024

Netlify Preview

Name Link
🔨 Latest commit b562667
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-docs/deploys/665629b574e01700080c3fb5
😎 Deploy Preview https://deploy-preview-18490--cockroachdb-docs.netlify.app
@kathancox kathancox force-pushed the dr-resiliency-comp-overview branch from 3c7eeca to 52ebe0f Compare May 21, 2024 15:24
@kathancox kathancox marked this pull request as ready for review May 21, 2024 15:25
@kathancox kathancox requested a review from alicia-l2 May 21, 2024 15:26
@alicia-l2 alicia-l2 left a comment

Added some comments/proposed edits, thanks!

Resilient deployments aim for continuity in database operation to protect from data loss and down time. To maintain resiliency, it is necessary to build deployments with _high availability_ and _disaster recovery_ coverage.

- [High availability](#choose-a-high-availability-strategy): Continuous and uninterrupted access to data even in the presence of failures or disruptions to maximize uptime.
- [Disaster recovery](#choose-a-disaster-recovery-strategy): Recovery from a major incident or disaster to minimize downtime and data loss.

Recover instead of Recovery?

Contributor Author

Yes, like it. To match the other bullet point, I have changed so they are both verbs.


As you evaluate CockroachDB's disaster recovery features, consider your organization's requirements for the amount of tolerable data loss and the acceptable length of time to recover.

- Recovery Point Objective (RPO): The maximum amount of time that an organization can tolerate losing data.

"maximum amount of data loss – as measured by time – that an organization can tolerate." Maybe this?

Contributor Author

Added, with parentheses.

<table class="comparison-chart">
<tr>
<th></th>
<th>Single-region replication</th>

Is it worth adding "synchronous" replication here?

Contributor Author

Added to both columns here.

<b>Fault tolerance</b>
</td>
<td>Zero RPO node, availability zone failures</td>
<td>Zero RPO node, availability zone failures</td>

The multi-region one should also be able to survive a region failure

Contributor Author

Done!

<b>Fault tolerance</b>
</td>
<td>Not applicable</td>
<td>Zero RPO node, availability zone region failure with loss up to RPO</td>

nit: i think a comma is needed after 'availability zone'?

Contributor Author

Oh yeah, looks like it! Added the comma!

@kathancox kathancox requested a review from alicia-l2 May 23, 2024 14:14
@alicia-l2 alicia-l2 left a comment

Added a few more comments!


As you evaluate CockroachDB's disaster recovery features, consider your organization's requirements for the amount of tolerable data loss and the acceptable length of time to recover.

- Recovery Point Objective (RPO): The maximum amount of data loss (measured by time) that an organization can tolerate losing data.

I think "losing data" should be removed?

toc: true
---

Resilient deployments aim for continuity in database operation to protect from data loss and down time. To maintain resiliency, it is necessary to build deployments with _high availability_ and _disaster recovery_ coverage.

database operation continuity?

Contributor Author

I haven't changed this. Having "database operation" modify "continuity" feels a little harder to read. I have left as-is for now — hopefully my docs review partner may have an idea here.

No worries, sounds good!

@alicia-l2 alicia-l2 left a comment

LGTM thanks!

Contributor

@Amruta-Ranade Amruta-Ranade left a comment

A few nits, but overall LGTM!

src/current/v24.1/disaster-recovery-overview.md (4 review threads, outdated and resolved)
kathancox and others added 4 commits May 28, 2024 15:00
Add overview with comparative strategies for DR & HA
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
@kathancox kathancox force-pushed the dr-resiliency-comp-overview branch from 83cc45f to b562667 Compare May 28, 2024 19:00
@kathancox
Contributor Author

TFTRs!

@kathancox kathancox merged commit 35ed838 into main May 28, 2024
6 checks passed
@kathancox kathancox deleted the dr-resiliency-comp-overview branch May 28, 2024 19:05
3 participants