(DOC +) Add resolutions to Allocation Explain examples #108263

Merged
merged 4 commits into from May 22, 2024

Conversation

anniegale9538
Contributor

👋 Hi, team! This adds a couple of resolution steps to the Allocation Explain examples that I think could be useful for our customers!

Let me know what you think!

@anniegale9538 anniegale9538 added >enhancement >docs General docs changes :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) Team:Distributed Meta label for distributed team Team:Docs Meta label for docs team Supportability Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better. labels May 3, 2024

github-actions bot commented May 3, 2024

Documentation preview:

@elasticsearchmachine
Collaborator

@anniegale9538 please enable the option "Allow edits and access to secrets by maintainers" on your PR. For more information, see the documentation.

@elasticsearchmachine elasticsearchmachine added v8.15.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels May 3, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

docs/reference/cluster/allocation-explain.asciidoc
@@ -180,6 +180,8 @@ primary shard that was previously allocated.
----
// NOTCONSOLE

The shard in the example above errors `no_valid_shard_copy` due to `NODE_LEFT`. The first step to recover is to make sure all nodes are in the cluster. Then, if the error continues after a <<cluster-reroute,cluster reroute>>, the data will need to be <<snapshots-restore-snapshot,restored from snapshot>>.

I think this info is valuable, but I don't think that it's the appropriate location for the recovery steps. We have a page that talks about recovering lost nodes - we might consider linking to that doc, and enhancing it as needed. This way, all of the info is centralized. Let me know what you think.

👋🏽 I'm on board with either page as long as we can write actionable commentary for this response example. (I helped Annie write her first doc PR!) The Support backstory is that users search for the explicit error, and no_valid_shard_copy only shows on this page.

We could just link or also add the error's literal string on the recovery data for a lost primary shard. If we cross-link/pollinate, could y'all also kindly consider reformatting the destination so these don't look like sequential steps (which'd destroy your data right after recovering it?):
image

@stefnestor this is probably my unfamiliarity with this area of the product, but can you explain which steps shouldn't be performed sequentially? using the cluster reroute API replacing the primary with a replica?

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
@anniegale9538
Contributor Author

@DaveCTurner @shainaraskas @stefnestor I trust whatever y'all think is best practice! I'm still new to Elastic and just excited to be here and to learn to contribute in a way that will benefit our customers the most. If referencing existing documentation is preferred, I am more than okay with that!

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
@anniegale9538
Contributor Author

@shainaraskas sorry for the delay, loved the tip options, I think it's ready for you to re-review! Thanks for your patience as I'm learning!

@shainaraskas
Contributor

@anniegale9538 please wait for another review from Dave before you merge :)

Contributor

@DaveCTurner DaveCTurner left a comment

LGTM, with one optional suggestion, but no need for me to re-review either way.

docs/reference/cluster/allocation-explain.asciidoc
Co-authored-by: David Turner <david.turner@elastic.co>
@anniegale9538 anniegale9538 merged commit 582c6c0 into main May 22, 2024
6 checks passed
@anniegale9538 anniegale9538 deleted the anniegale9538-patch-1 branch May 22, 2024 20:39
stefnestor added a commit that referenced this pull request May 22, 2024
👋 @shainaraskas @DaveCTurner @anniegale9538  as follow-up to #108263, this fixes the now targeted doc to make the recovery options look like alternatives rather than sequential steps.
stefnestor added a commit that referenced this pull request May 23, 2024
* (+Doc) Recover from "no_valid_shard_copy"

👋 @shainaraskas @DaveCTurner @anniegale9538  as follow-up to #108263, this fixes the now targeted doc to make the recovery options look like alternatives rather than sequential steps.

* Apply suggestions from code review

Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@elastic.co>

---------

Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@elastic.co>