Extend log message about conflicted shards #5026

Open
sergey-safarov opened this issue Apr 3, 2024 · 3 comments

@sergey-safarov

Description

In my installation, the CouchDB logs contain messages like:

1 conflicted shard in cluster

I tried deleting old databases, but the server logs still show:

[critical] 2024-03-26T17:28:24.081510Z couchdb@db-2.example.com <0.16743.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:28:29.187245Z couchdb@db-2.example.com <0.18085.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:28:34.311446Z couchdb@db-2.example.com <0.21253.69> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:28:39.541320Z couchdb@db-2.example.com <0.31584.67> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:28:44.649152Z couchdb@db-2.example.com <0.20660.67> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:28:49.717562Z couchdb@db-2.example.com <0.23910.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:28:54.921189Z couchdb@db-2.example.com <0.9403.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:00.033066Z couchdb@db-2.example.com <0.19084.68> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:05.174271Z couchdb@db-2.example.com <0.25777.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:10.262848Z couchdb@db-2.example.com <0.30249.69> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:15.402787Z couchdb@db-2.example.com <0.27844.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:20.721704Z couchdb@db-2.example.com <0.20999.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:25.680320Z couchdb@db-2.example.com <0.100.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:30.777084Z couchdb@db-2.example.com <0.243.71> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:35.905120Z couchdb@db-2.example.com <0.23983.70> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:41.078100Z couchdb@db-2.example.com <0.4260.71> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:46.252992Z couchdb@db-2.example.com <0.5119.71> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:51.305345Z couchdb@db-2.example.com <0.6351.71> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:29:56.444482Z couchdb@db-2.example.com <0.6549.71> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:30:01.605705Z couchdb@db-2.example.com <0.5639.61> -------- 1 conflicted shard in cluster
[critical] 2024-03-26T17:30:06.766324Z couchdb@db-2.example.com <0.9687.69> -------- 1 conflicted shard in cluster

These log messages do not identify:

  1. the database with conflicts;
  2. the shard file;
  3. the conflicted revision ID.

Could you extend the log message?

Steps to Reproduce

I do not know

Expected Behaviour

Log messages should identify the conflicted database, shard, and revision ID.

Your Environment

  • CouchDB version used:
{
  "couchdb": "Welcome",
  "version": "3.3.3",
  "git_sha": "40afbcfc7",
  "uuid": "385419103339-esrp",
  "features": [
    "access-ready",
    "partitioned",
    "pluggable-storage-engines",
    "reshard",
    "scheduler"
  ],
  "vendor": {
    "name": "The Apache Software Foundation"
  }
}
  • Browser name and version:
    Not used

  • Operating system and version:

NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

Additional Context

@nickva
Contributor

nickva commented Apr 4, 2024

"Conflicted shards" means that a database metadata doc in the _dbs database has conflicts. The _dbs database is replicated in a circle (n1->n2->n3->n1), and since any node can receive database creation requests, it's possible to get conflicts in those documents. We only use the winning (latest) version of the db, but custodian (a helper background watcher service) warns the user if conflicts are present. I believe you can delete the conflicted revision on one of the nodes (via /_node/_local/_dbs/$dbname), and custodian will then stop reporting it.

The code that detects this is somewhere around here: https://github.com/apache/couchdb/blob/main/src/custodian/src/custodian_util.erl#L206-L208
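
The cleanup described above could be sketched in Python roughly as follows. This is only an illustration under assumptions, not a tested procedure: it assumes the node is reachable at a base URL such as http://127.0.0.1:5984, that the standard document query parameters `conflicts=true` and `rev=...` behave on the node-local /_node/_local/_dbs/$dbname endpoint as they do on regular documents, and it leaves out admin authentication. The function names are hypothetical and are not part of CouchDB.

```python
import json
import urllib.parse
import urllib.request


def shard_map_url(base_url, dbname, **params):
    """Build the node-local URL of a database's metadata doc in _dbs."""
    # Database names may contain "/", so the doc id is fully percent-encoded.
    doc_id = urllib.parse.quote(dbname, safe="")
    url = f"{base_url.rstrip('/')}/_node/_local/_dbs/{doc_id}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def conflict_revs(doc):
    """Return the conflicted revision ids listed in a fetched doc, if any."""
    return list(doc.get("_conflicts", []))


def fetch_conflicts(base_url, dbname):
    """Fetch the metadata doc with conflicts=true and list its conflicts.

    Assumption: admin credentials are required in practice; they are
    omitted here to keep the sketch short.
    """
    url = shard_map_url(base_url, dbname, conflicts="true")
    with urllib.request.urlopen(url) as resp:
        return conflict_revs(json.load(resp))


def delete_revision(base_url, dbname, rev):
    """Delete one specific (losing) revision of the metadata doc."""
    url = shard_map_url(base_url, dbname, rev=rev)
    req = urllib.request.Request(url, method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A possible use would be to call `fetch_conflicts(base, "mydb")`, inspect the returned revision ids by hand, and only then pass the ones to discard to `delete_revision`. Deleting the winning revision instead of a conflicted one would remove the live shard map, so checking each revision first is the safer choice.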

@sergey-safarov
Author

sergey-safarov commented Apr 5, 2024

Could you update the log message to:

1 conflicted shard in the cluster, please check document revisions for "_dbs" REST endpoint on all cluster nodes

@nickva
Contributor

nickva commented Apr 9, 2024

I think it may work better as a documentation entry than as a longer log message. Maybe a paragraph in the docs explaining what the error means, with a few examples so it's easy to search for, and how to react to it.
