Raft is implemented and works, but we lack strong observability into its internals. That makes it harder to run Weaviate in production and to get actionable information during incidents.
Observability front
Provide a dedicated set of Raft-related metrics (current leader, time between elections, local log lag, etc.)
Provide/improve internal dashboards to include these metrics
Provide/improve internal alerting based on these metrics (split-brain alert, quorum near loss, etc.)
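
To make the two example alerts concrete, here is a minimal sketch of how they could be evaluated from per-node reports. Everything in it is an assumption for illustration: `NodeReport`, `evaluate_alerts`, and the alert names are hypothetical and not part of Weaviate or any Raft library. The only fixed fact is that Raft needs a strict majority of voters, i.e. `n // 2 + 1`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeReport:
    """One node's view of the cluster (hypothetical shape, not a Weaviate API)."""
    node_id: str
    reachable: bool
    leader_id: Optional[str]  # the leader this node currently believes in


def quorum_size(voters: int) -> int:
    """Raft needs a strict majority of voters to elect a leader or commit."""
    return voters // 2 + 1


def evaluate_alerts(reports: list[NodeReport]) -> list[str]:
    """Return the names of the alerts that should fire for these reports."""
    alerts = []
    healthy = [r for r in reports if r.reachable]
    quorum = quorum_size(len(reports))

    if len(healthy) < quorum:
        # Not enough voters left to elect a leader or commit entries.
        alerts.append("quorum_lost")
    elif len(healthy) == quorum:
        # Losing one more voter would lose quorum.
        alerts.append("quorum_near_loss")

    # Reachable nodes disagreeing on the leader may indicate a split brain.
    leaders = {r.leader_id for r in healthy if r.leader_id is not None}
    if len(leaders) > 1:
        alerts.append("possible_split_brain")

    return alerts
```

Nodes can briefly disagree on the leader during a normal election, so a real split-brain alert would probably only fire if the disagreement persists for some time.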
Operations front
Provide a CLI tool to interact with a Weaviate cluster (list current nodes and Raft status, list shard distribution, list backups, etc.)
Provide the ability (using the CLI above?) to manually intervene on a Weaviate cluster and force a Raft state (force a node to become leader, add/remove nodes manually, etc.)