Inaccurate Bucket Intervals for celestia_consensus_block_interval_seconds_bucket Metric #1307

Open
jevonearth opened this issue Apr 16, 2024 · 0 comments
Labels: ice-box (issues are automatically assigned this label until they are planned), introspection, T:Bug (Type: Bug, confirmed)

Comments

Contributor

jevonearth commented Apr 16, 2024

Bug Report

Setup

CometBFT version: N/A

Have you tried the latest version: yes

ABCI app: N/A

Environment: N/A

What happened?

The block_interval_seconds metric uses the default Prometheus histogram buckets, whose largest finite upper bound is 10 seconds. These do not align with the expected block times for the Celestia blockchain: the typical block time is approximately 12 seconds, so most observations land above the largest finite bucket and the histogram reveals almost nothing about the distribution of block intervals.

What did you expect to happen?

The histogram for the block_interval_seconds metric should use buckets that align closely with the typical block times observed in Celestia, providing a detailed and accurate representation of block intervals.

How to reproduce it

  1. Observe the block_interval_seconds metric under normal operation.
  2. Note that the majority of blocks fall into the +Inf histogram bucket, and no blocks are counted in any bucket below 10.
# HELP celestia_consensus_block_interval_seconds Time between this and the last block.
# TYPE celestia_consensus_block_interval_seconds histogram
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="0.005"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="0.01"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="0.025"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="0.05"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="0.1"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="0.25"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="0.5"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="1"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="2.5"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="5"} 0
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="10"} 1028
celestia_consensus_block_interval_seconds_bucket{chain_id="celestia",version="1.7.0",le="+Inf"} 29770
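
Because Prometheus histogram buckets are cumulative, the +Inf count is the total number of observations, so the share of intervals above the largest finite bucket can be read directly from the two last series. A minimal Go sketch of that arithmetic, using the counts from the output above (`overflowShare` is an illustrative helper name, not something from the codebase):

```go
package main

import "fmt"

// overflowShare returns the fraction of observations that exceed the largest
// finite bucket. Prometheus buckets are cumulative, so the +Inf count is the
// total and the last finite bucket holds everything at or below its bound.
func overflowShare(lastFinite, total uint64) float64 {
	if total == 0 {
		return 0
	}
	return float64(total-lastFinite) / float64(total)
}

func main() {
	// Counts copied from the le="10" and le="+Inf" series above.
	const le10, leInf = 1028, 29770
	fmt.Printf("%d of %d intervals (%.1f%%) exceed 10s\n",
		leInf-le10, leInf, 100*overflowShare(le10, leInf))
}
```

With these counts, 28742 of 29770 intervals (about 96.5%) are longer than 10 seconds and are only visible in the +Inf bucket.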

Logs

n/a

dump_consensus_state output

n/a

Anything else we need to know

A trivial fix is to adjust the histogram bucket ranges so that they reflect the block times observed in Celestia. The suggested bucket configuration is:

Buckets: []float64{10, 11, 12, 13, 14, 15, 20, 25, 30, 40, 50, 60},

This setup provides finer granularity around the expected median block time of approximately 12 seconds, improving monitoring and analysis capabilities.
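
To illustrate the difference in granularity, the sketch below mimics Prometheus's cumulative `le` bucket counting in plain Go and applies it to both bucket layouts. The sample intervals are invented for illustration and are not measurements from a Celestia node, and `cumulativeCounts` is a hypothetical helper, not part of CometBFT or the Prometheus client.

```go
package main

import "fmt"

// Default Prometheus client buckets (prometheus.DefBuckets), in seconds.
var defaultBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

// Buckets proposed in this issue, in seconds.
var proposedBuckets = []float64{10, 11, 12, 13, 14, 15, 20, 25, 30, 40, 50, 60}

// cumulativeCounts returns, for each upper bound, how many observations are
// less than or equal to it; the final extra element is the +Inf bucket.
func cumulativeCounts(bounds, observations []float64) []uint64 {
	counts := make([]uint64, len(bounds)+1)
	for _, v := range observations {
		for i, le := range bounds {
			if v <= le {
				counts[i]++
			}
		}
		counts[len(bounds)]++ // +Inf counts every observation
	}
	return counts
}

func main() {
	// Hypothetical block intervals in seconds, for illustration only.
	intervals := []float64{9.8, 11.4, 11.9, 12.1, 12.3, 12.6, 13.2, 14.8, 17.5, 31.0}

	layouts := []struct {
		name   string
		bounds []float64
	}{
		{"default", defaultBuckets},
		{"proposed", proposedBuckets},
	}
	for _, layout := range layouts {
		counts := cumulativeCounts(layout.bounds, intervals)
		fmt.Println(layout.name)
		for i, le := range layout.bounds {
			fmt.Printf("  le=%g: %d\n", le, counts[i])
		}
		fmt.Printf("  le=+Inf: %d\n", counts[len(layout.bounds)])
	}
}
```

With the default layout, every sample above 10 seconds shows up only in the +Inf bucket. The proposed layout separates the same samples into one-second steps between 10 and 15 seconds.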

@evan-forbes added the T:Bug, introspection, and ice-box labels and removed the needs:triage label on May 6, 2024