Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a new check for checking the number of shards in a cluster #108

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

thomasriley
Copy link

@thomasriley thomasriley commented Jan 27, 2018

Pull Request Checklist

fixes #107

General

  • Update Changelog following the conventions laid out here

  • Update README with any necessary configuration snippets

  • Binstubs are created if needed

  • RuboCop passes

  • Existing tests pass

Purpose

Add a new check that will alert when the number of shards in a ES cluster breaches a limit.

Known Compatibility Issues

None

Copy link
Member

@majormoses majormoses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great but I do have a couple of things that I think we should improve.

# == Elastic Search Shard Limit Status
#
class ESShardMaximumLimit < Sensu::Plugin::Check::CLI
option :scheme,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to use protocol as scheme sounds more like the scheme in a metrics script. I would also accept uri_protocol or something like that to make it obvious to it's intent.

Also I would suggest including the only valid options like this:

option :protocol,
           description: 'Protocol to make requests with',
           long: '--protocol PROTO',
           default: 'http',
           in: %(http https)

description: 'Port',
short: '-p PORT',
long: '--port PORT',
default: '9200'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this a string this should be an integer.

shard_count = shard_count()
node_count = node_count()
if shard_count > (node_count * config[:node_limit])
critical "Shard count has breached the limit (#{shard_count}/#{node_count * config[:node_limit]})"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would like to specify a warning and a critical. The reason is that if we hit 1000 we have errors but I would like to have some warning at say 800 so that we can actively spot the need to escalate to get priority work done to figure out what the solution is before we have a problem.

@majormoses
Copy link
Member

@thomasriley any chance you are coming back to this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New check for number of shards for the whole cluster
2 participants