Nagios Server

Description

The nagios_server role installs and configures the Nagios monitoring system. The check configuration is generated from the Ansible inventory using various nagios_-prefixed host variables.

This role does not install a webserver; it configures the Nagios application only.

The config file templates are tightly coupled to the group names in the example inventory. If you change the group names, then you'll also need to modify the object templates.

Why Nagios?

I admit, Nagios is long in the tooth, and the interface leaves much to be desired. In my view, it has three major advantages:

  • You can just dnf install nagios, and you're ready to go. No supporting infrastructure required.

  • The configuration syntax is extremely simple, making it easy to automatically generate the config files.

  • Extending it with your own plugins is trivial: you just write a script that returns 0, 1, or 2.

You can use Nagios for metrics gathering, but it's not well-suited to the task. In this project, it's used purely for health checks.

I would have preferred to use Icinga for its slick interface, but they sadly put RPM packages behind a paywall recently.

Variables

This role accepts the following variables:

| Variable | Default | Description |
|---|---|---|
| nagios_admin_email | root@{{ email_domain }} | Administrator's email address |
| nagios_admin_pager | root@{{ email_domain }} | Administrator's "pager" (not really used) |
| nagios_access_group | role-nagios-access | FreeIPA group of users allowed to access the web interface (will be created) |
| nagios_email | root@{{ email_domain }} | Default contact email for alerts |
| nagios_reboot_window | 03:00-05:00 | Daily time period for host reboots |
| nagios_ssh_privkey | | SSH private key for nagios user |
| nagios_excluded_groups | [] | List of Ansible group names to exclude from checks |
| nagios_ssh_control_persist | 20m | Timeout of persistent SSH connection |
| nagios_snmp_max_size | 10000 | Maximum size of SNMP responses (bytes) |
| nagios_manubulon_version | master | Git version of Manubulon to install |
| nagios_check_dns | [] | DNS checks to perform (see format below) |
| nagios_connectivity_check_host | 8.8.8.8 | Host to use for upstream connectivity check |
| nagios_connectivity_check_count | 20 | Number of ICMP packets to use for connectivity check |
| nagios_connectivity_check_rtt_warn | 50.0 | Round-trip-time warning threshold for connectivity check (ms) |
| nagios_connectivity_check_rtt_crit | 100.0 | Round-trip-time critical threshold for connectivity check (ms) |
| nagios_connectivity_check_loss_warn | 5% | Packet loss warning threshold for connectivity check |
| nagios_connectivity_check_loss_crit | 20% | Packet loss critical threshold for connectivity check |
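
As a rough sketch, overriding a few of these defaults might look like the following. The variable names come from the table above; the values, the email address, and the group name are made up for illustration, and where you set them (role vars, group_vars, etc.) depends on your inventory layout.

```yaml
# Illustrative overrides only; every value here is an assumption.
nagios_admin_email: 'ops@example.com'        # hypothetical address
nagios_reboot_window: '02:00-04:00'
nagios_excluded_groups:
  - lab_hosts                                # hypothetical group name
nagios_connectivity_check_host: 1.1.1.1
nagios_connectivity_check_count: 10
nagios_connectivity_check_rtt_warn: 80.0     # ms
nagios_connectivity_check_rtt_crit: 150.0    # ms
nagios_connectivity_check_loss_warn: '10%'
nagios_connectivity_check_loss_crit: '30%'
```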

nagios_check_dns

The nagios_check_dns variable lists DNS checks to perform. It should contain a list of dictionaries of the following format:

| Variable | Default | Description |
|---|---|---|
| name | | FQDN to query |
| qtype | A | Query type |
| server | | Upstream DNS server to query |
| expect | | Expected response |
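
For instance, a list with two checks might be sketched like this. The hostnames and addresses are placeholders from the documentation ranges, and the exact form the expected response must take for non-A record types is an assumption, so check it against the generated command.

```yaml
# Placeholder names and addresses; adjust to your own zone.
nagios_check_dns:
  - name: www.example.com
    server: 8.8.8.8
    expect: 192.0.2.10        # qtype omitted, so the default (A) applies
  - name: www.example.com
    qtype: AAAA
    server: 8.8.8.8
    expect: '2001:db8::10'    # assumed to be compared like the A record
```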

Exports

This role exports the following variables:

| Variable | Description |
|---|---|
| nagios_html_dir | Nagios webroot path |
| nagios_apache_config | Apache config block for Nagios CGI application |

Host Variables

In addition to variables for the nagios_server role itself, you can set various nagios_-prefixed hostvars to influence the check behavior for each host. Defaults for these host-specific variables are set in group_vars/all/nagios.yml in the example inventory.

| Variable | Description |
|---|---|
| nagios_snmp_user | SNMPv3 username |
| nagios_snmp_community | SNMP community string |
| nagios_snmp_auth_proto | SNMPv3 authentication protocol |
| nagios_snmp_priv_proto | SNMPv3 encryption protocol |
| nagios_snmp_auth_pass | SNMPv3 authentication password |
| nagios_snmp_priv_pass | SNMPv3 encryption password |
| nagios_ping_count | ICMP packet count for hostalive check |
| nagios_ping_rtt_warn | Round-trip time warning threshold for hostalive check |
| nagios_ping_rtt_crit | Round-trip time critical threshold for hostalive check |
| nagios_ping_loss_warn | Packet loss warning threshold for hostalive check |
| nagios_ping_loss_crit | Packet loss critical threshold for hostalive check |
| nagios_temp_warn | Temperature warning threshold (C) |
| nagios_temp_crit | Temperature critical threshold (C) |
| nagios_power_draw_warn | Power draw warning threshold (%) |
| nagios_power_draw_crit | Power draw critical threshold (%) |
| nagios_load_1m_warn | 1m load average (warn) |
| nagios_load_5m_warn | 5m load average (warn) |
| nagios_load_15m_warn | 15m load average (warn) |
| nagios_load_1m_crit | 1m load average (crit) |
| nagios_load_5m_crit | 5m load average (crit) |
| nagios_load_15m_crit | 15m load average (crit) |
| nagios_mem_warn | Memory usage warning threshold (%) |
| nagios_mem_crit | Memory usage critical threshold (%) |
| nagios_swap_warn | Swap usage warning threshold (%) |
| nagios_swap_crit | Swap usage critical threshold (%) |
| nagios_interface_bandwidth_warn | Interface bandwidth warning threshold (Mbps) |
| nagios_interface_bandwidth_crit | Interface bandwidth critical threshold (Mbps) |
| nagios_interface_discard_warn | Interface discards warning threshold (per second) |
| nagios_interface_discard_crit | Interface discards critical threshold (per second) |
| nagios_interface_error_warn | Interface errors warning threshold (per second) |
| nagios_interface_error_crit | Interface errors critical threshold (per second) |
| nagios_interfaces | Per-interface threshold overrides (see format below) |
| nagios_disk_warn | Disk usage warning threshold (%) |
| nagios_disk_crit | Disk usage critical threshold (%) |
| nagios_disks | Per-filesystem threshold overrides (see format below) |
| nagios_certificate_warn | Certificate validity days remaining (warning) |
| nagios_certificate_crit | Certificate validity days remaining (critical) |
| nagios_smtp_warn | SMTP response time warning threshold (seconds) |
| nagios_smtp_crit | SMTP response time critical threshold (seconds) |
| nagios_mailq_warn | Mail queue warning size |
| nagios_mailq_crit | Mail queue critical size |
| nagios_imap_warn | IMAP response time warning threshold (seconds) |
| nagios_imap_crit | IMAP response time critical threshold (seconds) |
| nagios_http_warn | HTTP response time warning threshold (seconds) |
| nagios_http_crit | HTTP response time critical threshold (seconds) |
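
As an illustration, a busy host could relax a few of the inherited thresholds in its own host_vars file. The file path, hostname, and numbers below are hypothetical; only the variable names are taken from the table.

```yaml
# host_vars/db1.example.com.yml  (hypothetical file and host)
# Anything not set here falls back to the inventory-wide defaults.
nagios_load_1m_warn: 12
nagios_load_5m_warn: 10
nagios_load_15m_warn: 8
nagios_load_1m_crit: 20
nagios_load_5m_crit: 16
nagios_load_15m_crit: 12
nagios_mem_warn: 90            # %
nagios_mem_crit: 95            # %
nagios_certificate_warn: 14    # days remaining
nagios_certificate_crit: 7     # days remaining
```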

nagios_interfaces

The nagios_interfaces variable is used to specify check thresholds for each network interface independently. It should contain a list of dictionaries of the following format:

| Variable | Default | Description |
|---|---|---|
| name | | Interface name |
| regex | | Regular expression matching one or more interfaces |
| description | interface name | Nagios check name |
| down_ok | no | Don't alert when interface is down |
| bandwidth_warn | {{ nagios_interface_bandwidth_warn }} | Bandwidth warning threshold (Mbps) |
| bandwidth_crit | {{ nagios_interface_bandwidth_crit }} | Bandwidth critical threshold (Mbps) |
| discard_warn | {{ nagios_interface_discard_warn }} | Discard warning threshold (per second) |
| discard_crit | {{ nagios_interface_discard_crit }} | Discard critical threshold (per second) |
| error_warn | {{ nagios_interface_error_warn }} | Error warning threshold (per second) |
| error_crit | {{ nagios_interface_error_crit }} | Error critical threshold (per second) |

The nagios_interfaces variable can also contain a simple list of interface names, in which case the default check thresholds will be used.
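
A minimal sketch of both forms follows. The interface names and thresholds are invented, and it assumes each dictionary entry uses either name or regex rather than both; the regex dialect depends on the underlying check plugin.

```yaml
# Dictionary form: per-interface overrides (values are illustrative).
nagios_interfaces:
  - name: eth0
    description: uplink
    bandwidth_warn: 800        # Mbps
    bandwidth_crit: 950        # Mbps
  - regex: '^vlan[0-9]+$'      # assumed pattern syntax
    down_ok: yes               # no alert when these are down

# Simple list form: default thresholds for every listed interface.
# nagios_interfaces:
#   - eth0
#   - eth1
```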

nagios_disks

The nagios_disks variable is used to specify check thresholds for each filesystem independently. It should contain a list of dictionaries of the following format:

| Variable | Default | Description |
|---|---|---|
| path | | Path of the disk's mountpoint |
| regex | | Regular expression matching one or more mountpoints |
| description | mount path | Nagios check name |
| exclude | no | Treat mountpoint as exclusion pattern |
| terse | no | Use shorter check output |
| warn | {{ nagios_disk_warn }} | Disk usage warning threshold (%) |
| crit | {{ nagios_disk_crit }} | Disk usage critical threshold (%) |

The nagios_disks variable can also contain a simple list of mountpoints, in which case the default check thresholds will be used.
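
A sketch of a mixed configuration is shown below. The mountpoints and percentages are placeholders, and it assumes each entry uses either path or regex.

```yaml
# Dictionary form: per-filesystem overrides (values are illustrative).
nagios_disks:
  - path: /                    # uses nagios_disk_warn / nagios_disk_crit
  - path: /var
    description: var
    warn: 70                   # %
    crit: 85                   # %
  - regex: '^/srv/'            # assumed pattern syntax
    terse: yes
  - path: /mnt/scratch
    exclude: yes               # treat this mountpoint as an exclusion

# Simple list form: default thresholds for every listed mountpoint.
# nagios_disks:
#   - /
#   - /home
```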

Usage

Example playbook:

- name: configure nagios monitoring server
  hosts: nagios_servers
  roles:
    - role: nagios_server
      vars:
        nagios_check_dns:
          - name: example.com
            qtype: A
            server: 8.8.8.8
            expect: 1.2.3.4

    - role: apache_vhost
      vars:
        apache_document_root: '{{ nagios_html_dir }}'
        apache_config: '{{ nagios_apache_config }}'