Configuring Distributed Nagios with Check MK

What happens when we have several offices, clients, or datacenters, and we need to monitor each place separately?

In those scenarios, we could reach the servers through a VPN. But sometimes (most of the time) that’s not the best way to do it, or not even possible. Instead, we are going to configure several Nagios servers, each one monitoring a specific set of hosts, and centralize them on one “master” Nagios. From this master we will add new hosts and services, configure time periods, and do everything else Nagios allows, from one single interface and one single point of access to our big (or huge) infrastructure.

Right now, I have 4 Nagios servers inside the same network, but we could have them on different subnets or even reachable over the internet. We will only need access to 3 ports, so if you have a firewall between the nodes, you have to forward that traffic.

Nagios Master: 10.50.40.101
Nagios Site02: 10.50.40.102
Nagios Site03: 10.50.40.103
Nagios Site04: 10.50.40.104
Ports needed: 80 (HTTP) - 6556 (Check_MK agent) - 6557 (Livestatus socket)
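If there is a firewall between the sites, those ports have to be open toward the master on every node. As a quick sketch, assuming the nodes run firewalld (with iptables or any other firewall the idea is the same):

```shell
# Open the Check_MK agent and Livestatus ports (run on each node).
# firewalld is an assumption here; adapt to your firewall of choice.
firewall-cmd --permanent --add-port=6556/tcp   # Check_MK agent
firewall-cmd --permanent --add-port=6557/tcp   # Livestatus socket
firewall-cmd --permanent --add-service=http    # port 80, web interface
firewall-cmd --reload
```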

First, we should configure xinetd to allow remote access to the Livestatus socket. To do that, we are going to create a file in the xinetd.d directory with the following content:

vim /etc/xinetd.d/livestatus

service livestatus
{
    type            = UNLISTED
    port            = 6557
    socket_type     = stream
    protocol        = tcp
    wait            = no
    # limit to 100 connections per second. Disable for 3 secs if above.
    cps             = 100 3
    # set the maximum number of allowed parallel instances of unixcat.
    # Please make sure that this value is at least as high as
    # the number of threads defined with num_client_threads in
    # etc/mk-livestatus/nagios.cfg
    instances       = 500
    # limit the maximum number of simultaneous connections from
    # one source IP address
    per_source      = 250
    # Disable TCP delay, makes the connection more responsive
    flags           = NODELAY
    user            = nagios
    server          = /usr/bin/unixcat
    server_args     = /usr/local/nagios/var/rw/live
    # configure the IP address(es) of your Nagios master here:
    only_from       = 127.0.0.1 10.50.40.101
    disable         = no
}

And then restart the service and check that it is listening:

systemctl restart xinetd
netstat -putanl | grep xine
tcp        0      0 0.0.0.0:6556            0.0.0.0:*               LISTEN      3921/xinetd     
tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      3921/xinetd
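With xinetd listening, we can verify that the socket really answers Livestatus queries, first locally through the Unix socket and then from the master over TCP. A small sketch (the socket path and the Site02 IP are the ones from this setup; a Livestatus query is plain text: a GET line, optional header lines, and a blank line):

```shell
# Local test on the node, straight against the Unix socket:
printf 'GET status\nColumns: livestatus_version\n\n' | unixcat /usr/local/nagios/var/rw/live

# Remote test from the MASTER, through the xinetd TCP socket of Site02:
printf 'GET status\nColumns: livestatus_version\n\n' | nc -w 3 10.50.40.102 6557
```

Both commands should print the Livestatus version of the node; if the remote one hangs or fails, check the firewall and the only_from line of the xinetd file.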

Before adding a new distributed Nagios site, we should add a host pointing to the new node, which we will later set as its “Status Host”.

Distributed_Nagios_01

Distributed_Nagios_02

The “Agent type” can be the default “Check_MK_Agent” if you would like to have all the checks of the host on the MASTER, or just “No Agent”.

Once we have the host created, we have to go to the “Rulesets” to set 2 things:

  • The TCP port check for the socket (port 6557)
  • The default host check command (we are going to replace the default ping with the TCP port check)

Go to “Hosts” and then to “Rulesets”:

Distributed_Nagios_03

Distributed_Nagios_04

Here we should “Search for rules” and look for these 2 rules:

“Check connecting to a TCP port” and “Host Check Command”

We are going to create/modify both.

First, we have to add the TCP port check for the new host.

Distributed_Nagios_05

We will create a rule in the “Main directory” and set a couple of parameters for all our Nagios nodes (later we can just edit this rule and add the new nodes here):

Distributed_Nagios_06

And “Save” the changes.
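Under the hood, this ruleset makes the master run the standard check_tcp active check against port 6557 of each node. You can reproduce it by hand from the master to make sure the rule will come back OK; a sketch, assuming the usual source-install plugin path:

```shell
# Run the same TCP check the rule configures, here against Site02.
# /usr/local/nagios/libexec is an assumption; use your plugins directory.
/usr/local/nagios/libexec/check_tcp -H 10.50.40.102 -p 6557
```

On success it prints something like “TCP OK - 0.001 second response time on port 6557” and exits 0, which is what the new service will show.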

Then we will change the default check command for the host (the one which changes the status of the host to DOWN if it’s not working):

Distributed_Nagios_07

You have to verify that the “Host Check Command” is set to “Use the status of the service” with “Check_LiveStatus_Socket” as the service (the same name we used in the first rule).

Finally, we have to add the new node and set the new host as the “Status Host” of the site.

The first Site we should create is the MASTER:

Distributed_Nagios_08

And the Nodes:

Distributed_Nagios_09

Distributed_Nagios_10

Then we have to save the changes and configure the credentials for web access to the new node. To do that, we should first change the authentication method for our user on all the nodes.

We should change the user’s authentication method from “Password” to “Automation”:

Distributed_Nagios_11

Distributed_Nagios_12

And apply all the changes.

From here we can log in to or log out of our nodes:

Distributed_Nagios_13

And the configuration is synced to all the nodes:

Distributed_Nagios_14

Distributed_Nagios_15

Now, when we add a new host, we can select on which node it has to be monitored:

Distributed_Nagios_16

 


Pablo Javier Furnari

Linux System Administrator at La Plata Linux
I'm a Linux sysadmin with 8 years of experience. I work with several clients as a consultant here in Argentina and overseas (I have clients in the United States, Mexico, Pakistan and Germany).

I know my strengths and weaknesses. I'm a quick learner and I know how to work with small and big teams. I'm a hard worker, proactive, and I achieve everything I set out to do.
