Saturday 2 June 2012

How to install nagios on RHEL 6 / CentOS 6


Nagios is a powerful open-source network monitoring tool. This post explains how to install Nagios from source on RHEL 6 / CentOS 6, and how to set up the NRPE add-on so that Nagios can run checks on remote hosts.

Step 1: Install the following required packages if not present

# yum install httpd php

# yum install gcc glibc glibc-common

# yum install gd gd-devel
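
Before building from source, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch; missing_cmds is a hypothetical helper name, and it only checks for executables, not for library packages such as gd-devel.

```shell
#!/bin/sh
# Print each named command that cannot be found on the PATH.
# missing_cmds is a hypothetical helper, not part of Nagios.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Report which of the Step 1 executables are absent (no output means all found).
missing_cmds gcc httpd php
```

Anything it prints still needs to be installed with yum before continuing.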

Step 2: Create account information

Switch to the root user if you are not already root.

# su -l

Create a new nagios user account and give it a password.



# /usr/sbin/useradd -m nagios

# passwd nagios

# /usr/sbin/groupadd nagcmd

# /usr/sbin/usermod -a -G nagcmd nagios

# /usr/sbin/usermod -a -G nagcmd apache

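Create a directory for the downloads and fetch the Nagios and plugin source tarballs into it.

# mkdir -p ~/downloads

# cd ~/downloads

# wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.2.3.tar.gz

# wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.11.tar.gz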

Step 3: Untar the downloaded package and install it from source

# tar xzf nagios-3.2.3.tar.gz

# cd nagios-3.2.3

# ./configure --with-command-group=nagcmd

# make all

# make install

# make install-init

# make install-config

# make install-commandmode

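Edit the contacts file and change the email address of the nagiosadmin contact to the address where you want to receive alerts.

# vi /usr/local/nagios/etc/objects/contacts.cfg

Install the Nagios web config file for Apache.

# make install-webconf

Create a nagiosadmin account for logging in to the Nagios web interface, and remember the password you assign.

# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Restart Apache so the new settings take effect.

# service httpd restart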



Installing the Nagios plugins


# cd ~/downloads

# tar xzf nagios-plugins-1.4.11.tar.gz

# cd nagios-plugins-1.4.11

Compile and install the plugins.

# ./configure --with-nagios-user=nagios --with-nagios-group=nagios

# make

# make install

Start Nagios

Add Nagios to the list of system services and have it automatically start when the system boots.

# chkconfig --add nagios

# chkconfig nagios on

Verify the sample Nagios configuration files.

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are no errors, start Nagios.

# service nagios start


INSTALL NRPE DAEMON:


# wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nrpe-2.8.tar.gz

Extract the NRPE source code tarball.

# tar xzf nrpe-2.8.tar.gz

# cd nrpe-2.8

Compile the NRPE addon.

# ./configure

# make all

Install the NRPE plugin (for testing), daemon, and sample daemon config file.

# make install-plugin

# make install-daemon

# make install-daemon-config

Install the NRPE daemon as a service under xinetd.

# make install-xinetd

Edit the /etc/xinetd.d/nrpe file and add the IP address of the monitoring server to the only_from directive.

" only_from = 127.0.0.1 <nagios_ip_address> "

Add the following entry for the NRPE daemon to the /etc/services file.

" nrpe    5666/tcp     # NRPE "

Restart the xinetd service.

# service xinetd restart
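
The two file edits above (the only_from directive and the /etc/services entry) can also be scripted so that running them twice does no harm. This is only a sketch: allow_nrpe_from and add_nrpe_service are hypothetical helper names, GNU sed is assumed for the -i option, and 192.168.0.1 stands in for your monitoring server's address.

```shell
# Hypothetical helpers; GNU sed is assumed for "sed -i".

# Append an address to the only_from line of an xinetd service file,
# unless that address is already listed there.
allow_nrpe_from() {
    xinetd_file="$1"
    ip="$2"
    if grep '^[[:space:]]*only_from' "$xinetd_file" | grep -qwF -- "$ip"; then
        return 0
    fi
    sed -i "/^[[:space:]]*only_from/ s/[[:space:]]*\$/ $ip/" "$xinetd_file"
}

# Append the NRPE entry to a services file only if no nrpe line exists yet.
add_nrpe_service() {
    services_file="${1:-/etc/services}"
    if ! grep -q '^nrpe[[:space:]]' "$services_file"; then
        printf 'nrpe\t\t5666/tcp\t\t# NRPE\n' >> "$services_file"
    fi
}

# Example (run as root; 192.168.0.1 stands in for the monitoring server):
# allow_nrpe_from /etc/xinetd.d/nrpe 192.168.0.1
# add_nrpe_service /etc/services
# service xinetd restart
```

Restart xinetd again after using the helpers so the changes are picked up.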

It's time to see if things are working properly...

Make sure the nrpe daemon is running under xinetd.

# netstat -at | grep nrpe

The output of this command should show something like this:

" tcp   0   0   *:nrpe     *:*     LISTEN "

Next, check to make sure the NRPE daemon is functioning properly. To do this, run the check_nrpe plugin that was installed for testing purposes.

# /usr/local/nagios/libexec/check_nrpe -H localhost

You should get a string back that tells you what version of NRPE is installed, like this:

NRPE v2.8

Open firewall rules:


Make sure that the local firewall on the machine allows the NRPE daemon to be accessed from remote servers.
To do this, run the following iptables command. Note that the RH-Firewall-1-INPUT chain name is Red Hat-specific, so it will be different on other Linux distributions. If iptables reports that the chain does not exist on your system, use the INPUT chain instead.

# iptables -I RH-Firewall-1-INPUT -p tcp -m tcp --dport 5666 -j ACCEPT

Save the new iptables rule so it will survive machine reboots.

# service iptables save

Detailed documentation on installing Nagios with NRPE is available here:

Reference: http://linuxserverworld.com/how-to-install-nagios-with-nrpe-on-centos-6rhel-6/

**********************************************************************************

Customize NRPE commands

The sample NRPE config file that got installed contains several command definitions that you'll likely use to monitor this machine. The command definitions are used to (surprise) define commands that the NRPE daemon
will run to monitor local resources and services.  You can edit the command definitions, add new commands, etc, by editing the NRPE
config file:

# vim /usr/local/nagios/etc/nrpe.cfg

For the time being, I'll assume you're using the sample commands that are defined. You can test some of these
by running the following commands:

# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_users

# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_load

# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_hda1

# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_total_procs

# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_zombie_procs
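
The sample commands can also be run in one loop. Nagios plugins report their result through the exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), so the sketch below prints the state name next to each command's output. The names state_name and run_nrpe_checks are hypothetical, and the CHECK_NRPE default assumes the install prefix used in this post.

```shell
# Path to the plugin; override CHECK_NRPE if it is installed elsewhere
# (assumption: the default matches the prefix used in this post).
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# Translate a plugin exit status into the Nagios state name.
# Anything other than 0, 1 or 2 is reported as UNKNOWN.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run each named NRPE command against a host and print its state and output.
# run_nrpe_checks is a hypothetical helper name.
run_nrpe_checks() {
    host="$1"; shift
    for check in "$@"; do
        output=$("$CHECK_NRPE" -H "$host" -c "$check" 2>&1)
        status=$?
        echo "$check: $(state_name "$status") - $output"
    done
}

# Example:
# run_nrpe_checks localhost check_users check_load check_hda1 check_total_procs check_zombie_procs
```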


TROUBLESHOOTING
Here are some tips for troubleshooting some of the more common errors with the NRPE addon. If you encounter
problems that aren't covered here, send an email to the nagios-users mailing list. More information on the mailing
lists can be found at: http://www.nagios.org/support/

The check_nrpe plugin returns

                 "CHECK_NRPE: Socket timeout after 10 seconds" or "Connection refused or timed out"

This error can indicate several things:

The command that the NRPE daemon was asked to run took longer than 10 seconds to execute. This is the
most likely cause if the error message was "CHECK_NRPE: Socket timeout after 10 seconds". Use the -t
command line option to specify a longer timeout for the check_nrpe plugin. The following example will
increase the timeout to 30 seconds:
                 
                 # /usr/local/nagios/libexec/check_nrpe -H localhost -c somecommand -t 30

The NRPE daemon is not installed or running on the remote host. Verify that the NRPE daemon is running as
a standalone daemon or under inetd/xinetd with one of the following commands:


                 # ps axuw | grep nrpe

                 # netstat -at | grep nrpe


There is a firewall that is blocking the communication between the monitoring host (which runs the check_nrpe
plugin) and the remote host (which runs the NRPE daemon). Verify that the firewall rules (e.g. iptables) that
are running on the remote host allow for communication and make sure there isn't a physical firewall that is
located between the monitoring host and the remote host.

The check_nrpe plugin returns

                  "CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for an error message."

First thing you should do is check the remote server logs for an error message. Seriously. :-) This error could be
due to the following problem:


The check_nrpe plugin was unable to complete an SSL handshake with the NRPE daemon. An error
message in the logs should indicate whether or not this was the case. Check the versions of OpenSSL that
are installed on the monitoring host and remote host. If you're running a commercial version of SSL on the
remote host, there might be some compatibility problems.

The check_nrpe plugin returns

                  "NRPE: Unable to read output"

This error indicates that the command that was run by the NRPE daemon did not return any character output.
This could be an indication of the following problems:
– An incorrectly defined command line in the command definition. Verify that the command definition in your
   NRPE configuration file is correct.
– The plugin that is specified in the command line is malfunctioning. Run the command line manually to make
   sure the plugin returns some kind of text output.
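
One way to do the manual run described above is a small wrapper that says whether the plugin printed anything. This is a sketch only; has_output is a hypothetical helper name, and the plugin command line is passed as its arguments.

```shell
# Run a command line and report whether it printed anything on stdout.
# has_output is a hypothetical helper; it returns 1 when there is no output.
has_output() {
    out=$("$@" 2>/dev/null)
    if [ -n "$out" ]; then
        echo "output: $out"
    else
        echo "no output"
        return 1
    fi
}

# Example, run on the remote host as the nagios user:
# has_output /usr/local/nagios/libexec/check_swap -w 20% -c 10%
```

If it reports "no output", the problem is in the plugin or its arguments rather than in NRPE itself.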

The check_nrpe plugin returns

                  "NRPE: Command 'x' not defined"

This error means that you didn't define command x in the NRPE configuration file on the remote host. On the
remote host, add the command definition for x. See the existing command definitions in the NRPE configuration
file for more information on doing this. If you're running the NRPE daemon as a standalone daemon (and not
under inetd or xinetd), you'll need to restart it in order for the new command to be recognized.
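
To see which commands the remote host actually defines, the names can be pulled out of the NRPE configuration file. A minimal sketch, assuming the usual command[name]=command_line syntax; list_nrpe_commands is a hypothetical helper name.

```shell
# Print the names of all commands defined in an NRPE config file.
# Matching lines look like: command[check_load]=/path/to/plugin args
# Commented-out definitions (lines starting with #) are ignored.
list_nrpe_commands() {
    sed -n 's/^[[:space:]]*command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Example:
# list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```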

The check_nrpe plugin returns

                 "NRPE: Command timed out after x seconds"

This error indicates that the command that was run by the NRPE daemon did not finish executing within the
specified time. You can increase the timeout for commands by editing the NRPE configuration file and changing
the value of the command_timeout variable. If you're running the NRPE daemon as a standalone daemon (and
not under inetd or xinetd), you'll need to restart it in order for the new timeout to be recognized.

How to go about debugging other problems...

When debugging problems it may be useful to edit the NRPE configuration file and change the debug=0 entry to
debug=1. Once you do that, restart the NRPE daemon if it is running as a standalone daemon. After you try using
the check_nrpe plugin again, you should be able to see some debugging information in the log files of the remote
host. Check your logs carefully – they should be able to help provide clues as to where the problem lies...
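
Switching the debug entry on and off can be done with a one-line sed. A sketch, assuming GNU sed and an existing debug= line in the file; set_nrpe_debug is a hypothetical helper name.

```shell
# Set debug=0 or debug=1 in an NRPE config file (GNU sed assumed for -i).
# set_nrpe_debug is a hypothetical helper; it only rewrites an existing
# "debug=" line and does nothing if the file has none.
set_nrpe_debug() {
    cfg="$1"
    value="$2"
    sed -i "s/^debug=.*/debug=$value/" "$cfg"
}

# Example (run as root on the remote host):
# set_nrpe_debug /usr/local/nagios/etc/nrpe.cfg 1
```

Remember to set it back to 0 once the problem is found, since debug logging is verbose.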



*******************************************************************************


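Problem:

If check_nrpe cannot reach the daemon, recheck the following in order: the xinetd entry, the /etc/services entry, and whether the daemon is listening.

# vim /etc/xinetd.d/nrpe

# vim /etc/services
nrpe 5666/tcp

# /usr/local/nagios/libexec/check_nrpe -H localhost

# ps aux | grep nrpe

# netstat -at | grep nrpe

# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_load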



*************************************************************************************

configuration:

Test communication with the NRPE daemon

# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.1

COMMAND CONFIGURATION:

vim /usr/local/nagios/etc/commands.cfg:

define command{
command_name   check_nrpe
command_line   $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}


Create host and service definitions:

HOST DEFINITION:

vim /usr/local/nagios/etc/hosts.cfg:

define host{
use              linux-box
host_name        remotehost
alias            Fedora Core 6
address          192.168.0.1
}


SERVICE DEFINITION:

vim /usr/local/nagios/etc/services.cfg:

define service{
use                   generic-service
host_name             remotehost
service_description   CPU Load
check_command         check_nrpe!check_load
}
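
To make the link between the service's check_command and the command definition concrete, the sketch below mimics how check_nrpe!check_load turns into a command line: the text after the "!" becomes $ARG1$. It is an illustration only, not how Nagios itself is implemented; expand_check_nrpe is a hypothetical name, and the $USER1$ value is assumed to be the plugin directory used in this post.

```shell
# Expand a check_command such as "check_nrpe!check_load" into the command
# line given by the check_nrpe command definition. Only the USER1,
# HOSTADDRESS and ARG1 macros are handled.
expand_check_nrpe() {
    check_command="$1"                        # e.g. check_nrpe!check_load
    host_address="$2"                         # e.g. 192.168.0.1
    user1="${3:-/usr/local/nagios/libexec}"   # assumed value of the USER1 macro
    arg1="${check_command#*!}"                # text after the first "!"
    echo "$user1/check_nrpe -H $host_address -c $arg1"
}

expand_check_nrpe 'check_nrpe!check_load' 192.168.0.1
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.0.1 -c check_load
```

Running the printed command by hand on the monitoring server is a quick way to check a service definition before reloading Nagios.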


Remote Host Configuration:

vi /usr/local/nagios/etc/nrpe.cfg

    command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%

Comments:

  1. Awesome .... good explanation :)
    thank you for the post

  2. Internal Server Error

    The server encountered an internal error or misconfiguration and was unable to complete your request.

    Please contact the server administrator, root@localhost and inform them of the time the error occurred, and anything you might have done that may have caused the error.

    More information about this error may be available in the server error log.
    Apache/2.2.15 (Red Hat) Server at localhost Port 80

  3. Internal Server Error

    The server encountered an internal error or misconfiguration and was unable to complete your request.

    Please contact the server administrator, root@localhost and inform them of the time the error occurred, and anything you might have done that may have caused the error.
    More information about this error may be available in the server error log.
    Apache/2.2.15 (Red Hat) Server at localhost Port 80

    Could someone help me out with this? When I try to access the dashboard, it throws the above error.

    Replies
    1. Many guys online are saying this worked for them. Try it

      Put SELinux into Permissive mode.

      # setenforce 0
