Remote Server Monitoring using phpSysInfo XML – SysInfoRM Concept

By Mark Davidson on February 10th, 2010

Recently I have been evaluating solutions for monitoring remote servers and raising warnings or alerts when certain metrics cross a threshold. I looked at Server Density, Cloudkick, Cacti, Nagios and a few others. Each has advantages and disadvantages, which I briefly run through below.

Server Density
The Good

  • iPhone Application
  • Push Notifications for iPhone Application
  • Easy to Deploy Agent

The Bad

  • Costs Per Server for Full Version
  • Limited Services it can monitor.

Cloudkick
The Good

  • Incredibly easy to set up monitoring of multiple hosts if you're with one of the supported providers. I was using vps.net when I tried it out and it worked very well.
  • Very good metric monitoring.
  • Very nice interface.

The Bad

  • Limited to monitoring of supported providers.
  • Need to hand over API key for it to work.
  • In my opinion incredibly expensive.

Cacti
The Good

  • SNMP Integration
  • Great Interface
  • Very good metric support
  • Cost – Free

The Bad

  • Not really a remote monitoring solution, as it does not provide alerting. I know it was never designed to be one, but it definitely has the potential to be expanded into one.

Nagios
The Good

  • Highly Configurable
  • Support for Custom Monitoring Scripts
  • Great Alerting Configuration
  • Cost – Free

The Bad

  • Complicated to configure even for basic monitoring
  • Runs on a single host, meaning all network monitoring is lost if that host goes down.

Since none of the above solutions exactly suits my needs, I have decided to produce my own monitoring solution. To do this I am going to take advantage of the fact that phpSysInfo provides the majority of the statistics that I wish to monitor, and I already run it on most of my servers. phpSysInfo can supply its data in XML form, and this is the data I can use for monitoring.
So I plan to develop my own solution in Python that reads the XML data from multiple remote hosts and then takes a defined action if a rule is matched. Since I needed a name for this project, I decided to call it SysInfoRM, or System Information Remote Monitor.

Here are the basic planned features for now:

Initial Release Features for SysInfoRM

  • Parsing XML from one or more hosts (1 to n), supplied by either phpSysInfo or pySysInfo (pySysInfo can supply data in the same XML format and should be good for remote monitoring of non-web hosts).
  • Ability to define monitoring rules and actions for when they are matched.
  • Easy / Understandable Configuration.
  • Heartbeat monitoring of the script itself. Basically, run the script in two locations; if the primary fails, the other should take over the monitoring.
  • Can be run as a daemon or as an interactive script.
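
The rule-matching idea in the list above can be sketched roughly as follows. This is a minimal illustration, not SysInfoRM itself: the XML layout, the element and attribute names (Vitals, LoadAvg, Memory, Percent), the Rule tuple and the check_host helper are all assumptions made for the example, and the real phpSysInfo output may be structured differently.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# Hypothetical sample document; the real phpSysInfo XML may differ.
SAMPLE_XML = """<sysinfo>
  <Vitals Hostname="server01" LoadAvg="2.75 1.10 0.80"/>
  <Memory Percent="91"/>
</sysinfo>"""

# A rule names a metric, how to extract it, a threshold, and the action
# to run when the extracted value exceeds the threshold.
Rule = namedtuple("Rule", "name extract threshold action")


def load_1min(root):
    """Return the 1-minute load average from the assumed Vitals element."""
    return float(root.find("Vitals").get("LoadAvg").split()[0])


def memory_percent(root):
    """Return the used-memory percentage from the assumed Memory element."""
    return float(root.find("Memory").get("Percent"))


def check_host(xml_text, rules):
    """Parse one host's XML and run the action of every rule that matches."""
    root = ET.fromstring(xml_text)
    matched = []
    for rule in rules:
        value = rule.extract(root)
        if value > rule.threshold:
            rule.action(rule.name, value)
            matched.append(rule.name)
    return matched


# Example actions simply record the alert; a real action might send mail.
alerts = []
rules = [
    Rule("load", load_1min, 2.0, lambda n, v: alerts.append((n, v))),
    Rule("memory", memory_percent, 95.0, lambda n, v: alerts.append((n, v))),
]
matched = check_host(SAMPLE_XML, rules)
```

With the sample data only the load rule fires, since memory use is below its threshold. In a real run the XML would be fetched from each monitored host rather than taken from an inline string.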

Future Planned Features for SysInfoRM

  • Curses interface for viewing and configuration.
  • Web interface for viewing and configuration.

Python Libraries

  • urllib2
  • smtplib
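
As a rough sketch of how these two libraries might fit together, one function can fetch a host's XML and another can build and send an alert email. urllib2 is the Python 2 module; the sketch uses its Python 3 successor, urllib.request, so that it runs on a current interpreter. The function names, addresses and SMTP host are placeholders invented for the example.

```python
import smtplib
import urllib.request
from email.message import EmailMessage


def fetch_xml(url, timeout=10):
    """Download the XML report from one monitored host (URL is a placeholder)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def build_alert(host, metric, value, sender, recipient):
    """Build the alert email; kept separate from sending so it can be tested."""
    msg = EmailMessage()
    msg["Subject"] = "[SysInfoRM] %s: %s is %s" % (host, metric, value)
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("Rule matched on %s: %s = %s" % (host, metric, value))
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Hand the message to an SMTP server (assumed to listen on localhost)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)


# Build (but do not send) an example alert with made-up addresses.
example = build_alert("server01.example.co.uk", "load", 2.75,
                      "monitor@example.co.uk", "admin@example.co.uk")
```

Neither fetch_xml nor send_alert is called here, so the sketch touches no network; only the message construction is exercised.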

I am hoping to have something working in the next couple of days; I just need to get my Python skills up to scratch. I am going to attempt to incorporate the best bits of the currently available monitoring tools and as few of the bad bits as possible. I should also mention that I was inspired by, and am drawing on, another good monitoring script from the tomubuntu blog. Tom has done a really good job with it and I suggest people check it out: it is a good, simple script that can monitor load average, memory and other stats, and it can then alert using sSMTP.

That is all for now. Please check back soon: I’ll be providing updates on the progress of SysInfoRM as I develop it, and will post the SVN or Git repository once I get one set up.

Gentoo & Nagios Configuration for Basic Remote Host Monitoring

By Mark Davidson on February 7th, 2010

Nagios is a very powerful monitoring solution that can be used to monitor the status of hosts and servers. This post is going to cover a basic setup of Nagios under Gentoo and configuring it to monitor the status of remote hosts.

First, add these lines to /etc/portage/package.use:

net-analyzer/nagios-plugins nagios-dns nagios-ping nagios-ssh
net-analyzer/nagios-core vim-syntax
media-libs/gd jpeg png # You may need this line as well if your GD isn't already compiled with jpeg and png support.

Then emerge nagios

sudo emerge nagios
sudo chmod +x /etc/nagios/ # You don't have to do this but lets you ls the dir because permissions are a bit strict by default

Now that Nagios has been installed, the next step is to enable it under Apache. Edit /etc/conf.d/apache2 and add “-D NAGIOS” to APACHE2_OPTS:

APACHE2_OPTS="-D DEFAULT_VHOST -D INFO -D SSL -D SSL_DEFAULT_VHOST -D LANGUAGE -D SECURITY -D PHP5 -D STATUS -D INFO -D NAGIOS"

After doing so, create a new .htaccess file in /usr/share/nagios/htdocs/ containing the following:

AuthName "Nagios Access"
AuthType Basic
AuthUserFile /etc/nagios/auth.users
Require valid-user

Make a copy of the file at /usr/lib/nagios/cgi-bin/.htaccess:

sudo cp /usr/share/nagios/htdocs/.htaccess /usr/lib/nagios/cgi-bin/.htaccess

Next, create the htpasswd file and restart Apache:

sudo htpasswd2 -c /etc/nagios/auth.users nagiosadmin
sudo apache2ctl restart

Nagios should now be configured and monitoring localhost with a number of checks. To check that it is working, simply visit http://yourdomain.com/nagios/ and click the service details link in the menu. Provided everything is working, you should see service details and other status information for localhost.

Provided everything went well, we can now start monitoring some hosts remotely. There are many ways of doing so with Nagios, and I will cover some of them in a later tutorial, but for now I will simply explain how to set up checks for PING, SSH and HTTP against a host.

Edit the /etc/nagios/nagios.cfg file and add this line anywhere below the log_file line:

cfg_dir=/etc/nagios/servers

Next you need to create the directory /etc/nagios/servers and set it to be owned by nagios:

sudo mkdir /etc/nagios/servers
sudo chown nagios:nagios /etc/nagios/servers

Now create a new file in /etc/nagios/servers named yourdomain.com.cfg and begin editing it. Add the following to the file, then save and exit.

define host {
    use                     linux-server
    host_name               server01.example.co.uk ; Change this to your domain
    address                 83.XXX.XXX.XXX ; Change this to the IP of your domain
}

define service {
    use                     generic-service
    host_name               server01.example.co.uk ; Change this to your domain as above
    service_description     PING
    is_volatile             0
    check_period            24x7
    max_check_attempts      3
    normal_check_interval   5
    retry_check_interval    1
    notification_interval   240
    notification_period     24x7
    notification_options    w,u,c,r
    check_command           check_ping!100.0,20%!500.0,60%
}

define service {
    use                     generic-service
    host_name               server01.example.co.uk ; Change this to your domain as above
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
}

define service {
    use                     generic-service
    host_name               server01.example.co.uk ; Change this to your domain as above
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
}

Repeat this step for each of your hosts, then restart Nagios:

sudo /etc/init.d/nagios restart
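
If there are many hosts, repeating the definitions by hand gets tedious. One option is to generate the files; the sketch below renders a host definition plus the SSH and HTTP service checks shown above for a list of hosts. The render_host_cfg helper, the templates and the host list are hypothetical, the PING service is left out because it carries extra directives, and the output is printed rather than written so nothing under /etc/nagios/servers is touched.

```python
# Templates mirror the definitions in the post; doubled braces are
# literal braces for str.format.
HOST_TEMPLATE = """define host {{
    use                     linux-server
    host_name               {name}
    address                 {address}
}}
"""

SERVICE_TEMPLATE = """define service {{
    use                     generic-service
    host_name               {name}
    service_description     {description}
    check_command           {command}
    notifications_enabled   0
}}
"""

# Checks taken from the example config: (service_description, check_command).
SERVICES = [("SSH", "check_ssh"), ("HTTP", "check_http")]


def render_host_cfg(name, address):
    """Return the text of one per-host config file."""
    parts = [HOST_TEMPLATE.format(name=name, address=address)]
    for description, command in SERVICES:
        parts.append(SERVICE_TEMPLATE.format(
            name=name, description=description, command=command))
    return "\n".join(parts)


# Hypothetical hosts; the addresses come from a documentation-only range.
hosts = {"server01.example.co.uk": "192.0.2.10",
         "server02.example.co.uk": "192.0.2.11"}

# Map each target filename to its rendered contents.
configs = {name + ".cfg": render_host_cfg(name, addr)
           for name, addr in hosts.items()}

for filename in sorted(configs):
    print("# " + filename)
    print(configs[filename])
```

The printed text could be reviewed and then saved into /etc/nagios/servers before restarting Nagios as above.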

Finally, visit http://yourdomain.com/nagios/ and click on the service details link again; you should see all your servers with status reports for the PING, HTTP and SSH monitoring.

That's it for now. If you have any problems or questions, let me know. I plan on covering the subject in more detail in a future post.

In the meantime, more details can be found at http://www.gentoo.org/doc/en/nagios-guide.xml and http://www.debian-administration.org/article/Using_Nagios_to_Monitor_Networks