Monitor ESX datastores using Hostmonitor

A lot of customers I have worked with use Advanced Hostmonitor from KS-Soft to monitor their Windows environment, and so does the customer I am currently working for. Advanced Hostmonitor is a simple and cheap monitoring solution for Windows that also supports UNIX/Linux systems through Remote Monitoring Agents (RMA).

My current customer has 50+ ESX servers in production, and I found out that the free space of the datastores reported by VMware Virtual Center is not always correct (checked up to VC 2.5 Update 2).
A couple of years ago, when ESX 2.5 came out, I installed an ESX server for the company I am currently employed at and investigated using Hostmonitor to monitor it. So I picked up my old documentation and tried to use Hostmonitor to monitor the free space in our datastores on ESX 3.0.1 and ESX 3.5.

First we need to install the RMA for Linux in the ESX console.
(Before installing anything in the ESX service console, first increase the service console memory. I set the memory of the service console to the maximum of 800MB but 512MB will be sufficient.)

  • Create the directory /opt/rma
  • untar the rma125_lini32.tgz file in this directory using the command tar -xzvf rma125_lini32.tgz
  • Edit the rma.ini file and change the following parameters:
    RmaPath=/opt/rma/rma
    TmpDir=/opt/rma
    Password=<RMApassword>
    FilterList=127.0.0.1 <hostmonitor ip>
    UpdateAddr=127.0.0.1, <hostmonitor ip>
    ManageAddr=127.0.0.1, <hostmonitor ip>
  • Open the ESX firewall using esxcfg-firewall -o 1055.tcp,in,rma
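
If you have to repeat the rma.ini changes on many hosts, they can be scripted. The following is only a sketch: set_ini_key is a hypothetical helper name, and it assumes rma.ini consists of plain KEY=value lines like the parameters listed above. Check the resulting file by hand before starting the agent.

```shell
#!/bin/sh
# set_ini_key FILE KEY VALUE
# Rewrite an existing "KEY=..." line in FILE, or append the line if KEY is absent.
# Assumption: the file uses plain KEY=value lines (no sections, no quoting).
set_ini_key() {
    ini_file=$1
    ini_key=$2
    ini_value=$3
    ini_tmp="${ini_file}.new.$$"
    if grep -q "^${ini_key}=" "$ini_file"; then
        awk -F= -v k="$ini_key" -v v="$ini_value" '
            $1 == k { print k "=" v; next }   # rewrite the matching line
            { print }                          # copy every other line unchanged
        ' "$ini_file" > "$ini_tmp" &&
            cat "$ini_tmp" > "$ini_file" &&    # keep the original file and its permissions
            rm -f "$ini_tmp"
    else
        printf '%s=%s\n' "$ini_key" "$ini_value" >> "$ini_file"
    fi
}

# Hypothetical usage (192.0.2.10 stands in for the Hostmonitor server address):
#   set_ini_key /opt/rma/rma.ini RmaPath /opt/rma/rma
#   set_ini_key /opt/rma/rma.ini TmpDir /opt/rma
#   set_ini_key /opt/rma/rma.ini ManageAddr "127.0.0.1, 192.0.2.10"
```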

We now need to register rma as a service in the Service Console.

  • Create a file named rma in /etc/init.d and paste the following code in it:
    #!/bin/bash
    # chkconfig: 2345 20 80
    # description: Hostmonitor RMA
    
    RMA_PWD="/opt/rma"
    RMA_BIN="${RMA_PWD}/rma"
    RMA_INI="${RMA_PWD}/rma.ini"
    
    case "$1" in
            start)
                    echo -n "Starting Hostmonitor RMA: "
                    ${RMA_BIN} -d ${RMA_INI} &>/dev/null
                    echo "done"
                    ;;
            stop)
                    echo -n "Stopping Hostmonitor RMA: "
                    PID="$(ps ax | grep ${RMA_BIN} | grep -v grep | awk '{print $1}')"
                    if [ x"${PID}" != x"" ]; then
                            kill ${PID} &>/dev/null
                            sleep 1
                            kill -9 ${PID} &>/dev/null
                    fi
                    echo "done"
                    ;;
            restart)
                    $0 stop
                    sleep 1
                    $0 start
                    ;;
            *)
                    echo "Usage: `basename $0` {start|stop|restart}"
                    exit 1
                    ;;
    esac
    exit 0
  • Make the file executable using chmod a+x rma
  • Now register the rma service using chkconfig --add rma
  • We can now start the rma service using service rma start

Now open the RMA Manager on the Hostmonitor server and add the RMA.
image_rma_001
For the password, enter the password you defined in the rma.ini file.
If you have ESX clusters defined, you only need one RMA per ESX cluster, since the storage is shared. Optionally, you can add a second RMA to the cluster as a backup agent.

We now have everything in place for Hostmonitor to get onto the ESX server. Let’s get to the fun part and define the test for the datastore.
First we need to define the script to use in the script manager.

  • Open Hostmonitor and open the script manager (go to Test->New->Shell Script Test->Script Manager…)
  • Click New and fill in the appropriate fields.
    For Platform select Linux.
    For Start cmd enter %Script% %Params%
  • Paste the following code in the script field:
    #!/bin/sh
    if [ $# -ge 2 ]
    then
      OS=`uname`
      case $OS in
        Linux)  LUNFREE=`vdf -h | grep $1 | awk '{print $3}'`
                LASTCHAR=${LUNFREE#${LUNFREE%?}}
                LUNFREE=${LUNFREE%?}
                case $LASTCHAR in
                  T)  LUNFREE=`echo $LUNFREE | awk '{printf "%d",$1 * 1024}'`
                      echo "ScriptRes:Ok:"$LUNFREE;;
                  G)  LUNFREE=`echo $LUNFREE | awk '{printf "%d",$1}'`
                      if [ $LUNFREE -lt $2 ]
                      then
                        echo "ScriptRes:Bad:"$LUNFREE
                      else
                        echo "ScriptRes:Ok:"$LUNFREE
                      fi;;
                  M)  LUNFREE=`echo $LUNFREE | awk '{printf "%d",$1 / 1024}'`
                      echo "ScriptRes:Bad:"$LUNFREE;;
                  *)  echo "ScriptRes:Bad:Script Error";;
                esac;;
        *) echo 'ScriptRes:Unknown:script is not designed for '$OS
           exit;;
      esac
    else
      echo 'ScriptRes:Unknown:not enough parameters specified'
    fi
  • Everything should look like this
    image_rma_003
  • Click OK to save the changes.
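
The unit handling in the script can be tried out on its own before you paste anything into the script manager. The sketch below isolates that step: to_gb is a hypothetical helper name (not part of Hostmonitor), and it applies the same conversions as the script, truncating to whole gigabytes.

```shell
#!/bin/sh
# to_gb SIZE
# Convert a "vdf -h" style size such as 1.5T, 399G or 69M to whole gigabytes.
# Returns 1 and prints nothing for any other suffix.
to_gb() {
    size=$1
    unit=${size#"${size%?}"}   # last character: T, G or M
    num=${size%?}              # the number in front of it
    case $unit in
        T) echo "$num" | awk '{ printf "%d\n", $1 * 1024 }' ;;
        G) echo "$num" | awk '{ printf "%d\n", $1 }' ;;
        M) echo "$num" | awk '{ printf "%d\n", $1 / 1024 }' ;;
        *) return 1 ;;
    esac
}

# Example calls:
#   to_gb 1.5T   prints 1536
#   to_gb 399G   prints 399
#   to_gb 69M    prints 0
```

Because the result is truncated, anything below 1GB becomes 0. This matches the script, which reports a datastore with only megabytes free as Bad regardless of the threshold.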

Now we can add the actual test in Hostmonitor.

  • Go to Test->New->Shell Script Test and select the script we’ve just added. Mine is called ESX Storage:
  • For the script parameters enter the name of the datastore and the test value in GB.
    image_rma_005 
    By default the script time-out is set to 15 seconds, but on large clusters the script can take some time to complete, especially when multiple datastore tests are performed on the same server.
    I set the time-out to 120 seconds because we have many datastores on the clusters.
  • We’re all done. Click OK to save your test.
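
One thing to keep in mind when choosing the datastore name parameter: the script selects the line with grep, which is a substring match, so a name like VMFS1 would also match a datastore called VMFS10. The sketch below shows one possible alternative that matches the mount point exactly. avail_for is a hypothetical helper name, and it assumes datastore names without spaces and the vdf -h column layout shown in comment #1 below.

```shell
#!/bin/sh
# avail_for NAME
# Read "vdf -h" output on stdin and print the Avail column of the datastore
# mounted at /vmfs/volumes/NAME. The mount point is the last field and Avail
# sits two fields before it, which holds for wrapped and unwrapped lines alike.
avail_for() {
    awk -v mnt="/vmfs/volumes/$1" 'NF >= 3 && $NF == mnt { print $(NF - 2) }'
}

# Intended use on the service console:
#   vdf -h | avail_for VMFS1
```

With the vdf -h output from comment #1, this prints 399G for VMFS1 and 135G for VMFS2.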

When everything is fine it should look like this.
image_rma_007
So it looks like I have 22GB free on datastore-001.

When the free space drops below 20GB, the test result becomes Bad and the test turns red on the console. You can add actions for that case, such as sending an e-mail or SMS, but that is beyond the scope of this post.
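
To make the threshold behaviour concrete, here is a small sketch of the comparison on its own. The function name report is hypothetical; the output lines follow the ScriptRes format used by the script above.

```shell
#!/bin/sh
# report FREE_GB THRESHOLD_GB
# Print a Bad result when the free space is below the threshold, Ok otherwise.
report() {
    if [ "$1" -lt "$2" ]; then
        echo "ScriptRes:Bad:$1"
    else
        echo "ScriptRes:Ok:$1"
    fi
}

# Example calls:
#   report 22 20   prints ScriptRes:Ok:22
#   report 19 20   prints ScriptRes:Bad:19
```

The comparison is strictly "less than", so a datastore with exactly 20GB free still reports Ok.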


6 Comments on “Monitor ESX datastores using Hostmonitor”

  1. #1 ludo
    on Nov 30th, 2009 at 5:46 pm

    Hi,
    I have tested your script on ESX 3.5 update 4 but i have this error :
    [root@frdev-esx1 root]# ./test.sh VMFS1 20
    awk: cmd. line:1:
    awk: cmd. line:1: ^ invalid char ” in expression
    ScriptRes:Bad:Script Error

    this is a result of ‘vdf -h’ :
    [root@frdev-esx1 root]# vdf -h
    Filesystem Size Used Avail Use% Mounted on
    /dev/sda2 4.9G 1.6G 3.0G 35% /
    /dev/sda1 99M 26M 69M 28% /boot
    none 132M 0 132M 0% /dev/shm
    /dev/sda6 2.0G 139M 1.7G 8% /var/log
    /vmfs/devices 10.0T 0 10.0T 0% /vmfs/devices
    /vmfs/volumes/4979def8-2a5d1b90-1703-001aa02cd64e
    264G 30G 233G 11% /vmfs/volumes/frdev-esx1:storage1
    /vmfs/volumes/49cd44b3-9bad0908-dce0-001aa02cd64e
    1.5T 1.1T 399G 73% /vmfs/volumes/VMFS1
    /vmfs/volumes/49cd48d3-49534128-0583-001aa02cd64e
    1.6T 1.5T 135G 91% /vmfs/volumes/VMFS2

    please can you help me ?

  2. #2 Arnim van Lieshout
    on Nov 30th, 2009 at 11:20 pm

    Hi Ludo,

    It looks like the code gets crippled during the copy & paste process.
    I’ve updated my post. You can now safely copy & paste the code.
    Also don’t forget to update the rma service script in /etc/init.d if you have used this.

    I run the same code on ESX3.5 U4 successfully, so there’s nothing wrong with the code itself.
    Let me know if it solved your problem.

    -Arnim

  3. #3 Frederik Eggers
    on May 31st, 2011 at 4:25 pm

    Hi,

    Is it possible to use this how-to for an ESX 4 server?

    Frederik

  4. #4 Arnim van Lieshout
    on Jun 8th, 2011 at 10:38 am

    I think this should work for ESX4 too. Can’t test this myself now. Just give it a try.
    Possibly you’ll need to modify the monitor script.
    However it will only work on classic ESX and not on ESXi as you’re unable to install the RMA on ESXi.

    Keep in mind that ESX4.1 is the last version of classic ESX available. All new versions will only be ESXi hypervisor.

  5. #5 klinok
    on Jun 15th, 2011 at 9:51 am

    Thanks!!! all works, i use it for test linux volume: /
    but with a little change for RedHat:
    Linux) LUNFREE=`df -h | grep $1 | awk '{print $3}' | head -2`
    G) LUNFREE=`echo $LUNFREE | awk '{printf "%d",$1 * 1024}'`

  6. #6 Arnim van Lieshout
    on Jun 20th, 2011 at 10:11 pm

    Thanks for sharing.
