5 messages in net.sourceforge.lists.opennms-discuss
Re: [opennms-discuss] Monitor and con...

From             Sent On
Mark Wolek       Feb 26, 2008 11:28 am
Neil Watson      Feb 27, 2008 9:12 am
Byron Anderson   Feb 27, 2008 9:18 am
Andrea           Feb 27, 2008 12:26 pm
Neil Watson      Feb 27, 2008 12:32 pm
Subject: Re: [opennms-discuss] Monitor and control a process?
From: Andrea (arus@comune.modena.it)
Date: Feb 27, 2008 12:26:41 pm
List: net.sourceforge.lists.opennms-discuss

Neil Watson wrote:

On Tue, Feb 26, 2008 at 01:28:24PM -0600, Mark Wolek wrote:

Is there a way to monitor and control a process / daemon within OpenNMS?

One option is to marry an NMS like OpenNMS with a configuration management system like Cfengine. OpenNMS could be instructed to trigger a Cfengine run on the target host when particular events are generated.

Hi, I have had some checks in place for many months using net-snmp. Basically, I use the "Process checks" section of the snmpd.conf file on the target system (Unix-like OSes only). This way the net-snmp daemon itself monitors certain processes running on the server and returns a value on specific OIDs depending on whether the service is running or not. The net-snmp daemon can also monitor how many processes are running and set an error flag in its response if the number of running processes is outside the desired range. Digging into the net-snmp config file samples, we see:

####### An example of the net-snmp configuration is as follows:

# Simply checks that mountd is running (1 or more processes)
proc mountd
# Make sure there are no more than 4 ntalkds running, but 0 is ok too.
proc ntalkd 4
# Make sure at least one sendmail, but less than or equal to 10 are running.
proc sendmail 10 1
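To make the MAX/MIN rules in those proc lines concrete, here is a minimal Python sketch of how the error flag could be derived from a process count. It is only a model of the behaviour described in the snmpd.conf documentation (proc NAME [MAX [MIN]]), not net-snmp's actual code, and the function name pr_error_flag is made up for illustration.

```python
def pr_error_flag(count, max_procs=0, min_procs=0):
    """Model of prErrorFlag for one 'proc NAME [MAX [MIN]]' line.

    Returns 1 when the process count is out of range, 0 otherwise.
    Hypothetical helper for illustration; not part of net-snmp.
    """
    if max_procs == 0 and min_procs == 0:
        # "proc NAME" with no limits: at least one process, no upper bound.
        min_procs, max_procs = 1, None
    elif max_procs == 0:
        # MAX of 0 together with a nonzero MIN: no upper bound.
        max_procs = None

    too_few = count < min_procs
    too_many = max_procs is not None and count > max_procs
    return 1 if (too_few or too_many) else 0


if __name__ == "__main__":
    # The three sample lines, each checked against a few process counts.
    for name, limits in [("mountd", ()), ("ntalkd", (4,)), ("sendmail", (10, 1))]:
        for count in (0, 1, 5, 11):
            print(name, count, pr_error_flag(count, *limits))
```

With the sample lines, zero mountd processes raises the flag, zero ntalkd processes does not, and sendmail is flagged both at 0 and at 11 processes.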

Then, a snmpwalk of the prTable would look something like this:

# % snmpwalk -v 1 -c public localhost .EXTENSIBLEDOTMIB.PROCMIBNUM
# enterprises.ucdavis.procTable.prEntry.prIndex.1 = 1
# enterprises.ucdavis.procTable.prEntry.prIndex.2 = 2
# enterprises.ucdavis.procTable.prEntry.prIndex.3 = 3
# enterprises.ucdavis.procTable.prEntry.prNames.1 = "mountd"
# enterprises.ucdavis.procTable.prEntry.prNames.2 = "ntalkd"
# enterprises.ucdavis.procTable.prEntry.prNames.3 = "sendmail"
# enterprises.ucdavis.procTable.prEntry.prMin.1 = 0
# enterprises.ucdavis.procTable.prEntry.prMin.2 = 0
# enterprises.ucdavis.procTable.prEntry.prMin.3 = 1
# enterprises.ucdavis.procTable.prEntry.prMax.1 = 0
# enterprises.ucdavis.procTable.prEntry.prMax.2 = 4

So, in OpenNMS, we first have to edit the capsd-configuration.xml file so that OpenNMS can discover the new services:

<!-- The last number of each vbname OID is the prTable instance index, which starts at 1 (prNames.1 = "mountd" in the snmpwalk output) -->
<protocol-plugin protocol="mountd" class-name="org.opennms.netmgt.capsd.SnmpPlugin" scan="on" user-defined="false">
  <property key="force version" value="SNMPv2"/>
  <property key="vbname" value=".1.3.6.1.4.1.2021.2.1.2.1"/>
  <property key="vbvalue" value="mountd"/>
  <property key="timeout" value="2000"/>
  <property key="retry" value="3"/>
</protocol-plugin>
<protocol-plugin protocol="ntalkd" class-name="org.opennms.netmgt.capsd.SnmpPlugin" scan="on" user-defined="false">
  <property key="force version" value="SNMPv2"/>
  <property key="vbname" value=".1.3.6.1.4.1.2021.2.1.2.2"/>
  <property key="vbvalue" value="ntalkd"/>
  <property key="timeout" value="2000"/>
  <property key="retry" value="3"/>
</protocol-plugin>
<protocol-plugin protocol="sendmail" class-name="org.opennms.netmgt.capsd.SnmpPlugin" scan="on" user-defined="false">
  <property key="force version" value="SNMPv2"/>
  <property key="vbname" value=".1.3.6.1.4.1.2021.2.1.2.3"/>
  <property key="vbvalue" value="sendmail"/>
  <property key="timeout" value="2000"/>
  <property key="retry" value="3"/>
</protocol-plugin>

Note that the vbvalue parameter is there to make sure capsd discovers only the services that we want to monitor.

The poller-configuration.xml file should look something like this:

<!-- The last number of each oid is the prTable instance index, which starts at 1 (prIndex.1 = mountd in the snmpwalk output) -->
<service name="mountd" interval="300000" user-defined="true" status="on">
  <parameter key="retry" value="3"/>
  <parameter key="timeout" value="3000"/>
  <parameter key="port" value="161"/>
  <parameter key="oid" value=".1.3.6.1.4.1.2021.2.1.100.1"/>
  <parameter key="expectedValue" value="0"/>
</service>
<service name="ntalkd" interval="300000" user-defined="true" status="on">
  <parameter key="retry" value="3"/>
  <parameter key="timeout" value="3000"/>
  <parameter key="port" value="161"/>
  <parameter key="oid" value=".1.3.6.1.4.1.2021.2.1.100.2"/>
  <parameter key="expectedValue" value="0"/>
</service>
<service name="sendmail" interval="300000" user-defined="true" status="on">
  <parameter key="retry" value="3"/>
  <parameter key="timeout" value="3000"/>
  <parameter key="port" value="161"/>
  <parameter key="oid" value=".1.3.6.1.4.1.2021.2.1.100.3"/>
  <parameter key="expectedValue" value="0"/>
</service>

The net-snmp agent will return "1" in response to the query of the polled OID if the number of running processes does not match the limits set in the net-snmp configuration file, so we need to set the expectedValue parameter to "0". Remember to add the monitor service definitions at the end of the poller configuration file:

<monitor service="mountd" class-name="org.opennms.netmgt.poller.SnmpMonitor"/>
<monitor service="ntalkd" class-name="org.opennms.netmgt.poller.SnmpMonitor"/>
<monitor service="sendmail" class-name="org.opennms.netmgt.poller.SnmpMonitor"/>
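Since every service stanza differs only in its name and in the last number of the OID, the stanzas can be generated rather than typed by hand. The following is a rough Python sketch (3.9 or later, for ET.indent), not an OpenNMS tool. It assumes the instance index equals the prIndex reported by snmpwalk (1, 2, 3 in the sample output) and that net-snmp numbers the proc lines in the order they appear in snmpd.conf. The helper names poller_service and render are invented for the example.

```python
import xml.etree.ElementTree as ET

# prErrorFlag column of the UCD-SNMP-MIB prTable; the instance index is appended.
PR_ERROR_FLAG = ".1.3.6.1.4.1.2021.2.1.100"


def poller_service(name, pr_index, interval_ms=300000):
    """Build one <service> element polling prErrorFlag.<pr_index>."""
    svc = ET.Element("service", {
        "name": name,
        "interval": str(interval_ms),
        "user-defined": "true",
        "status": "on",
    })
    params = [
        ("retry", "3"),
        ("timeout", "3000"),
        ("port", "161"),
        ("oid", f"{PR_ERROR_FLAG}.{pr_index}"),
        ("expectedValue", "0"),  # 0 means the process count is within limits
    ]
    for key, value in params:
        ET.SubElement(svc, "parameter", {"key": key, "value": value})
    return svc


def render(processes):
    """Render stanzas for processes listed in snmpd.conf order (index from 1)."""
    chunks = []
    for pr_index, name in enumerate(processes, start=1):
        svc = poller_service(name, pr_index)
        ET.indent(svc)  # pretty-print in place
        chunks.append(ET.tostring(svc, encoding="unicode"))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(render(["mountd", "ntalkd", "sendmail"]))
```

The output still has to be pasted into the right package of poller-configuration.xml by hand, and it is worth checking the indexes against a real snmpwalk of the target host first.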

I also use this feature to monitor the return values after the execution of scripts on the target host.

In conjunction with this, I think it is possible to make OpenNMS trigger a script (for example, SSH to the target host and restart the service) based on an outage event (i.e. the polled OID returning "1" instead of the expectedValue for the service in question). I usually set up notifications, but I think triggering a script can also be achieved.
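As a rough idea of what such a script could look like, here is a small Python sketch that logs in over SSH and restarts a service. Everything specific in it is an assumption made for illustration: the opennms login user, key-based SSH access, and the "sudo service NAME restart" command on the target. How OpenNMS would be configured to launch the script on an outage is not covered here.

```python
import shlex
import subprocess
import sys


def build_restart_command(host, service, user="opennms"):
    """Build the ssh command line. User and remote command are illustrative."""
    # Quote the service name so it cannot inject extra shell words remotely.
    remote = f"sudo service {shlex.quote(service)} restart"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote]


def restart(host, service, timeout=30):
    """Run the restart over SSH and return the remote exit status."""
    cmd = build_restart_command(host, service)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        print(result.stderr.strip(), file=sys.stderr)
    return result.returncode


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: restart_service.py HOST SERVICE")
    sys.exit(restart(sys.argv[1], sys.argv[2]))
```

A blind restart can hide a recurring fault, so it is probably wise to keep the notification in place alongside any automatic action.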

I have never set up such a solution for Windows services, so I don't know whether it's possible or not.