MUC-FRA-LND-NYK-LND-BRU-LND-BRU-AMS-BRU-DEAD

Interesting routing issue this morning. Normally this goes MUC-NBG.

traceroute to 178.63.61.72 (178.63.61.72), 64 hops max, 52 byte packets
 1  192.168.0.1 (192.168.0.1)  1.423 ms  1.050 ms  3.151 ms
 2  ppp-default.m-online.net (82.135.16.28)  358.439 ms  204.568 ms  138.582 ms
 3  gi0-0-0-32-171.r4.muc2.m-online.net (212.18.7.37)  32.430 ms  78.920 ms  11.349 ms
 4  xe-0-3-0.r3.muc2.m-online.net (82.135.16.202)  240.887 ms
    xe-1-1-0.r3.muc7.m-online.net (82.135.16.242)  23.796 ms  6.142 ms
 5  62.140.24.49 (62.140.24.49)  9.266 ms  64.491 ms  30.544 ms
 6  ae-4-4.ebr1.frankfurt1.level3.net (4.69.134.2)  12.860 ms  20.502 ms  31.932 ms
 7  ae-81-81.csw3.frankfurt1.level3.net (4.69.140.10)  13.089 ms
    ae-71-71.csw2.frankfurt1.level3.net (4.69.140.6)  60.880 ms
    ae-91-91.csw4.frankfurt1.level3.net (4.69.140.14)  66.135 ms
 8  ae-62-62.ebr2.frankfurt1.level3.net (4.69.140.17)  12.420 ms
    ae-82-82.ebr2.frankfurt1.level3.net (4.69.140.25)  157.467 ms
    ae-72-72.ebr2.frankfurt1.level3.net (4.69.140.21)  12.591 ms
 9  ae-23-23.ebr2.london1.level3.net (4.69.148.193)  47.827 ms  36.042 ms  53.680 ms
10  ae-41-41.ebr1.newyork1.level3.net (4.69.137.66)  185.216 ms
    ae-44-44.ebr1.newyork1.level3.net (4.69.137.78)  190.551 ms
    ae-41-41.ebr1.newyork1.level3.net (4.69.137.66)  95.741 ms
11  ae-91-91.csw4.newyork1.level3.net (4.69.134.78)  178.675 ms
    ae-61-61.csw1.newyork1.level3.net (4.69.134.66)  104.084 ms
    ae-91-91.csw4.newyork1.level3.net (4.69.134.78)  159.411 ms
12  ae-2-70.edge1.newyork1.level3.net (4.69.155.78)  101.230 ms  91.009 ms  173.873 ms
13  4.68.110.154 (4.68.110.154)  107.450 ms  140.533 ms  204.142 ms
14  sl-crs2-lon-0-8-3-0.sprintlink.net (144.232.9.162)  184.150 ms  94.185 ms  91.192 ms
15  sl-bb20-bru-14-0-0.sprintlink.net (213.206.129.42)  96.598 ms
    sl-bb23-lon-0-0-0.sprintlink.net (213.206.128.185)  117.846 ms
    sl-bb20-bru-14-0-0.sprintlink.net (213.206.129.42)  129.868 ms
16  sl-bb21-ams-3-0-0.sprintlink.net (213.206.129.142)  214.406 ms
    sl-bb21-bru-15-0-0.sprintlink.net (80.66.128.42)  153.944 ms *
17  * * *
18  * * *
19  * * *

HowTo: OpenNMS and Google Search Appliance

The first step is to enable SNMP on the GSA with a well-chosen community string.

You also need the two MIB files from Google, GOOGLE-MIB and GSA-MIB.
Copy the MIB files to your OpenNMS machine into the directory

$OPENNMS_DIR/contrib/mibparser/mibs

The next step is to convert these MIBs into OpenNMS configuration. The mibparser does this for us:

$OPENNMS_DIR/contrib/mibparser/dist/parseMib.sh \
$OPENNMS_DIR/contrib/mibparser/mibs/GOOGLE-MIB $OPENNMS_DIR/contrib/mibparser/mibs/GSA-MIB >$HOME/google.txt

The resulting $HOME/google.txt contains:

<mibObj oid=".1.3.6.1.4.1.11129.1.1.1" instance="0" alias="crawlRunning" type="INTEGER" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.1" instance="0" alias="docsServed" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.2" instance="0" alias="crawlingRate" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.3" instance="0" alias="docBytes" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.4" instance="0" alias="todayDocsCrawled" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.5" instance="0" alias="docErrors" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.6" instance="0" alias="docsFound" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.7" instance="0" alias="batchCrawlRunning" type="INTEGER" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.8" instance="0" alias="batchCrawlStartTime" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.9" instance="0" alias="batchCrawlEndTime" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.1.2.10" instance="0" alias="batchCrawlEndReason" type="INTEGER" />
<mibObj oid=".1.3.6.1.4.1.11129.1.2.1" instance="0" alias="qpm" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.3.1.1" instance="0" alias="diskHealth" type="INTEGER" />
<mibObj oid=".1.3.6.1.4.1.11129.1.3.1.2" instance="0" alias="diskErrors" type="DisplayString" />
<mibObj oid=".1.3.6.1.4.1.11129.1.3.2.1" instance="0" alias="temperatureHealth" type="INTEGER" />
<mibObj oid=".1.3.6.1.4.1.11129.1.3.2.2" instance="0" alias="temperatureErrors" type="DisplayString" />
<mibObj oid=".1.3.6.1.4.1.11129.1.3.3.1" instance="0" alias="machineHealth" type="INTEGER" />
<mibObj oid=".1.3.6.1.4.1.11129.1.3.3.2" instance="0" alias="machineErrors" type="DisplayString" />

This output still needs to be converted manually to be useful in OpenNMS. We change the types “INTEGER” and “Integer32” into “Gauge32”. The lines with type “DisplayString” are deleted, as OpenNMS cannot graph string values. The changes go into the file $OPENNMS_DIR/etc/datacollection-config.xml. Here is what needs to be added to it:

This goes into the groups section:

<group name="googleGSA" ifType="ignore">
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.1" instance="0" alias="crawlRunning" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.1" instance="0" alias="docsServed" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.2" instance="0" alias="crawlingRate" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.3" instance="0" alias="docBytes" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.4" instance="0" alias="todayDocsCrawled" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.5" instance="0" alias="docErrors" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.6" instance="0" alias="docsFound" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.7" instance="0" alias="batchCrawlRunning" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.8" instance="0" alias="batchCrawlStartTime" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.9" instance="0" alias="batchCrawlEndTime" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.1.2.10" instance="0" alias="batchCrawlEndReason" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.2.1" instance="0" alias="qpm" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.3.1.1" instance="0" alias="diskHealth" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.3.2.1" instance="0" alias="temperatureHealth" type="Gauge32" />
  <mibObj oid=".1.3.6.1.4.1.11129.1.3.3.1" instance="0" alias="machineHealth" type="Gauge32" />
</group>
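
The manual conversion described above (INTEGER and Integer32 become Gauge32, DisplayString lines are dropped) can also be scripted. The following is only a rough sketch in Python. The function name convert_mibparser_output and the three sample lines are my own choices for illustration and are not part of OpenNMS or the mibparser.

```python
import re

def convert_mibparser_output(text):
    """Rewrite mibparser <mibObj> lines for use in datacollection-config.xml."""
    converted = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("<mibObj"):
            continue  # ignore anything that is not a mibObj element
        if 'type="DisplayString"' in line:
            continue  # string values are dropped, as in the manual step
        # INTEGER and Integer32 both become Gauge32
        line = re.sub(r'type="(?:INTEGER|Integer32)"', 'type="Gauge32"', line)
        converted.append("  " + line)  # indent for pasting into a <group>
    return "\n".join(converted)

# Three lines taken from the mibparser output shown earlier
sample = """\
<mibObj oid=".1.3.6.1.4.1.11129.1.1.1" instance="0" alias="crawlRunning" type="INTEGER" />
<mibObj oid=".1.3.6.1.4.1.11129.1.2.1" instance="0" alias="qpm" type="Integer32" />
<mibObj oid=".1.3.6.1.4.1.11129.1.3.1.2" instance="0" alias="diskErrors" type="DisplayString" />
"""

print(convert_mibparser_output(sample))
```

The result still has to be wrapped in the <group> element by hand.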

This goes into the systems section:

<systemDef name="googleGSA">
  <sysoidMask>.1.3.6.1.4.1.11129.1.</sysoidMask>
  <collect>
    <includeGroup>googleGSA</includeGroup>
  </collect>
</systemDef>
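
The sysoidMask is a prefix: the systemDef applies to every node whose sysObjectID starts with it. Here is a small sketch of that idea in Python. The helper name is made up, and the first OID is only a hypothetical value below the mask; I have not checked which sysObjectID a GSA really reports.

```python
def matches_sysoid_mask(sys_object_id, mask):
    """A sysoidMask matches when the node's sysObjectID starts with it."""
    return sys_object_id.startswith(mask)

mask = ".1.3.6.1.4.1.11129.1."

# Hypothetical sysObjectID below the Google enterprise subtree
print(matches_sysoid_mask(".1.3.6.1.4.1.11129.1.1", mask))    # True

# sysObjectID of a net-snmp agent on Linux, outside the mask
print(matches_sysoid_mask(".1.3.6.1.4.1.8072.3.2.10", mask))  # False
```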

We also add a threshold setup to $OPENNMS_DIR/etc/thresholds.xml:

<group name="googleGSA" rrdRepository="/opt/opennms/share/rrd/snmp/">
  <threshold type="high" ds-name="machineHealth" ds-type="node" value="1" rearm="0" trigger="1" />
  <threshold type="high" ds-name="temperatureHealth" ds-type="node" value="1" rearm="0" trigger="1" />
  <threshold type="high" ds-name="diskHealth" ds-type="node" value="1" rearm="0" trigger="1" />
  <threshold type="low" ds-name="crawlRunning" ds-type="node" value="0" rearm="1" trigger="1" />
</group>
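
To make the value, rearm and trigger attributes a bit more tangible, here is a simplified model of how I understand such thresholds to behave. A high threshold fires once the value has been at or above "value" for "trigger" consecutive polls and rearms when it drops to "rearm" or below; a low threshold works the other way round. This is a sketch under that assumption, not the OpenNMS implementation, and the function name and sample series are invented.

```python
def evaluate(samples, kind, value, rearm, trigger):
    """Return (index, event) tuples for a simplified high/low threshold."""
    if kind == "high":
        exceeded = lambda v: v >= value
        rearmed = lambda v: v <= rearm
    else:  # "low"
        exceeded = lambda v: v <= value
        rearmed = lambda v: v >= rearm

    events = []
    armed = True   # an armed threshold can fire
    count = 0      # consecutive polls beyond the threshold
    for index, sample in enumerate(samples):
        if armed:
            count = count + 1 if exceeded(sample) else 0
            if count >= trigger:
                events.append((index, "triggered"))
                armed = False
                count = 0
        elif rearmed(sample):
            events.append((index, "rearmed"))
            armed = True
    return events

# diskHealth: 0 = OK, 1 = Warning, 2 = Critical (invented poll results)
print(evaluate([0, 0, 1, 2, 0], "high", value=1, rearm=0, trigger=1))

# crawlRunning: 1 = running, 0 = stopped (invented poll results)
print(evaluate([1, 1, 0, 0, 1], "low", value=0, rearm=1, trigger=1))
```

With these settings a single poll showing a warning state, or a stopped crawl, is enough to fire the threshold.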

Before we use this, we can check the syntax with:

 xmllint $OPENNMS_DIR/etc/datacollection-config.xml

Also check thresholds.xml! We now need to restart OpenNMS and add the GSA to discovery to get data for it.
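
If xmllint is not installed, both files can be checked for well-formedness with a few lines of Python. This is only a sketch: the function names are my own, and the fallback path /opt/opennms is an assumption based on the rrdRepository path used above. It checks that the XML parses, nothing more; it does not validate against the OpenNMS schema.

```python
import os
import xml.etree.ElementTree as ET

def check_xml_text(text):
    """Return None if text is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as exc:
        return str(exc)

def check_files(paths):
    """Check each file; unreadable files are reported as errors too."""
    results = {}
    for path in paths:
        try:
            with open(path, "rb") as handle:
                results[path] = check_xml_text(handle.read())
        except OSError as exc:
            results[path] = str(exc)
    return results

if __name__ == "__main__":
    # Assumption: OPENNMS_DIR is set, otherwise fall back to /opt/opennms
    base = os.environ.get("OPENNMS_DIR", "/opt/opennms")
    names = ("datacollection-config.xml", "thresholds.xml")
    files = [os.path.join(base, "etc", name) for name in names]
    for path, error in check_files(files).items():
        print(path, "OK" if error is None else "ERROR: " + error)
```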

The last step is to set up the graph definitions in the file $OPENNMS_DIR/etc/snmp-graph.properties. Here is the content:

This needs to be appended to the reports string (don’t forget the “,” and the “\”):

googleGSA.docsServed, googleGSA.crawlingRate, googleGSA.docBytes, googleGSA.todayDocsCrawled, googleGSA.docErrors, googleGSA.docsFound, googleGSA.qpm, googleGSA.diskHealth, googleGSA.temperatureHealth, googleGSA.machineHealth

These are the graph definitions:

report.googleGSA.docsServed.name=Number of documents being served
report.googleGSA.docsServed.columns=docsServed
report.googleGSA.docsServed.type=nodeSnmp
report.googleGSA.docsServed.command=--title="Documents served" \
--vertical-label Documents \
DEF:docsServed={rrd1}:docsServed:AVERAGE \
AREA:docsServed#008800:"Docs" \
GPRINT:docsServed:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:docsServed:MIN:"Min \\: %10.2lf %s" \
GPRINT:docsServed:MAX:"Max \\: %10.2lf %s\\n"

report.googleGSA.crawlingRate.name=The current crawling rate in pages per second.
report.googleGSA.crawlingRate.columns=crawlingRate
report.googleGSA.crawlingRate.type=nodeSnmp
report.googleGSA.crawlingRate.command=--title="Crawling rate" \
--vertical-label="per second" \
DEF:crawlingRate={rrd1}:crawlingRate:AVERAGE \
LINE2:crawlingRate#008800:"crawl rate/s" \
GPRINT:crawlingRate:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:crawlingRate:MIN:"Min \\: %10.2lf %s" \
GPRINT:crawlingRate:MAX:"Max \\: %10.2lf %s\\n"

report.googleGSA.docBytes.name=The total megabytes processed so far.
report.googleGSA.docBytes.columns=docBytes
report.googleGSA.docBytes.type=nodeSnmp
report.googleGSA.docBytes.command=--title="Document size served" \
--vertical-label Bytes \
DEF:docBytes={rrd1}:docBytes:AVERAGE \
AREA:docBytes#008800:"Docs Bytes" \
GPRINT:docBytes:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:docBytes:MIN:"Min \\: %10.2lf %s" \
GPRINT:docBytes:MAX:"Max \\: %10.2lf %s\\n"

report.googleGSA.todayDocsCrawled.name=The number of documents crawled today.
report.googleGSA.todayDocsCrawled.columns=todayDocsCrawled
report.googleGSA.todayDocsCrawled.type=nodeSnmp
report.googleGSA.todayDocsCrawled.command=--title="Documents crawled" \
--vertical-label="daily" \
DEF:todayDocsCrawled={rrd1}:todayDocsCrawled:AVERAGE \
AREA:todayDocsCrawled#008800:"Crawled today" \
GPRINT:todayDocsCrawled:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:todayDocsCrawled:MIN:"Min \\: %10.2lf %s" \
GPRINT:todayDocsCrawled:MAX:"Max \\: %10.2lf %s\\n"

report.googleGSA.docErrors.name=Number of times an error occurred while trying to crawl a document.
report.googleGSA.docErrors.columns=docErrors
report.googleGSA.docErrors.type=nodeSnmp
report.googleGSA.docErrors.command=--title="Document Errors" \
--vertical-label="daily" \
DEF:docErrors={rrd1}:docErrors:AVERAGE \
AREA:docErrors#AA0000:"Doc Errors" \
GPRINT:docErrors:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:docErrors:MIN:"Min \\: %10.2lf %s" \
GPRINT:docErrors:MAX:"Max \\: %10.2lf %s\\n"

report.googleGSA.docsFound.name=Total documents found.
report.googleGSA.docsFound.columns=docsFound
report.googleGSA.docsFound.type=nodeSnmp
report.googleGSA.docsFound.command=--title="Documents found" \
--vertical-label="Documents" \
DEF:docsFound={rrd1}:docsFound:AVERAGE \
AREA:docsFound#008800:"Docs found" \
GPRINT:docsFound:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:docsFound:MIN:"Min \\: %10.2lf %s" \
GPRINT:docsFound:MAX:"Max \\: %10.2lf %s\\n"

report.googleGSA.qpm.name=Serving status in terms of queries per minute handled.
report.googleGSA.qpm.columns=qpm
report.googleGSA.qpm.type=nodeSnmp
report.googleGSA.qpm.command=--title="Queries" \
--vertical-label="per minute" \
DEF:qpm={rrd1}:qpm:AVERAGE \
LINE2:qpm#008800:"Queries/min" \
GPRINT:qpm:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:qpm:MIN:"Min \\: %10.2lf %s" \
GPRINT:qpm:MAX:"Max \\: %10.2lf %s\\n"

report.googleGSA.diskHealth.name=Disk status
report.googleGSA.diskHealth.columns=diskHealth
report.googleGSA.diskHealth.type=nodeSnmp
report.googleGSA.diskHealth.command=--title="Disk status" \
--vertical-label status \
DEF:diskHealth={rrd1}:diskHealth:AVERAGE \
AREA:diskHealth#FF0000:"0:OK 1:Warning 2:Critical" \
GPRINT:diskHealth:AVERAGE:"Avg \\: %2.2lf %s" \
GPRINT:diskHealth:MIN:"Min \\: %2.2lf %s" \
GPRINT:diskHealth:MAX:"Max \\: %2.2lf %s\\n"

report.googleGSA.temperatureHealth.name=Temperature status
report.googleGSA.temperatureHealth.columns=temperatureHealth
report.googleGSA.temperatureHealth.type=nodeSnmp
report.googleGSA.temperatureHealth.command=--title="Temperature status" \
--vertical-label status \
DEF:temperatureHealth={rrd1}:temperatureHealth:AVERAGE \
AREA:temperatureHealth#FF0000:"0:OK 1:Warning 2:Critical" \
GPRINT:temperatureHealth:AVERAGE:"Avg \\: %2.2lf %s" \
GPRINT:temperatureHealth:MIN:"Min \\: %2.2lf %s" \
GPRINT:temperatureHealth:MAX:"Max \\: %2.2lf %s\\n"

report.googleGSA.machineHealth.name=Machine status
report.googleGSA.machineHealth.columns=machineHealth
report.googleGSA.machineHealth.type=nodeSnmp
report.googleGSA.machineHealth.command=--title="Machine status" \
--vertical-label status \
DEF:machineHealth={rrd1}:machineHealth:AVERAGE \
AREA:machineHealth#FF0000:"0:OK 1:Warning 2:Critical" \
GPRINT:machineHealth:AVERAGE:"Avg \\: %2.2lf %s" \
GPRINT:machineHealth:MIN:"Min \\: %2.2lf %s" \
GPRINT:machineHealth:MAX:"Max \\: %2.2lf %s\\n"
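
Since the graph definitions all follow the same pattern, they could also be generated instead of typed. The following Python sketch shows one way to do that. The template and the function render_report are my own and not something OpenNMS ships. The last line of each generated command has no trailing backslash, because in a Java properties file a trailing backslash would join the following line onto the command.

```python
# Raw string, so backslashes appear exactly as they should in the file.
# {{rrd1}} is escaped so that str.format() leaves a literal {rrd1}.
TEMPLATE = r'''report.googleGSA.{alias}.name={name}
report.googleGSA.{alias}.columns={alias}
report.googleGSA.{alias}.type=nodeSnmp
report.googleGSA.{alias}.command=--title="{title}" \
--vertical-label="{label}" \
DEF:{alias}={{rrd1}}:{alias}:AVERAGE \
{draw}:{alias}#{color}:"{legend}" \
GPRINT:{alias}:AVERAGE:"Avg \\: %10.2lf %s" \
GPRINT:{alias}:MIN:"Min \\: %10.2lf %s" \
GPRINT:{alias}:MAX:"Max \\: %10.2lf %s\\n"
'''

def render_report(alias, name, title, label, legend, draw="AREA", color="008800"):
    """Fill the template for one data source (alias = RRD data source name)."""
    return TEMPLATE.format(alias=alias, name=name, title=title, label=label,
                           legend=legend, draw=draw, color=color)

print(render_report(
    alias="qpm",
    name="Serving status in terms of queries per minute handled.",
    title="Queries",
    label="per minute",
    legend="Queries/min",
    draw="LINE2",
))
```

Calling render_report once per alias and joining the results with blank lines gives the whole block for snmp-graph.properties.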

That’s it!