Juniper SRX Chassis Cluster RG0 Nagios Check

I was required to check (as this customer did not have a trap collector) which node was active for redundancy group 0 on a SRX cluster. So I thought I would check for a SNMP OID that is only presented by the active RG0 node. This script uses snmpwalk and is configured to use SNMP v2c (this can be easily changed). It has been tested on:

  • CentOS 5
  • Junos 11.4R2
  • SNMP v2c

Here is the little hacky shell script:

[bash]
#!/bin/bash

# Cooper Lees <me@cooperlees.com>
# Dirty Cluster RG0 checker
# Lasted Updated: 20120818

HOST=$1
COMMUNITY=$2

if [ "$HOST" == "" ] || [ "$COMMUNITY" == "" ]; then
echo "ERROR: No host or SNMP community specified"
exit 2
fi

SNMPOUTPUT=$(snmpwalk -v 2c -c $COMMUNITY $HOST 1.3.6.1.4.1.2636.3.1.14.1.7)

echo $SNMPOUTPUT | grep "INTEGER: 2" > /dev/null
if [ $? == 0 ]; then
echo "Host $HOST is the Chassis cluster ACTIVE RE"
exit 0
fi

echo $SNMPOUTPUT | grep "No Such Object available on this agent at this OID" > /dev/null
if [ $? == 0 ]; then
echo "Host $HOST is the INACTIVE RE"
exit 2
fi

echo "WTF – Something is not right …"
exit 3
[/bash]

It checks for the “jnxRedundancyState” OID – this OID reports on RE states and is only accurate on Junos routers (e.g. M and MX series etc.).

Enjoy …

SRX Branch Chassis Cluster Ports

Here is a table of the ports that are used for chassis cluster control link and management ports on Branch SRX devices.

The quoted ports are the ‘stand alone’ non clustered port names (not node1’s port names once clustered). In a SRX cluster the PIM slots on node1 start at the last PIM slot of node0 + 1. For example, a SRX240 cluster’s node1 starts at PIM 5. It’s control link port is effectively ge-5/0/1).

Model FXP0 (Management) FXP1 (Control Link)
SRX100 fe-0/0/6 fe-0/0/7
SRX210 fe-0/0/6 fe-0/0/7
SRX220 ge-0/0/6 (> 11.0) ge-0/0/7
SRX240 ge-0/0/0 ge-0/0/1
SRX550 ge-0/0/0 ge-0/0/1
SRX650 ge-0/0/0 ge-0/0/1

 *fab0 and fab1 interfaces (Data Link) are always configurable, e.g.:

  • set interfaces fab0 fabric-options member-interfaces ge-0/0/2
  • set interfaces fab1 fabric-options member-interfaces ge-5/0/2

Juniper SRX Chassis Cluster + LACP Redundant Eth Interfaces

So a co-worker and I spent some time playing around with JunOS 11’s (I believe it came in with 11 – correct me if wrong) reth’s ability to now be LACP interfaces, as well as just plain redundant. It was not immediately clear how the switch was required to be set up in order to facilitate this new, awesome feature.

– This was used with a ex4200 virtual chassis cluster and SRX Chassis Cluster –

Here is how we got it happily working (assuming you have a chassis cluster up and running):

SRX Config:

set interfaces ge-2/0/0 gigether-options redundant-parent reth1
set interfaces ge-2/0/1 gigether-options redundant-parent reth1
set interfaces ge-2/0/2 gigether-options redundant-parent reth1
set interfaces ge-2/0/3 gigether-options redundant-parent reth1
set interfaces ge-11/0/0 gigether-options redundant-parent reth1
set interfaces ge-11/0/1 gigether-options redundant-parent reth1
set interfaces ge-11/0/2 gigether-options redundant-parent reth1
set interfaces ge-11/0/3 gigether-options redundant-parent reth1

set interfaces reth1 redundant-ether-options redundancy-group 1
set interfaces reth1 redundant-ether-options lacp passive

EX Config:

set interfaces ge-0/0/0 ether-options 802.3ad ae1
set interfaces ge-0/0/1 ether-options 802.3ad ae2
set interfaces ge-0/0/2 ether-options 802.3ad ae1
set interfaces ge-0/0/3 ether-options 802.3ad ae2

set interfaces ge-1/0/0 ether-options 802.3ad ae2
set interfaces ge-1/0/1 ether-options 802.3ad ae1
set interfaces ge-1/0/2 ether-options 802.3ad ae2
set interfaces ge-1/0/3 ether-options 802.3ad ae1

set interfaces ae1 aggregated-ether-options lacp active
set interfaces ae2 aggregated-ether-options lacp active

Now we have LACP bandwidth and redundancy – Either the switch or SRX can die, in theory.

* Have not tested the failover yet – But will before this set up goes to production – Will update the post *