Updating Juniper QFabric

The following post shows the output captured during a recent upgrade of a client’s QFabric system from Junos release 12.2X30 to 12.2X50 using the ‘Nonstop Software Upgrade’ (NSSU) method. NSSU is a very conservative approach that updates redundant components one at a time.

The overall process is:

  1. Upgrade Director Group
  2. Upgrade QFabric Interconnects
  3. Upgrade each node group
    1. Network Node group (NW-NG-01)
    2. Each redundant server node group (RSNG)
    3. Each server node group (my client did not have any SNGs)
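
The ordering above can be sketched as a small Python helper that prints the NSSU commands in sequence. This is only an illustration: the function name is hypothetical, the command strings are the ones used later in this post, and the argument order of the node-group command follows the truncated prompt in the captured output (`…rpm node-group NW-NG-0`), so check it against your release before relying on it.

```python
def build_upgrade_plan(package, node_groups):
    """Return the NSSU commands in the order described in this post.

    package     -- RPM file name already copied to /pbdata/packages
    node_groups -- node group names; the network node group is moved first
    """
    base = "request system software nonstop-upgrade"
    plan = [
        f"{base} director-group {package}",  # 1. Director Group
        f"{base} fabric {package}",          # 2. Interconnects
    ]
    # 3. Node groups: network node group first, the rest in the order given.
    # sorted() is stable, so the relative order of the other groups is kept.
    ordered = sorted(node_groups, key=lambda g: 0 if g.startswith("NW-NG") else 1)
    for group in ordered:
        # Assumed argument order, taken from the captured prompt output.
        plan.append(f"{base} node-group {package} node-group {group}")
    return plan


for command in build_upgrade_plan("FILE.rpm", ["RSNG01", "NW-NG-0", "RSNG02"]):
    print(command)
```

Printing the plan before the change window makes it easy to paste the commands one at a time and to confirm that the network node group comes before the RSNGs.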

Before Upgrade Backup

Only the QFabric configuration file needs to be backed up. Everything else about the install is standard QFabric and can be restored using documented Juniper methods.

To back up the config, log into the device and either:

  1. Capture the output from ‘show configuration | no-more’

or

  1. ‘show configuration | save QFabric.conf’
    1. Then copy it off from a remote host: scp username@x.x.x.x:/pbdata/packages/QFabric.conf .

Upgrade Process with Output

Director Group Upgrade

Copy the RPM image to /pbdata/packages on the director. This process takes around 2 hours. We started at 7:15am and finished at 9:15am.

  1. scp FILE.rpm root@x.x.x.x:/pbdata/packages
  2. Log into the DG via the VIP and start the upgrade
  • request system software nonstop-upgrade director-group FILE.rpm
    • Junos looks in /pbdata/packages by default

Upgrade Output:

root@FSASYDBRDQFAB01> request system software nonstop-upgrade director-group jinstall-qfabric-12.2X50-D20.4.rpm

Validating update package jinstall-qfabric-12.2X50-D20.4.rpm

Installing update package jinstall-qfabric-12.2X50-D20.4.rpm

Installing fabric images version 12.2X50-D20.4

Performing cleanup

Package install complete

Installing update package jinstall-qfabric-12.2X50-D20.4.rpm on peer

Triggering Initial Stage of Fabric Manager Upgrade

Updating CCIF default image to 12.2X50-D20.4

Updating FM-0 to Junos version 12.2X50-D20.4

[Status   2012-09-24 14:43:37]: Fabric Manager: Upgrade Initial Stage started

[FM-0     2012-09-24 14:43:52]: Transferring FM-0 Mastership to LOCAL DG

[FM-0     2012-09-24 14:45:44]: Finished FM-0 Mastership switch

[NW-NG-0  2012-09-24 14:45:59]: Transferring NW-NG-0 Mastership to LOCAL DG

[NW-NG-0  2012-09-24 14:47:22]: Finished NW-NG-0 Mastership switch

[FM-0     2012-09-24 14:48:10]: Retrieving package

[FM-0     2012-09-24 14:49:13]: Retrieving package

[FM-0     2012-09-24 14:50:15]: Pushing bundle to re0

[Status   2012-09-24 14:52:03]: Load completed with 0 errors

[Status   2012-09-24 14:52:03]: Reboot is required to complete upgrade

[Status   2012-09-24 14:52:04]: Trying to Connect to Node: FM-0

[Status   2012-09-24 14:52:19]: Rebooting FM-0

[FM-0     2012-09-24 14:52:19]: Waiting for FM-0 to terminate

Starting Peer upgrade

Initiating rolling upgrade of Director peer:  version 12.2X50-D20.4

Inform CCIF regarding rolling upgrade

[Peer Update Status]: Validating install package jinstall-qfabric-12.2X50-D20.4.rpm

[Peer Update Status]: jinstall-qfabric-12.2X50.D20.4-4

[Peer Update Status]: Cleaning up node for rolling phase one upgrade

[Peer Update Status]: Director group upgrade complete

[Peer Update Status]: COMPLETED

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to return after reboot and start phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to complete phase one of rolling upgrade

[Peer Update Status]: Waiting for peer to complete phase one of rolling upgrade

[Peer Update Status]: Peer completed phase one of rolling upgrade

Setting peer DG node as the master SFC

Delaying start of local upgrade to allow peer services time to initialize [15 minutes]

Delaying start of local upgrade to allow peer services time to initialize [15 minutes]

Delaying start of local upgrade to allow peer services time to initialize [12 minutes]

Delaying start of local upgrade to allow peer services time to initialize [9 minutes]

Delaying start of local upgrade to allow peer services time to initialize [6 minutes]

Delaying start of local upgrade to allow peer services time to initialize [3 minutes]

[Peer Update Status]: Check for VMs on dg0

Triggering Final Stage of Fabric Manager Upgrade:

Updating FM-0 to Junos version 12.2X50-D20.4

[Status   2012-09-24 15:33:31]: Fabric Manager: Upgrade Final Stage started

[NW-NG-0  2012-09-24 15:33:45]: Transferring NW-NG-0 Mastership to REMOTE DG

[NW-NG-0  2012-09-24 15:35:08]: Finished NW-NG-0 Mastership switch

[Status   2012-09-24 15:35:08]: Upgrading FM-0 VM on worker DG to 12.2X50-D20.4

[DRE-0    2012-09-24 15:36:09]: Retrieving package

[DRE-0    2012-09-24 15:37:02]: ——- re0: ——-

[Status   2012-09-24 15:38:28]: Load completed with 0 errors

[Status   2012-09-24 15:38:28]: Reboot is required to complete upgrade

[DRE-0    2012-09-24 15:38:34]: Waiting for DRE-0 to terminate

[DRE-0    2012-09-24 15:38:46]: Waiting for DRE-0 to come back

[DRE-0    2012-09-24 15:42:00]: Running Uptime Test for DRE-0

[DRE-0    2012-09-24 15:42:06]: Uptime Test for DRE-0 Passed

[Status   2012-09-24 15:42:06]: DRE-0 Booted successfully

Performing post install shutdown and cleanup

Broadcast message from root (Mon Sep 24 15:42:07 2012):

The system is going down for reboot NOW!

Director group upgrade complete
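
If you save this output to a file, the timestamped status lines make it easy to check how long a stage took. Below is a minimal Python sketch, assuming the line format shown above (`[SOURCE  YYYY-MM-DD HH:MM:SS]: message`). The function names are my own, and the sample lines are copied from the capture.

```python
import re
from datetime import datetime

# Matches lines such as "[FM-0     2012-09-24 14:43:52]: message".
# Lines without a timestamp (e.g. "[Peer Update Status]: ...") are skipped.
STAMPED_LINE = re.compile(
    r"^\[(?P<source>.+?)\s+(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]: (?P<msg>.*)$"
)


def parse_status_lines(text):
    """Return (source, timestamp, message) tuples for every timestamped line."""
    events = []
    for line in text.splitlines():
        match = STAMPED_LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
            events.append((match["source"], stamp, match["msg"]))
    return events


def elapsed(events):
    """Time between the first and last timestamped line."""
    return events[-1][1] - events[0][1]


sample = """\
[Status   2012-09-24 14:43:37]: Fabric Manager: Upgrade Initial Stage started
[Peer Update Status]: COMPLETED
[Status   2012-09-24 15:42:06]: DRE-0 Booted successfully
"""
print(elapsed(parse_status_lines(sample)))
```

Note that this only measures the span covered by timestamped lines. The package copy and any waiting before the first status line are not included.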

Interconnect Upgrade

This process takes around an hour. It upgrades Junos on each System Control Board (SCB) partition, fetching the code automatically via the FTP server running on the active Director Group member. In the output below it ran from 16:22 to 17:30.

  1. From the DG CLI initiate the upgrade:
  • request system software nonstop-upgrade fabric FILE.rpm

Output:

[FC-0     2012-09-24 16:22:17]: Retrieving package

[FC-1     2012-09-24 16:22:18]: Retrieving package

[IC-F7811 2012-09-24 16:22:39]: Retrieving package

[IC-F7712 2012-09-24 16:22:41]: Retrieving package

[FC-0     2012-09-24 16:23:14]: Validating on re0

[FC-1     2012-09-24 16:23:18]: Validating on re0

[IC-F7712 2012-09-24 16:23:57]: Pushing bundle to re1

[IC-F7811 2012-09-24 16:23:58]: Pushing bundle to re1

[IC-F7712 2012-09-24 16:24:47]: Validating on re1

[IC-F7811 2012-09-24 16:24:48]: Validating on re1

[FC-0     2012-09-24 16:25:02]: Done with validate on all chassis

[FC-0     2012-09-24 16:25:02]: ——- re0: ——-

[FC-1     2012-09-24 16:25:11]: Done with validate on all chassis

[FC-1     2012-09-24 16:25:11]: ——- re0: ——-

[IC-F7712 2012-09-24 16:29:51]: Validating on re0

[IC-F7811 2012-09-24 16:30:48]: Validating on re0

[IC-F7712 2012-09-24 16:34:10]: Done with validate on all chassis

[IC-F7712 2012-09-24 16:34:10]: ——- re1: ——-

[IC-F7811 2012-09-24 16:34:20]: Done with validate on all chassis

[IC-F7811 2012-09-24 16:34:20]: ——- re1: ——-

[IC-F7712 2012-09-24 16:34:55]: Step 1 of 20 Creating temporary file system

[IC-F7712 2012-09-24 16:34:55]: Step 2 of 20 Determining installation source

[IC-F7712 2012-09-24 16:34:55]: Step 3 of 20 Processing format options

[IC-F7712 2012-09-24 16:34:55]: Step 4 of 20 Determining installation slice

[IC-F7712 2012-09-24 16:34:56]: Step 5 of 20 Creating and labeling new slices

[IC-F7811 2012-09-24 16:34:56]: Step 1 of 20 Creating temporary file system

[IC-F7712 2012-09-24 16:34:56]: Step 6 of 20 Create and mount new file system

[IC-F7811 2012-09-24 16:34:57]: Step 2 of 20 Determining installation source

[IC-F7811 2012-09-24 16:34:57]: Step 3 of 20 Processing format options

[IC-F7811 2012-09-24 16:34:57]: Step 4 of 20 Determining installation slice

[IC-F7811 2012-09-24 16:34:58]: Step 5 of 20 Creating and labeling new slices

[IC-F7811 2012-09-24 16:34:58]: Step 6 of 20 Create and mount new file system

[IC-F7712 2012-09-24 16:35:04]: Step 7 of 20 Getting OS bundles

[IC-F7712 2012-09-24 16:35:04]: Step 8 of 20 Updating recovery media

[IC-F7811 2012-09-24 16:35:07]: Step 7 of 20 Getting OS bundles

[IC-F7811 2012-09-24 16:35:07]: Step 8 of 20 Updating recovery media

[IC-F7712 2012-09-24 16:35:27]: Step 9 of 20 Extracting incoming image

[IC-F7811 2012-09-24 16:35:30]: Step 9 of 20 Extracting incoming image

[IC-F7712 2012-09-24 16:36:38]: Step 10 of 20 Unpacking OS packages

[IC-F7712 2012-09-24 16:36:41]: Step 11 of 20 Mounting jbase package

[IC-F7811 2012-09-24 16:36:42]: Step 10 of 20 Unpacking OS packages

[IC-F7811 2012-09-24 16:36:45]: Step 11 of 20 Mounting jbase package

[IC-F7712 2012-09-24 16:37:05]: Step 12 of 20 Creating base OS symbolic links

[IC-F7811 2012-09-24 16:37:09]: Step 12 of 20 Creating base OS symbolic links

[IC-F7712 2012-09-24 16:38:03]: Step 13 of 20 Creating fstab

[IC-F7712 2012-09-24 16:38:03]: Step 14 of 20 Creating new system files

[IC-F7712 2012-09-24 16:38:04]: Step 15 of 20 Adding jbundle package

[IC-F7811 2012-09-24 16:38:07]: Step 13 of 20 Creating fstab

[IC-F7811 2012-09-24 16:38:07]: Step 14 of 20 Creating new system files

[IC-F7811 2012-09-24 16:38:07]: Step 15 of 20 Adding jbundle package

[IC-F7712 2012-09-24 16:40:35]: Step 16 of 20 Backing up system data

[IC-F7811 2012-09-24 16:40:36]: Step 16 of 20 Backing up system data

[IC-F7712 2012-09-24 16:40:37]: Step 17 of 20 Setting up shared partition data

[IC-F7811 2012-09-24 16:40:37]: Step 17 of 20 Setting up shared partition data

[IC-F7712 2012-09-24 16:40:37]: Step 18 of 20 Checking package sanity in installation

[IC-F7712 2012-09-24 16:40:37]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[IC-F7811 2012-09-24 16:40:37]: Step 18 of 20 Checking package sanity in installation

[IC-F7811 2012-09-24 16:40:37]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[IC-F7712 2012-09-24 16:40:40]: Step 20 of 20 Setting da0s1 as new active partition

[IC-F7811 2012-09-24 16:40:41]: Step 20 of 20 Setting da0s1 as new active partition

[IC-F7712 2012-09-24 16:40:50]: ——- re0: ——-

[IC-F7811 2012-09-24 16:40:52]: ——- re0: ——-

[IC-F7712 2012-09-24 16:41:36]: Step 1 of 20 Creating temporary file system

[IC-F7712 2012-09-24 16:41:36]: Step 2 of 20 Determining installation source

[IC-F7712 2012-09-24 16:41:37]: Step 3 of 20 Processing format options

[IC-F7712 2012-09-24 16:41:37]: Step 4 of 20 Determining installation slice

[IC-F7712 2012-09-24 16:41:38]: Step 5 of 20 Creating and labeling new slices

[IC-F7712 2012-09-24 16:41:38]: Step 6 of 20 Create and mount new file system

[IC-F7811 2012-09-24 16:41:39]: Step 1 of 20 Creating temporary file system

[IC-F7811 2012-09-24 16:41:39]: Step 2 of 20 Determining installation source

[IC-F7811 2012-09-24 16:41:40]: Step 3 of 20 Processing format options

[IC-F7811 2012-09-24 16:41:40]: Step 4 of 20 Determining installation slice

[IC-F7811 2012-09-24 16:41:41]: Step 5 of 20 Creating and labeling new slices

[IC-F7811 2012-09-24 16:41:42]: Step 6 of 20 Create and mount new file system

[IC-F7712 2012-09-24 16:41:49]: Step 7 of 20 Getting OS bundles

[IC-F7712 2012-09-24 16:41:50]: Step 8 of 20 Updating recovery media

[IC-F7811 2012-09-24 16:41:51]: Step 7 of 20 Getting OS bundles

[IC-F7811 2012-09-24 16:41:51]: Step 8 of 20 Updating recovery media

[IC-F7712 2012-09-24 16:42:15]: Step 9 of 20 Extracting incoming image

[IC-F7811 2012-09-24 16:42:19]: Step 9 of 20 Extracting incoming image

[IC-F7712 2012-09-24 16:44:01]: Step 10 of 20 Unpacking OS packages

[IC-F7712 2012-09-24 16:44:04]: Step 11 of 20 Mounting jbase package

[IC-F7811 2012-09-24 16:44:05]: Step 10 of 20 Unpacking OS packages

[IC-F7811 2012-09-24 16:44:07]: Step 11 of 20 Mounting jbase package

[IC-F7712 2012-09-24 16:44:36]: Step 12 of 20 Creating base OS symbolic links

[IC-F7811 2012-09-24 16:44:40]: Step 12 of 20 Creating base OS symbolic links

[IC-F7712 2012-09-24 16:46:01]: Step 13 of 20 Creating fstab

[IC-F7712 2012-09-24 16:46:01]: Step 14 of 20 Creating new system files

[IC-F7712 2012-09-24 16:46:01]: Step 15 of 20 Adding jbundle package

[IC-F7811 2012-09-24 16:46:06]: Step 13 of 20 Creating fstab

[IC-F7811 2012-09-24 16:46:06]: Step 14 of 20 Creating new system files

[IC-F7811 2012-09-24 16:46:06]: Step 15 of 20 Adding jbundle package

[IC-F7712 2012-09-24 16:49:41]: Step 16 of 20 Backing up system data

[IC-F7811 2012-09-24 16:49:45]: Step 16 of 20 Backing up system data

[IC-F7811 2012-09-24 16:49:47]: Step 17 of 20 Setting up shared partition data

[IC-F7811 2012-09-24 16:49:48]: Step 18 of 20 Checking package sanity in installation

[IC-F7811 2012-09-24 16:49:48]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[IC-F7811 2012-09-24 16:49:51]: Step 20 of 20 Setting da0s1 as new active partition

[IC-F7712 2012-09-24 16:51:13]: Step 17 of 20 Setting up shared partition data

[IC-F7712 2012-09-24 16:51:14]: Step 18 of 20 Checking package sanity in installation

[IC-F7712 2012-09-24 16:51:14]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[IC-F7712 2012-09-24 16:51:17]: Step 20 of 20 Setting da0s1 as new active partition

[Status   2012-09-24 16:51:32]: Load completed with 0 errors

[Status   2012-09-24 16:51:32]: Reboot is required to complete upgrade

[Status   2012-09-24 16:51:32]: Rebooting FC-1

[FC-1     2012-09-24 16:51:33]: Waiting for FC-1 to terminate

[FC-1     2012-09-24 16:52:18]: Waiting for FC-1 to come back

[FC-1     2012-09-24 16:55:10]: Running Uptime Test for FC-1

[FC-1     2012-09-24 16:55:26]: Uptime Test for FC-1 Passed

[Status   2012-09-24 16:55:27]: FC-1 Booted successfully

[Status   2012-09-24 16:55:27]: Rebooting FC-0

[FC-0     2012-09-24 16:55:27]: Waiting for FC-0 to terminate

[FC-0     2012-09-24 16:56:12]: Waiting for FC-0 to come back

[FC-0     2012-09-24 16:59:06]: Running Uptime Test for FC-0

[FC-0     2012-09-24 16:59:22]: Uptime Test for FC-0 Passed

[Status   2012-09-24 16:59:22]: FC-0 Booted successfully

[Status   2012-09-24 16:59:22]: Rebooting IC-F7811

[IC-F7811 2012-09-24 16:59:28]: Waiting for IC-F7811 to terminate

[IC-F7811 2012-09-24 16:59:59]: Waiting for IC-F7811 to come back

[IC-F7811 2012-09-24 17:06:45]: Running Uptime Test for IC-F7811

[IC-F7811 2012-09-24 17:07:34]: Waiting for FM to be ready

[IC-F7811 2012-09-24 17:13:09]: Performing post-boot Health-Check

[IC-F7811 2012-09-24 17:14:24]: Waiting for routes to sync

[IC-F7811 2012-09-24 17:14:32]: Uptime Test for IC-F7811 Passed

[Status   2012-09-24 17:14:32]: IC-F7811 Booted successfully

[Status   2012-09-24 17:14:32]: Rebooting IC-F7712

[IC-F7712 2012-09-24 17:14:34]: Waiting for IC-F7712 to terminate

[IC-F7712 2012-09-24 17:15:07]: Waiting for IC-F7712 to come back

[IC-F7712 2012-09-24 17:22:03]: Running Uptime Test for IC-F7712

[IC-F7712 2012-09-24 17:22:47]: Waiting for FM to be ready

[IC-F7712 2012-09-24 17:29:28]: Performing post-boot Health-Check

[IC-F7712 2012-09-24 17:30:43]: Waiting for routes to sync

[IC-F7712 2012-09-24 17:30:49]: Uptime Test for IC-F7712 Passed

[Status   2012-09-24 17:30:50]: IC-F7712 Booted successfully

Success
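
The reboots are where each component is actually out of service, so it is worth knowing how long each one took. Here is a rough Python sketch that pairs each `Rebooting X` status line with the matching `X Booted successfully` line. The function name is hypothetical and it assumes the exact message wording seen in the output above.

```python
import re
from datetime import datetime

STAMPED_LINE = re.compile(
    r"^\[(.+?)\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]: (.*)$"
)
REBOOT = "Rebooting "
BOOTED = " Booted successfully"


def reboot_seconds(text):
    """Map component name -> seconds between 'Rebooting X' and 'X Booted successfully'."""
    started = {}
    seconds = {}
    for line in text.splitlines():
        match = STAMPED_LINE.match(line.strip())
        if not match or match.group(1) != "Status":
            continue  # only the [Status ...] lines carry both markers
        stamp = datetime.strptime(match.group(2), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        if message.startswith(REBOOT):
            started[message[len(REBOOT):]] = stamp
        elif message.endswith(BOOTED):
            name = message[: -len(BOOTED)]
            if name in started:
                seconds[name] = int((stamp - started[name]).total_seconds())
    return seconds


sample = """\
[Status   2012-09-24 16:51:32]: Rebooting FC-1
[FC-1     2012-09-24 16:51:33]: Waiting for FC-1 to terminate
[Status   2012-09-24 16:55:27]: FC-1 Booted successfully
[Status   2012-09-24 16:59:22]: Rebooting IC-F7811
[Status   2012-09-24 17:14:32]: IC-F7811 Booted successfully
"""
for name, secs in reboot_seconds(sample).items():
    print(f"{name}: {secs // 60}m {secs % 60}s")
```

In this capture the fabric control VMs (FC-0, FC-1) came back in about four minutes each, while each interconnect took roughly fifteen minutes including the health check and route sync.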

Node Group Upgrades

The NW-NG took around an hour (for 4 nodes) and each RSNG around 40 minutes. This process upgrades one node at a time within the group and updates both slices. There is currently no command to verify each slice’s version; this is a known issue.

Node Groups tested were 1 Network node group and 2 RSNGs:

  • NW-NG-0
  • RSNG01
  • RSNG02
  1. From the DG CLI initiate the upgrade:
  • request system software nonstop-upgrade node-group FILE.rpm node-group GROUP-NAME

Output:

root@FSASYDBRDQFAB01> …0-D20.4.rpm node-group NW-NG-0

Upgrading target(s): NW-NG-0

[NW-NG-0  2012-09-24 17:33:25]: Starting with package ftp://169.254.0.3/pub/images/12.2X50-D20.4/jinstall-qfx.tgz

[NW-NG-0  2012-09-24 17:33:25]: Retrieving package

[NW-NG-0  2012-09-24 17:34:47]: Pushing bundle to P6172-C

[NW-NG-0  2012-09-24 17:35:20]: Pushing bundle to P6136-C

[NW-NG-0  2012-09-24 17:35:53]: Pushing bundle to fpc4

[NW-NG-0  2012-09-24 17:36:27]: Pushing bundle to fpc5

[NW-NG-0  2012-09-24 17:36:59]: P6172-C: Validate package…

[NW-NG-0  2012-09-24 17:43:31]: P6136-C: Validate package…

[NW-NG-0  2012-09-24 17:43:31]: fpc4: Validate package…

[NW-NG-0  2012-09-24 17:43:41]: fpc5: Validate package…

[NW-NG-0  2012-09-24 17:43:41]: ——- P6172-C ——-

[NW-NG-0  2012-09-24 17:44:17]: Step 1 of 20 Creating temporary file system

[NW-NG-0  2012-09-24 17:44:17]: Step 2 of 20 Determining installation source

[NW-NG-0  2012-09-24 17:44:18]: Step 3 of 20 Processing format options

[NW-NG-0  2012-09-24 17:44:18]: Step 4 of 20 Determining installation slice

[NW-NG-0  2012-09-24 17:44:18]: Step 5 of 20 Creating and labeling new slices

[NW-NG-0  2012-09-24 17:44:19]: Step 6 of 20 Create and mount new file system

[NW-NG-0  2012-09-24 17:44:27]: Step 7 of 20 Getting OS bundles

[NW-NG-0  2012-09-24 17:44:27]: Step 8 of 20 Updating recovery media

[NW-NG-0  2012-09-24 17:44:48]: Step 9 of 20 Extracting incoming image

[NW-NG-0  2012-09-24 17:46:02]: Step 10 of 20 Unpacking OS packages

[NW-NG-0  2012-09-24 17:46:07]: Step 11 of 20 Mounting jbase package

[NW-NG-0  2012-09-24 17:46:33]: Step 12 of 20 Creating base OS symbolic links

[NW-NG-0  2012-09-24 17:47:33]: Step 13 of 20 Creating fstab

[NW-NG-0  2012-09-24 17:47:33]: Step 14 of 20 Creating new system files

[NW-NG-0  2012-09-24 17:47:34]: Step 15 of 20 Adding jbundle package

[NW-NG-0  2012-09-24 17:50:07]: Step 16 of 20 Backing up system data

[NW-NG-0  2012-09-24 17:50:08]: Step 17 of 20 Setting up shared partition data

[NW-NG-0  2012-09-24 17:50:09]: Step 18 of 20 Checking package sanity in installation

[NW-NG-0  2012-09-24 17:50:09]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[NW-NG-0  2012-09-24 17:50:12]: Step 20 of 20 Setting da0s2 as new active partition

[NW-NG-0  2012-09-24 17:50:23]: ——- P6136-C ——-

[NW-NG-0  2012-09-24 17:50:23]: Step 1 of 20 Creating temporary file system

[NW-NG-0  2012-09-24 17:50:23]: Step 2 of 20 Determining installation source

[NW-NG-0  2012-09-24 17:50:23]: Step 3 of 20 Processing format options

[NW-NG-0  2012-09-24 17:50:23]: Step 4 of 20 Determining installation slice

[NW-NG-0  2012-09-24 17:50:23]: Step 5 of 20 Creating and labeling new slices

[NW-NG-0  2012-09-24 17:50:23]: Step 6 of 20 Create and mount new file system

[NW-NG-0  2012-09-24 17:50:23]: Step 7 of 20 Getting OS bundles

[NW-NG-0  2012-09-24 17:50:23]: Step 8 of 20 Updating recovery media

[NW-NG-0  2012-09-24 17:50:23]: Step 9 of 20 Extracting incoming image

[NW-NG-0  2012-09-24 17:50:23]: Step 10 of 20 Unpacking OS packages

[NW-NG-0  2012-09-24 17:50:23]: Step 11 of 20 Mounting jbase package

[NW-NG-0  2012-09-24 17:50:23]: Step 12 of 20 Creating base OS symbolic links

[NW-NG-0  2012-09-24 17:50:23]: Step 13 of 20 Creating fstab

[NW-NG-0  2012-09-24 17:50:23]: Step 14 of 20 Creating new system files

[NW-NG-0  2012-09-24 17:50:23]: Step 15 of 20 Adding jbundle package

[NW-NG-0  2012-09-24 17:50:23]: Step 16 of 20 Backing up system data

[NW-NG-0  2012-09-24 17:50:23]: Step 17 of 20 Setting up shared partition data

[NW-NG-0  2012-09-24 17:50:23]: Step 18 of 20 Checking package sanity in installation

[NW-NG-0  2012-09-24 17:50:23]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[NW-NG-0  2012-09-24 17:50:23]: Step 20 of 20 Setting da0s2 as new active partition

[NW-NG-0  2012-09-24 17:50:27]: Step 1 of 20 Creating temporary file system

[NW-NG-0  2012-09-24 17:50:27]: Step 2 of 20 Determining installation source

[NW-NG-0  2012-09-24 17:50:27]: Step 3 of 20 Processing format options

[NW-NG-0  2012-09-24 17:50:27]: Step 4 of 20 Determining installation slice

[NW-NG-0  2012-09-24 17:50:27]: Step 5 of 20 Creating and labeling new slices

[NW-NG-0  2012-09-24 17:50:27]: Step 6 of 20 Create and mount new file system

[NW-NG-0  2012-09-24 17:50:27]: Step 7 of 20 Getting OS bundles

[NW-NG-0  2012-09-24 17:50:27]: Step 8 of 20 Updating recovery media

[NW-NG-0  2012-09-24 17:50:27]: Step 9 of 20 Extracting incoming image

[NW-NG-0  2012-09-24 17:50:27]: Step 10 of 20 Unpacking OS packages

[NW-NG-0  2012-09-24 17:50:27]: Step 11 of 20 Mounting jbase package

[NW-NG-0  2012-09-24 17:50:27]: Step 12 of 20 Creating base OS symbolic links

[NW-NG-0  2012-09-24 17:50:27]: Step 13 of 20 Creating fstab

[NW-NG-0  2012-09-24 17:50:27]: Step 14 of 20 Creating new system files

[NW-NG-0  2012-09-24 17:50:27]: Step 15 of 20 Adding jbundle package

[NW-NG-0  2012-09-24 17:50:27]: Step 16 of 20 Backing up system data

[NW-NG-0  2012-09-24 17:50:27]: Step 17 of 20 Setting up shared partition data

[NW-NG-0  2012-09-24 17:50:27]: Step 18 of 20 Checking package sanity in installation

[NW-NG-0  2012-09-24 17:50:27]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[NW-NG-0  2012-09-24 17:50:27]: Step 20 of 20 Setting da0s2 as new active partition

[NW-NG-0  2012-09-24 17:50:27]: Step 1 of 20 Creating temporary file system

[NW-NG-0  2012-09-24 17:50:27]: Step 2 of 20 Determining installation source

[NW-NG-0  2012-09-24 17:50:27]: Step 3 of 20 Processing format options

[NW-NG-0  2012-09-24 17:50:27]: Step 4 of 20 Determining installation slice

[NW-NG-0  2012-09-24 17:50:27]: Step 5 of 20 Creating and labeling new slices

[NW-NG-0  2012-09-24 17:50:27]: Step 6 of 20 Create and mount new file system

[NW-NG-0  2012-09-24 17:50:27]: Step 7 of 20 Getting OS bundles

[NW-NG-0  2012-09-24 17:50:27]: Step 8 of 20 Updating recovery media

[NW-NG-0  2012-09-24 17:50:27]: Step 9 of 20 Extracting incoming image

[NW-NG-0  2012-09-24 17:50:27]: Step 10 of 20 Unpacking OS packages

[NW-NG-0  2012-09-24 17:50:27]: Step 11 of 20 Mounting jbase package

[NW-NG-0  2012-09-24 17:50:27]: Step 12 of 20 Creating base OS symbolic links

[NW-NG-0  2012-09-24 17:50:27]: Step 13 of 20 Creating fstab

[NW-NG-0  2012-09-24 17:50:27]: Step 14 of 20 Creating new system files

[NW-NG-0  2012-09-24 17:50:27]: Step 15 of 20 Adding jbundle package

[NW-NG-0  2012-09-24 17:50:27]: Step 16 of 20 Backing up system data

[NW-NG-0  2012-09-24 17:50:27]: Step 17 of 20 Setting up shared partition data

[NW-NG-0  2012-09-24 17:50:27]: Step 18 of 20 Checking package sanity in installation

[NW-NG-0  2012-09-24 17:50:27]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[NW-NG-0  2012-09-24 17:50:27]: Step 20 of 20 Setting da0s2 as new active partition

[NW-NG-0  2012-09-24 17:50:27]: Starting with package ftp://169.254.0.3/pub/images/12.2X50-D20.4/jinstall-dc-re.tgz

[NW-NG-0  2012-09-24 17:50:27]: Retrieving package

[NW-NG-0  2012-09-24 17:51:35]: Pushing bundle to re0

[NW-NG-0  2012-09-24 17:52:09]: re0: Validate package…

[NW-NG-0  2012-09-24 17:53:56]: re1: Validate package…

[NW-NG-0  2012-09-24 17:55:53]: Rebooting Backup RE

[NW-NG-0  2012-09-24 17:59:56]: Initiating Chassis In-Service-Upgrade

[NW-NG-0  2012-09-24 18:00:16]: Upgrading group: 2 fpc: 2

[NW-NG-0  2012-09-24 18:10:08]: Upgrade complete for group:2

[NW-NG-0  2012-09-24 18:10:08]: Upgrading group: 3 fpc: 3

[NW-NG-0  2012-09-24 18:19:58]: Upgrade complete for group:3

[NW-NG-0  2012-09-24 18:19:58]: Upgrading group: 4 fpc: 4

[NW-NG-0  2012-09-24 18:29:45]: Upgrade complete for group:4

[NW-NG-0  2012-09-24 18:29:45]: Upgrading group: 5 fpc: 5

[NW-NG-0  2012-09-24 18:39:32]: Upgrade complete for group:5

[NW-NG-0  2012-09-24 18:39:32]: Finished processing all upgrade groups, last group :5

[NW-NG-0  2012-09-24 18:39:37]: Preparing for Switchover

[NW-NG-0  2012-09-24 18:39:54]: Switchover Completed

[Status   2012-09-24 18:39:54]: Upgrade completed with 0 errors

Success

root@FSASYDBRDQFAB01> …0-D20.4.rpm node-group RSNG01

Upgrading target(s): RSNG01

[RSNG01   2012-09-25 11:44:47]: Starting with package ftp://169.254.0.3/pub/images/12.2X50-D20.4/jinstall-qfx.tgz

[RSNG01   2012-09-25 11:44:47]: Retrieving package

[RSNG01   2012-09-25 11:46:55]: Pushing bundle to P6167-C

[RSNG01   2012-09-25 11:47:27]: P6167-C: Validate package…

[RSNG01   2012-09-25 11:53:38]: P6185-C: Validate package…

[RSNG01   2012-09-25 11:54:16]: ——- P6167-C ——-

[RSNG01   2012-09-25 11:54:53]: Step 1 of 20 Creating temporary file system

[RSNG01   2012-09-25 11:54:53]: Step 2 of 20 Determining installation source

[RSNG01   2012-09-25 11:54:54]: Step 3 of 20 Processing format options

[RSNG01   2012-09-25 11:54:54]: Step 4 of 20 Determining installation slice

[RSNG01   2012-09-25 11:54:55]: Step 5 of 20 Creating and labeling new slices

[RSNG01   2012-09-25 11:54:55]: Step 6 of 20 Create and mount new file system

[RSNG01   2012-09-25 11:55:03]: Step 7 of 20 Getting OS bundles

[RSNG01   2012-09-25 11:55:03]: Step 8 of 20 Updating recovery media

[RSNG01   2012-09-25 11:55:25]: Step 9 of 20 Extracting incoming image

[RSNG01   2012-09-25 11:56:40]: Step 10 of 20 Unpacking OS packages

[RSNG01   2012-09-25 11:56:45]: Step 11 of 20 Mounting jbase package

[RSNG01   2012-09-25 11:57:09]: Step 12 of 20 Creating base OS symbolic links

[RSNG01   2012-09-25 11:58:10]: Step 13 of 20 Creating fstab

[RSNG01   2012-09-25 11:58:11]: Step 14 of 20 Creating new system files

[RSNG01   2012-09-25 11:58:11]: Step 15 of 20 Adding jbundle package

[RSNG01   2012-09-25 12:00:48]: Step 16 of 20 Backing up system data

[RSNG01   2012-09-25 12:00:50]: Step 17 of 20 Setting up shared partition data

[RSNG01   2012-09-25 12:00:50]: Step 18 of 20 Checking package sanity in installation

[RSNG01   2012-09-25 12:00:50]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[RSNG01   2012-09-25 12:00:54]: Step 20 of 20 Setting da0s2 as new active partition

[RSNG01   2012-09-25 12:01:05]: ——- P6185-C – master ——-

[RSNG01   2012-09-25 12:01:05]: Step 1 of 20 Creating temporary file system

[RSNG01   2012-09-25 12:01:05]: Step 2 of 20 Determining installation source

[RSNG01   2012-09-25 12:01:05]: Step 3 of 20 Processing format options

[RSNG01   2012-09-25 12:01:05]: Step 4 of 20 Determining installation slice

[RSNG01   2012-09-25 12:01:05]: Step 5 of 20 Creating and labeling new slices

[RSNG01   2012-09-25 12:01:05]: Step 6 of 20 Create and mount new file system

[RSNG01   2012-09-25 12:01:05]: Step 7 of 20 Getting OS bundles

[RSNG01   2012-09-25 12:01:05]: Step 8 of 20 Updating recovery media

[RSNG01   2012-09-25 12:01:05]: Step 9 of 20 Extracting incoming image

[RSNG01   2012-09-25 12:01:05]: Step 10 of 20 Unpacking OS packages

[RSNG01   2012-09-25 12:01:05]: Step 11 of 20 Mounting jbase package

[RSNG01   2012-09-25 12:01:05]: Step 12 of 20 Creating base OS symbolic links

[RSNG01   2012-09-25 12:01:05]: Step 13 of 20 Creating fstab

[RSNG01   2012-09-25 12:01:05]: Step 14 of 20 Creating new system files

[RSNG01   2012-09-25 12:01:05]: Step 15 of 20 Adding jbundle package

[RSNG01   2012-09-25 12:01:05]: Step 16 of 20 Backing up system data

[RSNG01   2012-09-25 12:01:05]: Step 17 of 20 Setting up shared partition data

[RSNG01   2012-09-25 12:01:05]: Step 18 of 20 Checking package sanity in installation

[RSNG01   2012-09-25 12:01:05]: Step 19 of 20 Unmounting and cleaning up temporary file systems

[RSNG01   2012-09-25 12:01:05]: Step 20 of 20 Setting da0s2 as new active partition

[RSNG01   2012-09-25 12:01:51]: Rebooting Backup RE

[RSNG01   2012-09-25 12:01:51]: ——- Rebooting P6167-C ——-

[RSNG01   2012-09-25 12:08:49]: Initiating Chassis In-Service-Upgrade

[RSNG01   2012-09-25 12:09:09]: Upgrading group: 0 fpc: 0

[RSNG01   2012-09-25 12:11:15]: Upgrade complete for group:0

[RSNG01   2012-09-25 12:11:15]: Upgrading group: 1 fpc: 1

[RSNG01   2012-09-25 12:13:20]: Upgrade complete for group:1

[RSNG01   2012-09-25 12:13:20]: Finished processing all upgrade groups, last group :1

[RSNG01   2012-09-25 12:13:24]: Preparing for Switchover

[RSNG01   2012-09-25 12:14:15]: Switchover Completed

[Status   2012-09-25 12:14:15]: Upgrade completed with 0 errors

Success
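
For the node groups, the two things I would check in a saved log are how long each FPC upgrade group took and whether the run finished with 0 errors. Here is a small Python sketch along those lines. The function name is my own, and it assumes the message wording shown in the output above (`Upgrading group: N fpc: N`, `Upgrade complete for group:N`, `... completed with N errors`).

```python
import re
from datetime import datetime

STAMPED_LINE = re.compile(
    r"^\[(.+?)\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]: (.*)$"
)
GROUP_START = re.compile(r"^Upgrading group: (\d+) fpc: (\d+)$")
GROUP_DONE = re.compile(r"^Upgrade complete for group:(\d+)$")
ERROR_COUNT = re.compile(r"completed with (\d+) errors$")


def summarize_node_group(text):
    """Return ({group: seconds}, error_count) for one node-group upgrade log.

    error_count is None if no '... completed with N errors' line is present.
    """
    starts, seconds, errors = {}, {}, None
    for line in text.splitlines():
        match = STAMPED_LINE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(2), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        started = GROUP_START.match(message)
        finished = GROUP_DONE.match(message)
        counted = ERROR_COUNT.search(message)
        if started:
            starts[int(started.group(1))] = stamp
        elif finished:
            group = int(finished.group(1))
            if group in starts:
                seconds[group] = int((stamp - starts[group]).total_seconds())
        elif counted:
            errors = int(counted.group(1))
    return seconds, errors


sample = """\
[RSNG01   2012-09-25 12:09:09]: Upgrading group: 0 fpc: 0
[RSNG01   2012-09-25 12:11:15]: Upgrade complete for group:0
[RSNG01   2012-09-25 12:11:15]: Upgrading group: 1 fpc: 1
[RSNG01   2012-09-25 12:13:20]: Upgrade complete for group:1
[Status   2012-09-25 12:14:15]: Upgrade completed with 0 errors
"""
print(summarize_node_group(sample))
```

On the captures above, each RSNG01 member took about two minutes in the in-service upgrade phase, while each NW-NG-0 member took close to ten.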

Conclusion

The NSSU QFabric upgrade is a very simple and well-polished process. Apart from being very time consuming, it’s great, and I really like how it has been designed and implemented. It is quite verbose and keeps the operator well informed, which I like, as I love knowing what is actually going on. I also like (some may argue this is bad) the automatic upgrade of each SCB on the Interconnects and each slice on the nodes. It saves an extra step post upgrade, but it does make rollback harder.

Well done Juniper, this is another great part of the QFabric Solution!

P.S. Just give me an SSH client and automatic system archival.

QFabric Part 2 – Let’s get Down and Dirty Deploying and Configuring …

Juniper sells QFabric as a bundle. Because of this, the install has been templated and will be similar from site to site when it comes to the control plane and getting the fabric up and ready to be configured for each target environment. Every QFabric bundle today must include Juniper Professional Services. Hopefully in the future I (and other partner engineers) will be seen as smart enough to do a QFabric install without Juniper’s assistance. I think I could manage it :). Here is the procedure that you and your friendly Juniper Professional Services engineer will complete to get QFabric up and running.
Building the EX4200 VCs for Control Plane
Juniper have extensive configuration documentation with example configurations for the Control Plane for the QFabric components. Please read Juniper’s article “Configuring the Virtual Chassis for the QFabric Switch Control Plane” for instructions on building the control plane infrastructure. I will not go into specifics in this blog post for the control plane.
Initial QFabric Components Deployment
This is my recollection and notes taken during the demonstration and explanation from Juniper’s current most experienced QFabric installer in APAC of the basic process of getting your QFabric up and running.
  1. Check BOM against what’s been received and power all equipment and test for hardware issues
    1. Also ensure directors and interconnects are the same version (should not be a problem yet but as ‘newer’ builds come old stock might pop up)
  2. Build and power ex4200 VCs for control plane
    1. I would recommended to upgrade to the JTAC recommended version of Junos on your 4200s
  3. Patch Up directors into control plane VCs and boot the desired ‘master director’
  4. Complete the console initialisation and then after ~60 seconds boot the slave and complete it’s initial configuration
  5. Patch directors into correct control ports and boot
  6. Turn each node into ‘fabric mode’
  7. Patch into each interconnect and boot each node
    1. The directors will adjust the version of Junos if required on the QFX3500 node
  8. You now have a functional QFabric and can begin to alias nodes and add them to network/server groups

Configuration (all centrally from the Director)

To build a new fabric you need to:

Create aliases for nodes

  • set fabric aliases node-device SERIAL ALIAS_NAME
Create node groups
Always have 1 network node group (the group flagged network-domain) and 1 server group. A server group holds a maximum of 2 nodes, and with 2 nodes it becomes a redundant server group.
  • set fabric resources node-group NW-NG-0 network-domain
  • set fabric resources node-group NW-NG-0 node-device ALIAS_NAME_X
  • set fabric resources node-group PRON-NG node-device PRON_SW1
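If you have more than a handful of nodes, it can be handy to generate these lines from a small inventory rather than typing them by hand. Here is a minimal sketch in Python; the function name `fabric_commands`, the data layout and the serial numbers are my own illustration (not anything Juniper ships), and it only emits the three statement forms shown above.

```python
def fabric_commands(aliases, network_group, server_groups):
    """Build the 'set fabric ...' lines shown above from plain Python data.

    aliases:        dict of serial number -> alias (hypothetical values)
    network_group:  list of aliases to place in NW-NG-0
    server_groups:  dict of group name -> list of aliases
    """
    known = set(aliases.values())

    def member_lines(group, members):
        # Refuse to reference an alias that was never defined.
        for alias in members:
            if alias not in known:
                raise ValueError(f"{alias} has no alias definition")
            yield f"set fabric resources node-group {group} node-device {alias}"

    cmds = [f"set fabric aliases node-device {serial} {alias}"
            for serial, alias in aliases.items()]
    cmds.append("set fabric resources node-group NW-NG-0 network-domain")
    cmds.extend(member_lines("NW-NG-0", network_group))
    for group, members in server_groups.items():
        cmds.extend(member_lines(group, members))
    return cmds


if __name__ == "__main__":
    # Serial numbers below are made up for the example.
    lines = fabric_commands(
        aliases={"P6969-C": "NODE-0", "P7000-C": "PRON_SW1"},
        network_group=["NODE-0"],
        server_groups={"PRON-NG": ["PRON_SW1"]},
    )
    print("\n".join(lines))
```

The output is plain text, so you can review it before pasting it into a configuration session on the director.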
Further configuration is ‘like’ a normal EX style configuration, but using the new interface names, for example:
Interface: NODE_ALIAS:xe-0/0/1.0
Aggregated Interface: NODE_GROUP:ae0.0
Handy Debug Commands
  • show fabric administration inventory director-group status all
    • See the directors’ status and which one is master
  • show fabric administration inventory [terse]
    • Shows all the hardware the directors have found and are including in the QFabric
  • show chassis fabric connectivity
    • Shows the connectivity through the interconnects to each node
  • show fabric aliases
    • See the serial to alias mappings
  • show fabric inventory
The commands for checking VLANs, the ethernet-switching table, etc. are all identical to those on the Juniper EX switch family.
Power On Sequence
  1. ex4200 Control Plane VCs
  2. QFabric Interconnects
  3. Director Master
    1. Election of the master is based on uptime. Wait ~60 seconds before booting the secondary director node
  4. Nodes
    1. I have not tested this, but I would power on the network node group first, starting with the members I would prefer to be the masters of the ‘VC’ (remember that each group with multiple members is an incarnation of a VC, so the same rules apply)
Extra Functions

Node Replacement

Replacing a node while keeping its configuration is EXTREMELY easy thanks to the ‘replace pattern’ feature of Junos.

  • Repatch cables
  • replace pattern OLD_SERIAL with NEW_SERIAL
  • commit
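To see why this is so painless, here is a rough Python sketch of the effect of the substitution on a few set-style lines. It assumes the serial number only appears in the alias statement, as in the configuration earlier in this post, so everything else keeps referencing the alias. The helper `replace_pattern` and the serials are hypothetical, and it does a plain literal text swap; the real command runs inside the Junos CLI against the candidate configuration.

```python
def replace_pattern(config_lines, old, new):
    """Return a copy of config_lines with every literal 'old' swapped for 'new'."""
    return [line.replace(old, new) for line in config_lines]


if __name__ == "__main__":
    # Made-up serial numbers; only the alias line mentions the serial.
    before = [
        "set fabric aliases node-device P6969-C NODE-0",
        "set fabric resources node-group NW-NG-0 node-device NODE-0",
    ]
    for line in replace_pattern(before, "P6969-C", "P7777-C"):
        print(line)
```

Only the alias line changes. The node group membership and any interface configuration hang off the alias, so they follow the new hardware without being touched.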

QFabric Part 1 – Explained and Explored First Hand

I was lucky enough to be one of the first APAC partner engineers to get my hands on Juniper’s new QFabric gigantic scalable switch technology. I have even beaten some of Juniper’s own SEs to it. In general, it rocks, but it still lacks some features and fine tuning; these will come. This post is an introduction to QFabric, with my likes, dislikes and feature wish-list.

I would like to thank Juniper APAC and Yeu Kuang Bin for this excellent opportunity and very knowledgeable training.

Cooper with a working QFabric

What is QFabric?

The simplest explanation of QFabric I can give is that it is basically a Juniper EX Virtual Chassis on steroids. The internal workings of the switch have been broken apart to be MUCH MORE scalable, and Juniper have ensured that there are no single points of failure by only selling the design with fully redundant components.

The QFabric components are:

  • Director Group – 2 x QFX3100 (Control Plane)

  • Interconnects – 2 x QFX3008-I (Backplane / Fabric)
    • 2 REs per Interconnect

  • Nodes (Data Plane)
    • Server Groups – 1 to 2 nodes per group

40GE DAC cable (1m, 3m, 5m lengths)
40GbE QSFP+ (quad small form-factor pluggable plus) – 40 gig uses an MTP connector

QFabric Node Discovery

Control Plane

The control plane is discovered automatically. This depends on the control plane being built with the pre-defined Juniper configuration, so that the nodes are discovered via a pre-defined method when you turn a QFX3500 into fabric mode.

Data/Fabric Plane

The fabric plane is what makes QFabric as scalable as it is. Once again a predefined HA design is supplied and the directors perform the following tasks:

  1. Discovers, builds and maintains the topology of the fabric
  2. Assembles the entire topology
  3. Propagates path information to all entities
NOTE: Interconnects DO NOT interconnect to each other
Node Aliasing
Node aliasing allows administrators to give nodes a meaningful name, which is then used when referring to specific interfaces on specific nodes or node groups.
  • Identify the nodes via beaconing (the LCD screen) or the serial number on the chassis.
  • e.g. set fabric aliases node-device P6969-C NODE-0
    • This name is used to reference ports and assign the node to a group (discussed next)
Logical Node Groups
Node groups allow the infrastructure to be divided up and let the director know what type of configuration to push to a node’s routing engine. The local routing engine still performs some tasks, predominantly to allow scale. A server group can contain a maximum of 2 nodes. A group with 2 nodes is known as a redundant server group (it is a 2 node virtual chassis under the covers). Because of this, a redundant server group can have multi-chassis ae (aggregated ethernet) interfaces. There is one other type of group, known as the network node group. This group looks after all routing and L2 loop prevention, such as OSPF and spanning tree. All VLAN routing etc. is done by these nodes.
Group Summary
  1. Network Node Group (1 per QFabric – Max 8 nodes)
  2. Server Group (Redundant Server Group optional – 2 nodes)
    1. QFabric automatically creates a redundant server group if two nodes exist in a server group (via a form of virtual chassis).
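The sizing rules in this summary are easy to get wrong on paper when planning a large build, so here is a small Python sketch that checks a planned group against them. The function `classify_group` and its return strings are my own naming for illustration; the limits (8 nodes for the network node group, 2 for a server group) are the ones listed above.

```python
MAX_NETWORK_NODES = 8  # network node group limit from the summary above
MAX_SERVER_NODES = 2   # server group limit from the summary above


def classify_group(members, network_domain=False):
    """Return the kind of node group a list of node aliases would form.

    Raises ValueError if the group is empty or exceeds the limits above.
    """
    count = len(members)
    if count == 0:
        raise ValueError("a node group needs at least one node")
    if network_domain:
        if count > MAX_NETWORK_NODES:
            raise ValueError("network node group is limited to 8 nodes")
        return "network node group"
    if count > MAX_SERVER_NODES:
        raise ValueError("server group is limited to 2 nodes")
    # Two nodes in a server group become a redundant server group.
    return "redundant server group" if count == 2 else "server group"


if __name__ == "__main__":
    # Alias names are made up for the example.
    print(classify_group(["NODE-0", "NODE-1"], network_domain=True))
    print(classify_group(["RACK-42-A", "RACK-42-B"]))
    print(classify_group(["RACK-43-A"]))
```

It does not check the "1 network node group per QFabric" rule, since that needs a view of the whole plan rather than a single group.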
Port Referencing
Because each node now has an ‘alias’ (discussed above), you reference a port in configuration using:
  • NODE_ALIAS:INT_TYPE-x/x/x.x
  • e.g. NODE-0:xe-0/0/1.0

Aggregated interfaces can be deployed across chassis in a redundant server group or on one chassis in a server group:

  • GROUP_NAME:ae0.0
  • e.g. RACK-42-1:ae0.0
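If you script against QFabric configuration or command output, you will need to split these names apart. Here is a minimal Python sketch that handles the two forms shown above. The function name `parse_port`, the returned keys and the regular expressions are assumptions of mine based purely on the examples in this post, not an official grammar for QFabric interface names.

```python
import re

# Patterns inferred from the examples above (e.g. xe-0/0/1.0 and ae0.0).
PHYSICAL = re.compile(r"^([a-z]+-\d+/\d+/\d+)\.(\d+)$")
AGGREGATED = re.compile(r"^(ae\d+)\.(\d+)$")


def parse_port(reference):
    """Split 'OWNER:interface.unit' into its parts.

    OWNER is a node alias for physical ports or a node group name for
    aggregated (ae) interfaces.
    """
    owner, sep, ifname = reference.partition(":")
    if not sep or not owner:
        raise ValueError(f"missing node alias or group name in {reference!r}")
    for kind, pattern in (("aggregated", AGGREGATED), ("physical", PHYSICAL)):
        match = pattern.match(ifname)
        if match:
            return {
                "owner": owner,
                "kind": kind,
                "interface": match.group(1),
                "unit": int(match.group(2)),
            }
    raise ValueError(f"unrecognised interface name in {reference!r}")


if __name__ == "__main__":
    print(parse_port("NODE-0:xe-0/0/1.0"))
    print(parse_port("RACK-42-1:ae0.0"))
```

Splitting on the first colon matters because aliases and group names can contain hyphens and digits, as RACK-42-1 does.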
QFabric can also function with ports in FC and FCoE mode. There are some limitations to this feature today, but it can provide an excellent mechanism to create redundant paths back through the fabric to the FC based SAN network. This will be discussed in a dedicated post in my QFabric series.
Summary
For a data center, QFabric is ready today and works extremely well. It provides a HUGE number of 10Gb (and soon 40Gb) ports, allowing huge data movement around a DC at low latency. It is also effectively one single point of management for all your nodes, unless something goes wrong of course. For a campus with end users, QFabric lacks many key features that we use today with the MX or EX range. It could be used in large campuses as the aggregation or core layer (especially once more IPv4 and IPv6 routing is supported), feeding 10Gb out to EX switches that provide the ‘edge’. The coming ‘micro’ fabric is also interesting, as it will allow for a more compelling footprint within a smaller data center.
Key Likes
  • Single switch in regards to management and functionality
    • No TRILL or other L2 bridging redundancy protocols required
  • Ultra redundant design – Enforced by Juniper
    • No half way deployment, people can’t go in half assed !
  • The simple well thought out HA deployment/design – Common install = easier to debug for JTAC / Engineers like myself
  • Scalability – Can see how big DCs could benefit from having 1 gigantic switch
  • Road map looks good – Key features and hardware are coming
Key Dislikes
  • AFL (Advanced Feature License) required for IPv6 (when it arrives)
    • PLEASE Juniper – Can we have IPv6 for free or I will never get customers to deploy it
    • This really frustrates me … You may be able to tell 🙂
  • Limitation of 1 unit per interface
    • No vlan tagging and multiple units in Network Groups
    • Can work around by turning port into trunk and assigning multiple L3 interfaces
  • The need for legacy SAN infrastructure in order to use FC/FCoE (discussed in part 3)
  • No ability to have a full 48 copper SFP 1Gb interfaces in a node for legacy non 10gig equipment
    • The QFX3500 cannot physically fit the SFPs in both the top and bottom rows
    • This could be handy to keep legacy equipment and as it’s replaced change the SFP to a 10g SFP+
Wish List
  • The Micro Fabric – will allow more use cases
  • Full SNMP interface statistics for all nodes through the director
    • Currently testing this with Zenoss in the Juniper Lab – Has not worked so far
    • The ability to monitor the nodes’ REs, PSUs, etc. would also be a plus (I have not tested / read the MIBs yet, so it could be possible)
  • Be able to downgrade and run a system wide ‘request system rollback’ from the director
  • Full Q-in-Q Support
  • Fully self contained FC/FCoE support
To Come in this series:
Part 2 – Deploying and Configuring
Part 3 – FCoE and SAN with QFabric
Part 4 – QFabric errata (possibly – not sure yet …)

Please note: The information presented here is from my own point of view. It is in no way associated with the firm beliefs of Juniper Networks (TM) or ICT Networks (TM).