I was lucky enough to be one of the first APAC partner engineers to get my hands on Juniper's new QFabric, their gigantic scalable switch technology. I even beat some of Juniper's own SEs to it. In general, it rocks, though it still needs some features and fine tuning, which will come. This post is an introduction to QFabric, with my likes, dislikes and feature wish-list.

I would like to thank Juniper APAC and Yeu Kuang Bin for this excellent opportunity and very knowledgeable training.

[Photo: Cooper with a working QFabric]

What is QFabric?

The simplest explanation of QFabric I can give is that it is basically a Juniper EX Virtual Chassis on steroids. The internal workings of the switch have been broken apart to be MUCH more scalable, and Juniper has ensured that there are no single points of failure by only selling the design with fully redundant components.

The QFabric components are:

  • Director Group - 2 x QFX3100 (Control Plane)

  • Interconnects - 2 x QFX3008-I (Backplane / Fabric)
    • 2 REs per Interconnect

  • Nodes (Data Plane)
    • Server Groups - 1-2 nodes per group

  • 40GE DAC cables (1m, 3m, 5m lengths)
  • 40G QSFP+ (quad small form-factor pluggable plus) - the 40-gig optics use an MTP connector

QFabric Node Discovery

Control Plane

The control plane is discovered automatically; it relies on a pre-defined Juniper configuration to discover the nodes via a pre-defined method once you turn a QFX3500 into fabric mode.
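For reference, converting a standalone QFX3500 into a fabric node looks roughly like the following. This is a sketch from memory, so verify the exact syntax against the current QFabric documentation before relying on it:

  show chassis device-mode                  (confirm the switch is currently in standalone mode)
  request chassis device-mode node-device   (switch the QFX3500 into fabric/node-device mode)
  request system reboot                     (a reboot is required for the mode change to take effect)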

Data/Fabric Plane

The fabric plane is what makes QFabric as scalable as it is. Once again a predefined HA design is supplied, and the directors perform the following tasks:

  1. Discover, build and maintain the topology of the fabric
  2. Assemble the entire topology
  3. Propagate path information to all entities
NOTE: Interconnects DO NOT interconnect to each other
Node Aliasing
Node aliasing allows administrators to give nodes a meaningful name and is used when referring to specific interfaces on specific nodes or node groups.
  • Identify the nodes via beaconing (the LCD screen) or the serial number on the chassis.
  • e.g. set fabric aliases node-device P6969-C NODE-0
    • This name is used to reference ports and to assign the node to a group (discussed next)
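If you cannot physically get to the rack to beacon or read the serial number, the director group can also list the nodes it has discovered. From memory the operational command is along these lines (treat the exact syntax as an assumption and check the documentation):

  show fabric administration inventory node-devices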
Logical Node Groups
Node groups are used to divide up the infrastructure and to let the director know what type of configuration to push to a node's routing engine. The local routing engine still performs some tasks, predominantly to allow scale. A server group can contain a maximum of 2 nodes. A group with 2 nodes is known as a redundant server group (it is a 2-node virtual chassis under the covers). Because of this, a redundant server group can have multi-chassis ae (aggregated ethernet) interfaces. There is one other type of group, known as the network node group. This group looks after all routing and L2 loop information, such as OSPF and spanning tree; all VLAN routing etc. is done by these nodes. (A config sketch follows the group summary below.)
Group Summary
  1. Network Node Group (1 per QFabric - Max 8 nodes)
  2. Server Group (Redundant Server Group optional - 2 nodes)
    1. QFabric automatically creates a redundant server group if two nodes exist in a server group (via a form of virtual chassis).
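To make the grouping concrete, here is a rough sketch of what assigning nodes to groups looks like on the director CLI. The fabric resources hierarchy and the default network node group name NW-NG-0 are written from memory, so treat them as assumptions and verify against current documentation:

  set fabric resources node-group RACK-42-1 node-device NODE-0
  set fabric resources node-group RACK-42-1 node-device NODE-1   (two nodes = a redundant server group)
  set fabric resources node-group NW-NG-0 network-domain
  set fabric resources node-group NW-NG-0 node-device NODE-2     (network node group for OSPF, spanning tree, VLAN routing)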
Port Referencing
Now, because each node has an 'alias' (discussed above), to reference a port in configuration you use:
  • NODE_ALIAS:INT_TYPE-x/x/x.x
  • e.g. NODE-0:xe-0/0/1.0

Aggregated interfaces can be deployed across chassis in a redundant server group, or on a single chassis in a server group (a config sketch follows the examples below):

  • GROUP_NAME:ae0.0
  • e.g. RACK-42-1:ae0.0
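As a rough example of the referencing above, configuring a server-facing port and a cross-chassis LAG in a redundant server group looks something like the following. The chassis node-group aggregated-devices stanza is how I recall it, so double-check before copying:

  set interfaces NODE-0:xe-0/0/1 unit 0 family ethernet-switching
  set chassis node-group RACK-42-1 aggregated-devices ethernet device-count 1
  set interfaces RACK-42-1:ae0 aggregated-ether-options lacp active
  set interfaces RACK-42-1:ae0 unit 0 family ethernet-switching
  set interfaces NODE-0:xe-0/0/2 ether-options 802.3ad RACK-42-1:ae0
  set interfaces NODE-1:xe-0/0/2 ether-options 802.3ad RACK-42-1:ae0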
QFabric can also run ports in FC and FCoE mode. There are some limitations to this feature today, but it can provide an excellent mechanism to create redundant paths back through the fabric to the FC-based SAN network. This will be discussed in a dedicated post in my QFabric series.
Summary
QFabric, for a data center, is ready today and works extremely well. It provides a HUGE number of 10Gb (and soon 40Gb) ports to move huge amounts of data around a DC at low latency. It is also effectively a single point of management for all your nodes, unless something goes wrong of course. For a campus with end users, QFabric lacks many of the key features that we use today with the MX or EX ranges. It could be used in large campuses as the aggregation or core (especially when more IPv4 and IPv6 routing is supported) and feed 10Gb out to EX switches that provide the 'edge'. The coming 'micro' fabric is also interesting and will allow for a more compelling footprint within a smaller data center.
Key Likes
  • Single switch in regards to management and functionality
    • No TRILL or other L2 bridging redundancy protocols required
  • Ultra redundant design - Enforced by Juniper
    • No half-way deployments - people can't go in half-assed!
  • The simple, well-thought-out HA deployment/design - a common install means easier debugging for JTAC and engineers like myself
  • Scalability - I can see how big DCs could benefit from having one gigantic switch
  • Road map looks good - Key features and hardware are coming
Key Dislikes
  • AFL (Advanced Feature License) required for IPv6 (when it arrives)
    • PLEASE Juniper - Can we have IPv6 for free or I will never get customers to deploy it
    • This really frustrates me ... You may be able to tell 🙂
  • Limitation of 1 unit per interface
    • No VLAN tagging with multiple units in network node groups
    • Can be worked around by turning the port into a trunk and assigning multiple L3 VLAN interfaces (see the sketch after this list)
  • The need for legacy SAN infrastructure in order to use FC/FCoE (discussed in part 3)
  • No ability to have a full 48 copper SFP 1Gb interfaces in a node for legacy non-10Gig equipment
    • The QFX3500 physically cannot fit SFPs in both the top and bottom rows
    • This would be handy for keeping legacy equipment connected and, as it is replaced, changing the SFP to a 10G SFP+
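For the one-unit-per-interface workaround mentioned in the dislikes above, the familiar EX-style trunk plus routed VLAN interface approach applies. A minimal sketch, with VLAN names and addressing made up purely for illustration:

  set interfaces NODE-0:xe-0/0/10 unit 0 family ethernet-switching port-mode trunk
  set interfaces NODE-0:xe-0/0/10 unit 0 family ethernet-switching vlan members [ VLAN100 VLAN200 ]
  set vlans VLAN100 vlan-id 100 l3-interface vlan.100
  set vlans VLAN200 vlan-id 200 l3-interface vlan.200
  set interfaces vlan unit 100 family inet address 192.0.2.1/24
  set interfaces vlan unit 200 family inet address 198.51.100.1/24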
Wish List
  • The Micro Fabric - will allow more use cases
  • Full SNMP interface statistics for all nodes through the director
    • Currently testing this with Zenoss in the Juniper Lab - Has not worked so far
    • The ability to monitor each node's REs, PSUs etc. would also be a plus (I have not tested / read the MIBs yet, so it could already be possible)
  • The ability to downgrade and to run a system-wide 'request system rollback' from the director
  • Full Q-in-Q Support
  • Fully self contained FC/FCoE support
To Come in this series:
Part 2 - Deploying and Configuring
Part 3 - FCoE and SAN with QFabric
Part 4 - QFabric errata (possibly - not sure yet ...)

Please note: The information presented here is from my own point of view. It is in no way associated with the firm beliefs of Juniper Networks (TM) or ICT Networks (TM).
