HP’s FlexFabric

There were quite a few announcements this week at the HP Technology Forum in Vegas.  Several of them were extremely interesting; the ones that resonated most with me were:

Superdome 2:

I'm not familiar with the original Superdome, nor am I in any way an expert on non-x86 architectures.  In fact that's exactly what struck me as excellent about this product announcement.  It allows the mission-critical servers that a company chooses to, or must, run on non-x86 hardware to run right alongside the more common x86 architecture in the same chassis.  This further consolidates the data center and reduces infrastructure for customers with mixed environments, of which there are many.  While some customers are currently pushing to migrate all data center applications onto x86-based platforms, that migration is not fast, cheap, or right for every use case.  Superdome 2 provides a common infrastructure for both the mission-critical applications and the x86-based applications.

For a more technical description see Kevin Houston's Superdome 2 blog: http://bladesmadesimple.com/2010/04/its-a-bird-its-a-plane-its-superdome-2-on-a-blade-server/.

Note: As stated, I'm no expert in this space and I have no technical knowledge of the Superdome platform, but conceptually it makes a lot of sense and seems like a move in the right direction.

Common Infrastructure:

There was a lot of talk in some of the keynotes about a common look, feel, and infrastructure across the separate HP systems (storage, servers, etc.)  At first I laughed this off as a 'who cares,' but then I started to think about it.  If HP takes this message seriously and standardizes rail kits, cable management, components (where possible), etc., this will have big benefits for administration and deployment of equipment.

If you've never done a good deal of racking/stacking of data center gear you may not see the value here, but I spent a lot of time on the integration side with this as part of my job.  Within a single vendor (or sometimes a single product line) rail kits for servers/storage, rack-mounting hardware, etc. can all be different.  This adds time and complexity to integrating systems and can sometimes lead to less-than-ideal builds.  For example, the first vBlock I helped a partner configure (for demo purposes only) had the two UCS systems stacked on top of one another on the bottom of the rack with no mounting hardware.  The reason was that the EMC racks being used had different rail mounts than the UCS system was designed for.  Issues like this can cause problems and delays, especially when the people in charge of infrastructure aren't properly engaged during purchasing (very common.)

Overall I can see this as a very good thing for the end user.

HP FlexFabric

This is the piece that really grabbed my attention while watching the constant Twitter stream of HP announcements.  HP FlexFabric brings network consolidation to the HP blade chassis.  I specifically say network consolidation, because HP got this piece right.  Yes, it does FCoE, but that doesn't mean you have to use it.  FlexFabric provides the converged networking tools to carry any protocol you want over 10GE to the blades and split that traffic out to separate networks at the chassis level.  Here's a picture of the switch from Kevin Houston's blog: http://bladesmadesimple.com/2010/06/first-look-hps-new-blade-servers-and-converged-switch-hptf/.

HP Virtual Connect FlexFabric 10Gb/24-Port Module

The first thing to note when looking at this device is that all the front-end uplink ports look the same, so how do they split out Fibre Channel and Ethernet?  The answer is that Qlogic (the manufacturer of the switch) has been doing some heavy lifting on the engineering side.  They've designed the front-end ports to support the optics for either Fibre Channel or 10GE, which gives you flexibility in how you use your bandwidth.  Doing this on a per-port basis is an industry first; the Cisco Nexus 5000 ASIC has been capable of it since FCS, but there it was implemented on a per-module basis rather than per port as on this switch.

The next piece that was quite interesting, and that really provides flexibility and choice to the HP FlexFabric concept, is the decision to use Emulex's OneConnect adapter as the LAN on Motherboard (LOM.)  This was a very smart decision by HP.  Emulex's OneConnect is a product that has impressed me from square one; it shows a traditionally Fibre Channel company embracing the fact that Ethernet is the future of storage without locking the decision into an Upper Layer Protocol (ULP.)  OneConnect provides 10GE connectivity, TCP offload, iSCSI offload/boot, and FCoE capability all on the same card; now that's a converged network!  HP seems to have seen the value there as well and built this into the system board.

Take a step back and soak that in: LOM has been owned by Intel, Broadcom, and other traditional NIC vendors since the beginning.  Until last year Emulex was looked at as one of two solid FC HBA vendors.  As of this week HP has announced the ousting of a traditional NIC vendor in favor of a traditional FC vendor on its system board.  That's a big win for Emulex.  Kudos to Emulex for the technology (and the business decisions behind it) and to HP for recognizing that value.

Looking a little deeper, the next big piece of this overall architecture is that the whole FlexFabric system supports HP's FlexConnect technology, which allows a server admin to carve up a single physical 10GE link into four logical links that are presented to the OS as individual NICs.
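
To make the carving concept concrete, here's a minimal sketch of partitioning one 10GE link into four logical NICs.  This is illustration only, not HP's actual tooling; the NIC names and allocation values are assumptions.

    # Illustrative sketch: model carving one physical 10GE port into four
    # logical NICs as the OS would see them.  Names/speeds are made up.
    from dataclasses import dataclass

    @dataclass
    class LogicalNic:
        name: str
        speed_gbps: float  # bandwidth allocated from the physical link

    def carve_port(total_gbps: float, allocations: dict) -> list:
        """Split a physical link into logical NICs, validating the total."""
        if sum(allocations.values()) > total_gbps:
            raise ValueError("allocations exceed physical link speed")
        return [LogicalNic(name, speed) for name, speed in allocations.items()]

    # Example split: management, live-migration traffic, VM data, storage
    nics = carve_port(10.0, {"lom1a": 0.5, "lom1b": 2.0, "lom1c": 3.5, "lom1d": 4.0})
    for nic in nics:
        print(f"{nic.name}: {nic.speed_gbps} Gbps")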

The only drawback I see to the FlexFabric picture is the fact that FCoE is only used within the chassis and split into separate networks from there.  This can definitely increase the required infrastructure depending on the architecture.  I'll wait to go too deep into that until I hear a few good lines of thinking on why that direction was taken.

Summary:

HP had a strong week in Vegas.  These were only a few of the announcements; several others, including mind-blowing stuff from HP Labs (start protecting John Connor now), can be found on blogs and HP's website.  Of all of the announcements FlexFabric was the one that really caught my attention.  It embraces the idea of I/O consolidation without clinging to FCoE as the only way to do it, and it broadens the competitive landscape in that market, which always benefits the end-user/customer.

Comments, corrections, bitches moans, gripes and complaints all welcome.

FCoE Multi-Hop: Do You Care?

There is a lot of discussion in the industry around FCoE’s current capabilities, and specifically around the ability to perform multi-hop transmission of FCoE frames and the standards required to do so.  A recent discussion between Brad Hedlund at Cisco and Ken Henault at HP (http://bit.ly/9Kj7zP) prompted me to write this post.  Ken proposes that FCoE is not quite ready and Brad argues that it is. 

When looking at this discussion remember that Cisco has had FCoE products shipping for about two years and has a robust product line of devices with FCoE support, including UCS, the Nexus 5000, Nexus 4000, and Nexus 2000, with more products on the road map for launch this year.  No other switching vendor has this level of current commitment to FCoE.  For any vendor with a less robust FCoE portfolio it makes no sense to drive FCoE sales and marketing at this point, and so you will typically find articles and blogs like the one mentioned above.  The one quote from that blog that sticks out in my mind is:

“Solutions like HP’s upcoming FlexFabric can take advantage of FCoE to reduce complexity at the network edge, without requiring a major network upgrades or changes to the LAN and SAN before the standards are finalized.”

If you read between the lines here it would be easy to take this as ‘FCoE isn’t ready until we are.’  This is not unusual and if you take a minute to search through articles about FCoE over the last 2-3 years you’ll find that Cisco has been a big endorser of the protocol throughout (because they actually had a product to sell) and other vendors become less and less anti-FCoE as they announce FCoE products.

It's also important to note that Cisco isn't the only vendor out there embracing FCoE: NetApp has been shipping native FCoE storage controllers for some time, EMC has them road-mapped for the very near future, Qlogic is shipping a second generation of Converged Network Adapters, and Emulex has fully embraced 10 Gigabit Ethernet as the way forward with its OneConnect adapter (10GE, iSCSI, and FCoE all in one card.)  Additionally, FCoE switching of native Fibre Channel storage is widely supported by the storage community.

Fibre Channel over Ethernet (FCoE) is defined in the T11 FC-BB-5 standard and requires the switches it traverses to support the IEEE Data Center Bridging (DCB) standards for proper traffic treatment on the network.  For more information on FCoE or DCB see my previous posts on the subjects (FCoE: http://www.definethecloud.net/?p=80, DCB: http://www.definethecloud.net/?p=31.)

DCB has four major components, and the one in question in the above article is Quantized Congestion Notification (QCN), which the article states is required for multi-hop FCoE.  QCN is basically a regurgitation of FECN and BECN from Frame Relay: it allows a switch to monitor its buffers and push congestion to the edge rather than clog the core.  In the comments Brad correctly states that QCN is not required for FCoE.  The reason is that Fibre Channel operates today without any native version of QCN, so when placing it on Ethernet you do not need to add functionality that wasn't there to begin with; remember, Ethernet is just a new layer 1-2 for native FC layers 2-4, and the FC secret sauce remains unmodified.  Also remember that not every standard defined by a standards body has to be adhered to by every device; some are required, some are optional.  Logical SANs are a great example of an optional standard.

Rather than discuss what is or isn’t required for multi-hop FCoE I’d like to ask a more important question that we as engineers tend to forget: Do I care?  This question is key because it avoids having us argue the technical merits of something we may never actually need, or may not have a need for today.

Do we care?

First let's look at why we do multi-hop anything: to expand the port count of our network.  Take TCP/IP networks and the Internet, for example: we require the ability to move packets across the globe through multiple routers (hops) in order to attach devices in all corners of the globe.

Now let's look at what we do with FC today: typically one- or two-hop networks (sometimes three) used to connect several hundred devices (occasionally, but rarely, more.)  It's actually quite common to find FC implementations with fewer than 100 attached ports.  This means that if you can hit the right port count without multiple hops you can remove complexity and decrease latency; in Storage Area Networks (SAN) we call this the collapsed-core design.

The second thing to consider is a hypothetical question: if FCoE were permanently destined for single-hop access/edge-only deployments (it isn't), should that actually stop you from using it?  The answer here is an emphatic no; I would still highly recommend FCoE as an access/edge architecture even if it were destined to connect back to an FC SAN and Ethernet LAN for all eternity.  Let's jump to some diagrams to explain.  In the following diagrams I'm going to focus on Cisco architecture because, as stated above, Cisco is currently the only vendor with a full FCoE product portfolio.

 image

In the above diagram you can see a fairly dynamic set of FCoE connectivity options.  The Nexus 5000 can be directly connected to servers, or to a Nexus 4000 in an IBM BladeCenter to pass FCoE.  It can also be connected to 10GE Nexus 2000s to increase its port density.

To use the Nexus 5000 + 2000 as an example, it's possible to create a single-hop (the 2000 isn't an L2 hop, it's an extension of the 5000) FCoE architecture of up to 384 ports with one point of switching management per fabric.  If you take server virtualization into the picture and assume 384 servers with a very modest consolidation ratio of 10 virtual machines per physical machine, that brings you to 3,840 servers connected to a single-hop SAN.  That is major scalability with minimal management, all without the need for multi-hop.  The diagram above doesn't include the Cisco UCS product portfolio, which architecturally supports up to 320 FCoE-connected servers/blades.
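
The scaling math is simple enough to sanity-check; here's a quick sketch using only the figures already quoted above (the port count and 10:1 ratio are those examples, not hard limits):

    # Back-of-the-envelope check of the single-hop scaling claim above.
    ports_per_fabric = 384          # Nexus 5000 + 2000 edge ports (figure from the text)
    vms_per_physical_server = 10    # modest virtualization consolidation ratio

    physical_servers = ports_per_fabric
    total_server_instances = physical_servers * vms_per_physical_server
    print(f"{physical_servers} physical servers -> "
          f"{total_server_instances} virtual servers on a single-hop SAN")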

The next thing I’ve asked you to think about is whether or not you should implement FCoE in a hypothetical world where FCoE stays an access/edge architecture forever.  The answer would be yes.  In the following diagrams I outline the benefits of FCoE as an edge only architecture.

image

The first benefit is reducing the networks that are purchased, managed, powered, and cooled from three to one (two FC and one Ethernet to one FCoE.)  Even just at the access layer this is a large reduction in overhead, and it reduces the refresh points as I/O demands increase.

image

The second benefit is the overall infrastructure reduction at the access layer.  Taking a typical VMware server as an example, we reduce 6x 1GE ports, 2x 4GFC ports, and the 8 cables required for them to 2x 10GE ports carrying FCoE.  This increases the total bandwidth available while greatly reducing infrastructure.  Don't forget the 4 top-of-rack switches (2x FC, 2x GE) reduced to 2 FCoE switches.
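
A quick sketch of that per-server math (the port counts and speeds are the ones from the example above):

    # Per-server access-layer comparison from the example above.
    legacy_ports = {"1GE NIC": (6, 1.0), "4G FC HBA": (2, 4.0)}   # (count, Gbps each)
    fcoe_ports = {"10GE CNA": (2, 10.0)}

    def totals(ports):
        cables = sum(count for count, _ in ports.values())
        bandwidth = sum(count * speed for count, speed in ports.values())
        return cables, bandwidth

    legacy_cables, legacy_bw = totals(legacy_ports)
    fcoe_cables, fcoe_bw = totals(fcoe_ports)
    print(f"Legacy: {legacy_cables} cables, {legacy_bw} Gbps aggregate")
    print(f"FCoE:   {fcoe_cables} cables, {fcoe_bw} Gbps aggregate")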

Since FCoE is fully compatible with both FC and pre-DCB Ethernet this requires 0 rip-and-replace of current infrastructure.  FCoE is instead used to build out new application environments or expand existing environments while minimizing infrastructure and complexity.

What if I need a larger FCoE environment?

If you require a larger environment than is currently supported, extending your SAN is quite possible without multi-hop FCoE; FCoE can be extended using existing FC infrastructure.  Remember, customers that require an FCoE infrastructure this large already have an FC infrastructure to work with.

image 

What if I need to extend my SAN between data centers?

FCoE SAN extension is handled in exactly the same way as FC SAN extension: CWDM, DWDM, dark fiber, or FCIP.  Remember, we're still moving Fibre Channel frames.

image

Summary:

FCoE multi-hop is not an argument that needs to be had for most current environments.  FCoE is a supplemental technology to current Fibre Channel implementations.  Multi-hop FCoE will be available by the end of CY2010, allowing FCoE networks of two or more tiers with multiple switches in the path, but there is no need to wait for it before beginning to deploy FCoE.  The benefits of an FCoE deployment at the access layer only are significant, and many environments will be able to scale to full FCoE roll-outs without ever going multi-hop.

The Cloud Storage Argument

The argument over the right type of storage for data center applications is an ongoing battle.  This argument gets amplified when discussing cloud architectures, both private and public.  Part of the reason for this disparity in thinking is that there is no 'one size fits all' solution.  The other part of the problem is that there may not be a current right solution at all.

When we discuss modern enterprise data center storage options there are typically five major choices:

  Direct Attached Storage (DAS)
  Fibre Channel (FC)
  Fibre Channel over Ethernet (FCoE)
  Internet Small Computer System Interface (iSCSI)
  Network File System (NFS)

In a Windows server environment these will typically be coupled with the Common Internet File System (CIFS) for file sharing.  Behind these protocols there are a series of storage arrays and disk types that can be used to meet the application's I/O requirements.

As people move from traditional server architectures to virtualized servers, and from static physical silos to cloud based architectures they will typically move away from DAS into one of the other protocols listed above to gain the advantages, features and savings associated with shared storage.  For the purpose of this discussion we will focus on these four: FC, FCoE, iSCSI, NFS.

The issue then becomes: which storage protocol do you use to transport your data from the server to the disk?  I've discussed the protocol differences in a previous post (http://www.definethecloud.net/?p=43) so I won't go into the details here.  Depending on who you're talking to it's not uncommon to find extremely passionate opinions.  There are quite a few consultants and engineers that are hard-coded to one protocol or another.  That being said, most end-users just want something that works, performs adequately, and isn't a headache to manage.

Most environments currently run a combination of these protocols; plenty of FC data centers rely on DAS to boot the operating system and NFS/CIFS for file sharing.  The same can be said for iSCSI shops.  With current options a combination of these protocols is probably always going to be best: iSCSI, FCoE, and NFS/CIFS can be used side by side to provide the right performance at the right price on an application-by-application basis.

The one definite fact in all of the opinions is that running separate parallel networks, as we do today with FC and Ethernet, is not the way to move forward; it adds cost, complexity, management, power, cooling, and infrastructure that isn't needed.  Combining protocols down to one wire is key to the flexibility and cost savings promised by end-to-end virtualization and cloud architectures.  If that's the case, which wire do we choose, and which protocol rides directly on top to transport the rest?

10 Gigabit Ethernet is currently the industry's push for a single wire, and with good reason:

For the sake of argument let's assume we all agree on 10GE as the right wire to carry all of our traffic: what do we layer on top?  FCoE, iSCSI, NFS, something else?  Well, that is a tough question.  The first part of the answer is that you don't have to decide; this is very important because none of these protocols is mutually exclusive.  The second part of the answer is that maybe none of these is the end-all-be-all long-term solution.  Each current protocol has benefits and drawbacks, so let's take a quick look:

And a quick look at comparative performance:

Protocol Performance

image

While the above performance model is subjective, and network tuning and specific equipment will play a big role, the general idea holds.

One of the biggest factors that needs to be considered when choosing between these protocols is block vs. file.  Some applications require direct block access to disk; many databases fall into this category.  Just as importantly, if you want to boot an operating system from SAN disk, a block-level protocol (iSCSI, FCoE) is required.  This means that for most diskless configurations you'll need to make a choice between FCoE and iSCSI (still within the assumption of consolidating on 10GE.)  Diskless configurations have major benefits in large-scale deployments, including power, cooling, administration, and flexibility, so you should at least be considering them.

If you've chosen a diskless configuration and settled on iSCSI or FCoE for your boot disks, you still need to figure out what to do about file shares.  CIFS or NFS is your next decision: CIFS is typically the choice for Windows, and NFS for Linux/UNIX environments.  Now you've wound up with two to three protocols running to get your storage settled, and you're stacking those alongside the rest of your typical LAN data.

Now, to look at management, step back and consider block data as a whole.  If you're using enterprise-class storage you've got several steps of management to configure the disk in that array.  It varies by vendor, but it's typically something to the effect of the following (a rough sketch of these steps as a provisioning workflow follows the list):

  1. Configure the RAID for groups of disks
  2. Pool multiple RAID groups
  3. Logically sub divide the pool
  4. Assign the logical disks to the initiators/servers
  5. Configure required network security (FC zoning/ IP security/ACL, etc)
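
As a purely illustrative sketch (no real array API here; every helper is a made-up placeholder that just records the action), the same workflow expressed as automation looks something like this:

    # Hypothetical block-provisioning workflow mirroring the manual steps above.
    # Nothing maps to a real vendor API; the helpers only record what would happen.
    from itertools import islice

    def chunks(items, size):
        it = iter(items)
        while batch := list(islice(it, size)):
            yield batch

    def provision_block_storage(disks, initiators, lun_size_gb=500):
        actions = []
        # 1. Configure RAID for groups of disks (5-disk RAID 5 groups assumed)
        raid_groups = [f"rg{i}" for i, _ in enumerate(chunks(disks, 5))]
        actions += [f"create RAID5 group {rg}" for rg in raid_groups]
        # 2. Pool multiple RAID groups
        actions.append(f"create pool from {raid_groups}")
        # 3. Logically subdivide the pool into LUNs
        luns = [f"lun{i}" for i in range(len(initiators))]
        actions += [f"carve {lun} ({lun_size_gb} GB) from pool" for lun in luns]
        # 4. Assign logical disks to initiators; 5. configure zoning/ACLs
        for lun, host in zip(luns, initiators):
            actions.append(f"mask {lun} to {host}")
            actions.append(f"zone {host} to array target port")
        return actions

    for step in provision_block_storage(disks=[f"d{i}" for i in range(10)],
                                        initiators=["esx01", "esx02"]):
        print(step)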

While this is easy stuff for storage and SAN administrators, it's time-consuming, especially when you start talking about cloud infrastructures with lots and lots of moves, adds, and changes.  It becomes far too cumbersome to scale into petabytes with hundreds or thousands of customers.  NFS has more streamlined management, but it can't be used to boot an OS.  This makes for extremely tough decisions when looking to scale into large virtualized data center architectures or cloud infrastructures.

There is a current option that allows you to consolidate on 10GE, reduce storage protocols and still get diskless servers.  It's definitely not the solution for every use case (there isn't one), and it's only a great option because there aren't a whole lot of other great options.

In a fully virtualized environment NFS is a great low-management-overhead protocol for virtual machine disks.  Because it can't boot an OS, we need another way to get the operating system into server memory.  That's where PXE boot comes in.  The Preboot eXecution Environment (PXE) is a network boot mechanism that works well for small operating systems, typically thin clients or Linux images.  It allows a single instance of the operating system to be stored on a PXE server attached to the network and retrieved by a diskless server at boot time.  Because some virtualization operating systems (hypervisors) are lightweight, they are great candidates for PXE boot.  This allows the architecture below.

PXE/NFS 100% Virtualized Environment

image

Summary:

While there are several options for data center storage, none of them solves every need.  Current options increase in complexity and management overhead as the scale of the implementation increases.  Looking to the future we need to be searching for better ways to handle storage.  Maybe block-based storage has run its course, maybe SCSI has run its course; either way we need more scalable storage solutions available to the enterprise in order to meet the growing needs of the data center and maintain manageability and flexibility.  New deployments should take all current options into account and never write off the advantages of using more than one, or all of them, where they fit.

FCoE Initialization Protocol (FIP) Deep Dive

In an attempt to clarify my future posts I will begin categorizing a bit.  The following post will be part of a Technical Deep Dive series.

Fibre Channel over Ethernet (FCoE) is a protocol designed to move native Fibre Channel over 10 Gigabit Ethernet (and faster) links; I've described the protocol in a previous post (http://www.definethecloud.net/?p=80.)  In order for FCoE to work we need a mechanism to carry the base Fibre Channel port/device login processes over Ethernet.  These are the processes a port uses to log in and obtain a routable Fibre Channel address.  Let's start with some background and definitions:

DCB – Data Center Bridging
FC – Native Fibre Channel protocol
FCF – Fibre Channel Forwarder (an Ethernet switch capable of handling encapsulation/de-encapsulation of FCoE frames and some or all FC services)
FCID – Fibre Channel ID (24-bit routable address)
FCoE – Fibre Channel over Ethernet
FC-MAP – A 24-bit value identifying an individual fabric
FIP – FCoE Initialization Protocol
FLOGI – FC Fabric Login
FPMA – Fabric Provided MAC Address
PLOGI – FC Port Login
PRLI – Process Login
SAN – Storage Area Network (switching infrastructure)
SCSI – Small Computer Systems Interface
 
Now for the background: you'll never grasp FIP properly if you don't first get the fundamentals of FC.
 
N_Port Initialization
image

 

When a node comes online, its port is considered a Node Port (N_Port).  When an N_Port connects to the SAN it connects to a switch port defined as a Fabric Port (F_Port) (this assumes you're using a switched fabric.)  All N_Ports operate the same way when they are brought online:

  1. FLOGI – Used to obtain a routable FCID for use in FC frame exchange.  The switch will provide the FCID during a FLOGI exchange.
  2. PLOGI – Used to register the N_Port with the FC name server

At this point a target's (a disk or storage array's) job is done; it can now sit and wait for requests.  An initiator (server), on the other hand, needs to perform a few more tasks to discover available targets:

  1. Query – Request available targets from the FC name server; zoning will dictate which targets are available.
  2. PLOGI – A second Port Login, this time into the target port.
  3. PRLI – Process Login to exchange supported Upper Layer Protocols (ULP), typically SCSI-3.

Once this process has been completed the initiator can exchange frames with the target, i.e. the server can write to disk.
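
To tie the sequence together, here's a minimal sketch of the login flow exactly as described above.  It's pure illustration: the class and function names are invented and nothing here talks to a real fabric.

    # Illustrative model of the FC/FCoE login sequence described above.
    class Fabric:
        def __init__(self):
            self.next_fcid = 0x010101
            self.name_server = {}          # FCID -> node type

        def flogi(self, node_type):
            fcid = self.next_fcid          # switch hands out a routable FCID
            self.next_fcid += 1
            return fcid

        def plogi_name_server(self, fcid, node_type):
            self.name_server[fcid] = node_type   # register with the name server

        def query_targets(self, initiator_fcid):
            # real zoning would filter this list; here every target is visible
            return [fcid for fcid, t in self.name_server.items() if t == "target"]

    fabric = Fabric()

    # Target: FLOGI, then PLOGI to the name server, then wait for requests
    target_fcid = fabric.flogi("target")
    fabric.plogi_name_server(target_fcid, "target")

    # Initiator: FLOGI, PLOGI to name server, query, PLOGI to target, PRLI
    init_fcid = fabric.flogi("initiator")
    fabric.plogi_name_server(init_fcid, "initiator")
    for t in fabric.query_targets(init_fcid):
        print(f"PLOGI from {init_fcid:#08x} to target {t:#08x}, then PRLI (SCSI-3)")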

FIP:

The reason the FC login process is key to understanding FIP is that this is the process FIP handles for FCoE networks.  FIP allows an Ethernet-attached FC node (ENode) to discover existing FCFs and supports the FC login procedure over 10GE (and faster) networks.  Rather than just providing an FCID, FIP provides an FPMA, which is a MAC address composed of two parts: the FC-MAP and the FCID.

48-bit FPMA (MAC Address)

image

FIP

image

So FIP provides an Ethernet MAC address, used by FCoE frames to traverse the Ethernet network, which contains the FCID required for routing on the FC network.  FIP also passes the query and query response from the FC name server.  FIP uses a separate EtherType from FCoE, and its frames are standard Ethernet size (1518-byte 802.1Q frames), whereas FCoE frames are 2242-byte jumbo frames.
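
Here's a small sketch of how those two 24-bit values combine into the 48-bit FPMA.  This is illustration only; the FC-MAP shown is the commonly cited default value and the FCID is made up.

    # Build a Fabric Provided MAC Address (FPMA): upper 24 bits are the FC-MAP,
    # lower 24 bits are the FCID assigned at FLOGI time.
    def fpma(fc_map: int, fcid: int) -> str:
        mac = (fc_map << 24) | fcid        # 24-bit FC-MAP || 24-bit FCID = 48 bits
        octets = mac.to_bytes(6, "big")
        return ":".join(f"{b:02x}" for b in octets)

    FC_MAP_DEFAULT = 0x0EFC00              # commonly cited default FC-MAP value
    example_fcid = 0x010203                # hypothetical FCID handed out by the FCF

    print(fpma(FC_MAP_DEFAULT, example_fcid))   # 0e:fc:00:01:02:03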

FIP Snooping:

FIP snooping is used in multi-hop FCoE environments.  It is a frame inspection method that FIP-snooping-capable DCB devices can use to monitor FIP frames and apply policies based on the information in those frames.  This allows for:

FIP Snooping

image

Summary:

FIP snooping uses dynamic Access Control Lists to enforce Fibre Channel rules within the DCB Ethernet network.  This prevents ENodes from seeing or communicating with other ENodes without first traversing an FCF.

Feedback, corrections, updates, questions?

Fibre Channel over Ethernet

Fibre Channel over Ethernet (FCoE) is a protocol standard ratified in June of 2009.  FCoE provides the tools for encapsulation of Fibre Channel (FC) in 10 Gigabit Ethernet frames.  The purpose of FCoE is to allow consolidation of low-latency, high performance FC networks onto 10GE infrastructures.  This allows for a single network/cable infrastructure which greatly reduces switch and cable count, lowering the power, cooling, and administrative requirements for server I/O.

FCoE is designed to be fully interoperable with current FC networks and to require little to no additional training for storage and IP administrators.  FCoE operates by encapsulating native FC into Ethernet frames.  Native FC is considered a 'lossless' protocol, meaning frames are not dropped during periods of congestion.  This is by design, in order to ensure the behavior expected by the SCSI payloads.  Traditional Ethernet does not provide the tools for lossless delivery on shared networks, so enhancements were defined by the IEEE to provide appropriate transport of encapsulated Fibre Channel on Ethernet networks.  These standards are known as Data Center Bridging (DCB), which I've discussed in a previous post (http://www.definethecloud.net/?p=31.)  These Ethernet enhancements are fully backward compatible with traditional Ethernet devices, meaning DCB-capable devices can exchange standard Ethernet frames seamlessly with legacy devices.  The full 2148-byte FC frame is encapsulated in an Ethernet jumbo frame, avoiding any modification/fragmentation of the FC frame.

FCoE itself takes FC layers 2-4 and maps them onto Ethernet layers 1-2, replacing the FC-0 physical layer and FC-1 encoding layer.  This mapping between Ethernet and Fibre Channel is done through a Logical End-Point (LEP), which can be thought of as a translator between the two protocols.  The LEP is responsible for providing the appropriate encoding and physical access for frames traveling from FC nodes to Ethernet nodes and vice versa.  There are two devices that typically act as FCoE LEPs: Fibre Channel Forwarders (FCF), which are switches capable of both Ethernet and Fibre Channel, and Converged Network Adapters (CNA), which provide the server-side connection for an FCoE network.  Additionally, the LEP operation can be done using a software initiator and a traditional 10GE NIC, but this places extra workload on the server processor rather than offloading it to adapter hardware.

One of the major advantages of replacing FC layers 0-1 when mapping onto 10GE is the encoding overhead.  8Gb Fibre Channel uses 8b/10b encoding, which adds 25% protocol overhead; 10GE uses 64b/66b encoding, which has roughly 3% overhead, dramatically reducing the protocol overhead and increasing usable throughput.  The second major advantage is that FCoE maintains FC layers 2-4, which allows seamless integration with existing FC devices and preserves the Fibre Channel tool set such as zoning, LUN masking, etc.  In order to provide FC login capabilities, multi-hop FCoE networks, and FC zoning enforcement on 10GE networks, FCoE relies on another standard known as the FCoE Initialization Protocol (FIP), which I will discuss in a later post.
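
A quick sketch of that encoding math (the line rates used are the commonly published signaling rates for 8GFC and 10GE; treat the exact figures as illustrative):

    # Effective data rate after line encoding.
    def effective_gbps(line_rate_gbaud, data_bits, total_bits):
        return line_rate_gbaud * data_bits / total_bits

    fc8 = effective_gbps(8.5, 8, 10)        # 8GFC: 8.5 GBaud with 8b/10b
    ge10 = effective_gbps(10.3125, 64, 66)  # 10GE: 10.3125 GBaud with 64b/66b

    print(f"8GFC usable: {fc8:.2f} Gbps ({(10 - 8) / 8:.0%} encoding overhead)")
    print(f"10GE usable: {ge10:.2f} Gbps ({(66 - 64) / 64:.1%} encoding overhead)")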

Overall FCoE is one protocol to choose from when designing converged networks, or cable-once architectures.  The most important thing to remember is that a true cable-once architecture doesn't make you choose your Upper Layer Protocol (ULP) such as FCoE, only your underlying transport infrastructure.  If you choose 10GE the tools are now in place to layer any protocol of your choice on top, when and if you require it.

Thanks to my colleagues who recently provided a great discussion on protocol overhead and frame encoding...

Data Center Bridging Exchange

Data Center Bridging Exchange (DCBX) is one of the components of the DCB standards.  These standards offer enhancements to standard Ethernet which are backward compatible with traditional Ethernet and provide support for I/O consolidation (http://www.definethecloud.net/?p=18.)  The three purposes of DCBX are:

Discovery of DCB capability:

The ability for DCB capable devices to discover and identify capabilities of DCB peers as well as identify non-DCB capable legacy devices.  You can find more information on DCB in a previous post (http://www.definethecloud.net/?p=31.)

Identification of misconfigured DCB features:

The ability to discover misconfiguration of features that require symmetric configuration between DCB peers.  Some DCB features are asymmetric, meaning they can be configured differently on each end of a link; other features must match on both sides to be effective (symmetric.)  This functionality allows detection of configuration errors for these symmetric features.

Configuration of Peers:

A capability allowing DCBX to pass configuration information to a peer.  For instance, a DCB-capable switch can pass Priority Flow Control (PFC) information on to a Converged Network Adapter (CNA) to ensure FCoE traffic is appropriately tagged and pause is enabled for the chosen Class of Service (CoS) value.  This PFC exchange is symmetric and must match on both sides of the link.  DCB features such as Enhanced Transmission Selection (ETS), otherwise known as bandwidth management, can be configured asymmetrically (differently on each side of the link.)
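
As a rough illustration of the symmetric vs. asymmetric idea (invented structures, not the actual DCBX TLV format), a peer comparison might look like:

    # Toy model of a DCBX-style configuration check.  PFC (symmetric) must match
    # on both ends; ETS bandwidth shares (asymmetric) are allowed to differ.
    switch_cfg = {"pfc_enabled_cos": {3}, "ets_shares": {"fcoe": 40, "lan": 60}}
    cna_cfg    = {"pfc_enabled_cos": {3}, "ets_shares": {"fcoe": 50, "lan": 50}}

    def check_peering(local, remote):
        issues = []
        if local["pfc_enabled_cos"] != remote["pfc_enabled_cos"]:
            issues.append("PFC mismatch: symmetric feature must match on both ends")
        # ETS is asymmetric, so a difference is informational only
        return issues or ["link OK (ETS differences allowed)"]

    print(check_peering(switch_cfg, cna_cfg))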

DCBX relies on the Link Layer Discovery Protocol (LLDP) to pass this information and configuration.  LLDP is an industry-standard analog of Cisco Discovery Protocol (CDP) which allows devices to discover one another and exchange information about basic capabilities.  Because DCBX relies on LLDP and is an acknowledged protocol (two-way communication), any link intended to support DCBX must have LLDP enabled on both sides for both Tx and Rx.  When a port has LLDP disabled for either Rx or Tx, DCBX is disabled on the port and DCBX Type-Length-Values (TLV) within received LLDP frames will be ignored.

DCBX capable devices should have DCBX enabled by default with the ability to administratively disable it.  This allows for more seamless deployments of DCB networks with less tendency for error.

Storage Protocols

Storage is a major consideration for cloud initiatives: what type of disk, which vendor, and, as importantly, which protocol?  Experts will tout one over another based on cost, performance, throughput, etc.  Let's take a look at the major storage protocols at play in the data center:

Small Computer System Interface (SCSI):

SCSI is the dominant block-level access method for disk in the data center.  Blocks are typically the smallest unit that can be read from or written to on a disk; they exist in various sizes depending on disk type and usage.  Block-level access means that the server can directly access the disk blocks without a file system in place on top of them; this is the opposite of the file-based storage discussed later.

SCSI has been in use since the early 1980s and was originally used to move data within a single server.  The operating system handled writes by using the SCSI protocol to talk to a SCSI drive controller, which managed one or more devices on a SCSI cable within a system chassis.  The SCSI controller ensured that only one device was active on the cable at any time, which prevented contention on the SCSI bus.  Because SCSI was managed by a single controller and contained within a system, the chances of data loss or contention were minimal; this meant SCSI did not require control mechanisms to handle data loss or contention the way networked protocols do.  SCSI is still widely used in its native format, but it has also been encapsulated into other protocols for use within storage networks for consolidated storage.

Fibre Channel (FC):

Fibre Channel was designed to extend the functionality of SCSI into point-to-point, loop, and switched topologies.  This allows for longer distances as well as storage consolidation.  FC encapsulates SCSI data and Command Descriptor Blocks (CDB) into the payload of Fibre Channel frames.  Fibre Channel networks provide the addressing, routing, and flow control required to support SCSI data.  Additionally, Fibre Channel networks are designed to meet the needs of SCSI by providing 'lossless,' in-order delivery.  This means that in a stable network FC frames will not be dropped and are delivered in order, ensuring that the Upper Layer Protocols (ULP) will not be forced to reorder or resend frames.

Fibre Channel networks are typically carried over fiber-optic links on dedicated infrastructures.  These infrastructures are traditionally built in pairs as exact mirrors of one another.  This provides complete physical redundancy end-to-end.  Additionally, these networks provide high bandwidth and low latency.  FC networks come in 1/2/4/8 Gbps speeds, with 16/32 Gbps in the works.  Additionally, 10Gbps FC links are typically available on a proprietary basis for links between switches.

Internet Small Computer System Interface (iSCSI):

iSCSI takes SCSI data and CDBs and places them in the payload of TCP/IP packets.  This allows the SCSI protocol to be extended across existing IP infrastructures.  While IP is routable within the data center and across the WAN, iSCSI is not traditionally used/supported across routed boundaries (exceptions do exist.)  The draw of iSCSI has been that storage data can be extended across the existing infrastructure with minimal additional cost.

iSCSI has not gained the market share many have predicted over the years, due to flaws in the protocol and limitations of traditional Ethernet-based data center networks.  Until the standardization of 10 Gigabit Ethernet, most data centers relied on 1GE links, which were typically saturated already.  This meant implementing iSCSI required new switching infrastructure.  10GE has changed the bandwidth limits but still not catapulted iSCSI into the mainstream.  There are several reasons for this: one is the large existing investment in Fibre Channel, and another is the iSCSI protocol itself.

The problem with iSCSI from a protocol standpoint is that it takes the SCSI protocol, which expects lossless, in-order delivery, and places it in TCP/IP packets, which are designed to support heterogeneous WAN networks and frequently experience packet loss and out-of-order delivery.  This is done without providing any additional tools to either SCSI or TCP/IP for handling the SCSI payloads in the expected fashion.  This in no way means iSCSI is unusable or should be written off; it just means that additional considerations must be made when designing iSCSI networks, especially in enterprise or larger environments.

In order to provide proper performance for iSCSI on shared networks, Quality of Service (QoS), physical architecture, and jumbo frame support must be taken into account.  Because of these considerations many iSCSI networks have traditionally been placed on separate network hardware from the data center LAN (isolated iSCSI networks.)  This has minimized some of the benefits of consolidating on a single protocol.  With 10 Gigabit Ethernet and the standardization of Data Center Bridging (DCB), iSCSI looks more promising for a greater audience.  For more information on DCB see my previous post (http://www.definethecloud.net/?p=31.)

Fibre Channel over Ethernet (FCoE):

FCoE was ratified in 2009 and provides the functionality for moving native Fibre Channel across consolidated Ethernet networks.  FCoE relies on the DCB standards referenced above.  FCoE encapsulates full Fibre Channel frames inside Ethernet jumbo frame payloads.  Utilizing jumbo frames ensures that the FC frame is not fragmented or changed in any way.  The FCoE and DCB standards provide a robust tool set for consolidating existing Fibre Channel workloads on shared 10GE networks while providing the lossless, in-order delivery SCSI expects.  FCoE does not modify the existing Fibre Channel protocol suite and allows for the same management model, including zoning, LUN masking, etc.  FCoE has started gaining ground over the last two years, pushed by several large hardware vendors in the storage, network, and server markets.  For more information on FCoE see my post (http://www.definethecloud.net/?p=80.)

Common Internet File System (CIFS):

CIFS is a file-based storage protocol based on Server Message Block (SMB.)  This is a shared storage protocol typically used in Microsoft environments for file sharing.  Windows-based file shares rely on CIFS as the transfer protocol for the file-level data.  File-based storage relies on an underlying file system such as FAT32, XFS, NTFS, or otherwise, which differs from block-based storage, which does not.  File-level storage is an excellent medium for some applications but is not traditionally effective in others.  When an application needs direct block access to disk, file-based storage is not appropriate.  Deployments that fall into this category include some databases and most operating systems.

Network File System (NFS):

NFS is another file-based storage protocol.  NFS is traditionally used in Linux and UNIX environments.  NFS is also a widely used protocol for VMware environments and can offer several benefits for virtual machine storage.  As a file-based storage protocol NFS experiences many of the same limitations as stated for CIFS above.

Hyper Text Transfer Protocol (HTTP) and others:

When the cloud discussion leaves the data center (private/internal cloud) and moves up to the service provider level, such as Google, Amazon, or the telcos, the protocols listed above may not have the necessary scalability.  When you begin talking about supporting thousands of customers with multiple terabytes each, traditional storage protocols may not suffice.  It has to do with both the scalability of the systems and the administration of the disk.  iSCSI and FC both require a fair amount of management for the RAID groups, volumes, and LUNs, whereas CIFS and NFS require a fair amount for the security and volumes.  Protocols such as HTTP-based storage are being used to simplify storage configuration and increase its scalability.

Which is the right protocol to use when moving to the cloud?  Obviously there is only one answer!  As always in IT, 'it depends.'  Each protocol has its uses, benefits, and drawbacks.  The most important thing to remember is that most environments can benefit from more than one, or all, of these protocols.  Every application is different, and any given protocol may have advantages for a particular app.  The only universal truth in cloud storage is that protocol flexibility will be key.

Data Center Bridging

Data Center Bridging (DCB) is a group of IEEE standard protocols designed to support I/O consolidation.  DCB enables multiple protocols with very different requirements to run over the same Layer 2 10 Gigabit Ethernet infrastructure.  Because DCB is currently discussed along with Fibre Channel over Ethernet (FCoE), it's not uncommon for people to think of DCB as part of FCoE.  This is not the case: while FCoE relies on DCB for proper treatment on a shared network, DCB enhancements can be applied to any protocol on the network.  DCB support is being built into data center hardware and software from multiple vendors and is fully backward compatible with legacy systems (no forklift upgrades.)  For more information on FCoE see my post on the subject (http://www.definethecloud.net/?p=80.)

Network protocols typically have unique requirements with regard to latency, packet/frame loss, bandwidth, etc.  These differences have a large impact on the performance of a protocol in a shared environment.  Differences such as flow control and frame loss are the reason Fibre Channel networks have traditionally been separate physical infrastructures from Ethernet networks.  DCB is the set of tools that allows us to converge these networks without sacrificing performance or reliability.

Let's take a look at the DCB suite:

Priority Flow Control (PFC) 802.1Qbb:

PFC is a flow control mechanism designed to eliminate frame loss for specific traffic types on Ethernet networks.  Protocols such as the Small Computer System Interface (SCSI), which is used for block data storage, are very sensitive to data loss.  The SCSI protocol is the heart of Fibre Channel, which is used to extend SCSI from internal disk to centralized storage across a network.  In its native form on dedicated networks, Fibre Channel has tools to ensure that frames are not lost as long as the network is stable.  In order to move Fibre Channel across Ethernet networks that same 'lossless' behavior must be guaranteed; PFC is the tool that does that.

PFC uses a pause mechanism to allow a receiving device to signal a pause to the directly connected sending device prior to buffer overflow and packet loss.  While Ethernet has had a tool to do this for some time (802.3x pause), it has always operated at the link level.  This means that all traffic on the link would be paused, rather than just a selected traffic type.  Pausing a link carrying various I/O types would be a bad thing, especially for traffic such as IP telephony and streaming video.  Rather than pause an entire link, PFC sends a pause signal for a single Class of Service (CoS), which is part of the 802.1Q Ethernet header.  This allows up to eight classes to be defined and paused independently of one another.
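
To illustrate the difference between link-level pause and per-priority pause, here's a toy sketch.  The thresholds and data structures are invented; real PFC runs in switch and adapter hardware, not application code.

    # Toy model: per-priority (PFC) pause vs. link-level (802.3x) pause.
    buffer_fill = {0: 30, 3: 95, 5: 40}      # percent full, keyed by CoS value
    PAUSE_THRESHOLD = 90                      # invented watermark

    def pfc_pause_decisions(fill_by_cos):
        # Only the congested class gets paused; everything else keeps flowing.
        return {cos: fill >= PAUSE_THRESHOLD for cos, fill in fill_by_cos.items()}

    def legacy_pause_decision(fill_by_cos):
        # 802.3x pauses the whole link if any queue is congested.
        return any(fill >= PAUSE_THRESHOLD for fill in fill_by_cos.values())

    print("PFC per-CoS pause:", pfc_pause_decisions(buffer_fill))   # only CoS 3 pauses
    print("802.3x link pause:", legacy_pause_decision(buffer_fill)) # True: whole link pauses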

Congestion Management (802.1Qau):

When we begin pausing traffic in a network we have the potential to spread network congestion by creating choke points.  Imagine trying to drive past a football stadium (football or American football, pick your flavor) when the game is about to start.  You're stuck in gridlock traffic even though you're not going to the game; if you've got that image you're on the right track.  Congestion management is a set of signaling tools used to push that congestion out of the network core to the network edge (if you're thinking old-school FECN and BECN, you're not far off.)

Bandwidth Management (802.1Qaz):

Bandwidth management is a tool for simple, consistent application of bandwidth controls at Layer 2 on a DCB network.  It allows a specific traffic type to be guaranteed a percentage of available bandwidth based on its CoS.  For instance, on a 10GE access port carrying FCoE you could guarantee 40% of the bandwidth to FCoE.  This provides a 4Gbps lane for FCoE when needed but allows other traffic types to utilize that bandwidth when it's not in use for FCoE.
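
A small sketch of that guarantee math (the 40/60 split is just the example above; real ETS configuration lives in switch hardware and firmware):

    # Example ETS-style bandwidth guarantee arithmetic for a 10GE port.
    link_gbps = 10.0
    guarantees = {"fcoe": 0.40, "lan": 0.60}    # share of the link per traffic class

    for traffic_class, share in guarantees.items():
        print(f"{traffic_class}: guaranteed {share * link_gbps:.1f} Gbps minimum")

    # If FCoE is idle, the LAN class may borrow up to the full 10 Gbps;
    # the guarantee only matters under contention.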

Data Center Bridging Exchange (DCBX):

DCBX is a Layer 2 communication protocol that allows DCB-capable devices to communicate and to discover the edge of the DCB network, i.e. legacy devices.  DCBX not only allows passing of information but also provides tools for passing configuration.  This is key to the consistent configuration of DCB networks.  For instance, a DCB switch acting as a Fibre Channel over Ethernet Forwarder (FCF) can let an attached Converged Network Adapter (CNA) on a server know to tag FCoE frames with a specific CoS and to enable pause for that traffic type.

All in all the DCB features are key enablers for true consolidated I/O.  They provide a tool set for each traffic type to be handled properly independent of other protocols on the wire.  For more information on Consolidated I/O see my previous post Consolidated IO (http://www.definethecloud.net/?p=67.)

Consolidated I/O

Consolidated I/O (input/output) is a hot topic and has been for the last two years, but it's not a new concept.  We've already consolidated I/O once in the data center and forgotten about it: remember those phone PBXs before we replaced them with IP telephony?  The next step in consolidating I/O is getting management traffic, backup traffic, and storage traffic from centralized storage arrays to the servers on the same network that carries our IP data.  In the most general terms the concept is 'one wire.'  'Cable once' or 'one wire' allows a flexible I/O infrastructure with a greatly reduced cable count and a single network to power, cool, and administer.

Solutions to do this have existed and been used for years; iSCSI (SCSI storage data over IP networks) is one tool that has been commonly used.  The reason the topic has hit the mainstream over the last two years is that 10 Gigabit Ethernet was ratified, and we now have a common protocol with the bandwidth to support this type of consolidation.  Prior to 10GE we simply didn't have the bandwidth to effectively put everything down the same pipe.

The first thing to remember when discussing I/O consolidation is that, contrary to popular belief, I/O consolidation does not mean Fibre Channel over Ethernet (FCoE.)  I/O consolidation is all about using a single infrastructure and underlying protocol to carry any and all traffic types required in the data center.  The underlying protocol of choice is 10G Ethernet because it's lightweight and high bandwidth, and Ethernet itself is the most widely used data center protocol today.  Using 10GE and the IEEE standards for Data Center Bridging (DCB) as the underlying data center network, any and all protocols can be layered on top as needed on a per-application basis.  See my post on DCB for more information (http://www.definethecloud.net/?p=31.)  These protocols can be FCoE, iSCSI, UDP, TCP, NFS, CIFS, etc., or any combination of them all.

If you look at data centers today, most are already using a combination of these protocols but typically have two or more separate infrastructures to support them.  A data center that uses Fibre Channel heavily has two Fibre Channel networks (for redundancy) and one or more LAN networks.  These 'Fibre Channel shops' are typically still using additional storage protocols such as NFS/CIFS for file-based storage.  The cost of administering, powering, cooling, and eventually upgrading/refreshing these separate networks continues to grow.

Consolidating onto a single infrastructure not only provides obvious cost benefits but also provides the flexibility required for a cloud infrastructure.  Having a 'Cable Once' infrastructure allows you to provide the right protocol at the right time on an application basis, without the need for hardware changes.

Call it what you will: I/O consolidation, network convergence, or network virtualization, a cable-once topology that can support the right protocol at the right time is one of the pillars of cloud architectures in the data center.