NVGRE

The most viable competitor to VXLAN is NVGRE, which was proposed by Microsoft, Intel, HP and Dell.  It is another encapsulation technique intended to allow virtual network overlays across the physical network.  Both techniques also remove the scalability limits of VLANs, which are capped at 4096 IDs.  NVGRE uses Generic Routing Encapsulation (GRE) as the encapsulation method.  It uses 24 bits of the GRE Key field to carry the Tenant Network Identifier (TNI.)  Like VXLAN, this 24-bit space allows for 16 million virtual networks. 
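The encapsulation itself is compact enough to sketch in a few lines. This follows the wire layout of the eventual RFC 7637 (VSID/TNI in the upper 24 bits of the GRE Key, an 8-bit FlowID below); a rough illustration, not a full implementation:

```python
import struct

GRE_FLAGS_KEY_PRESENT = 0x2000      # K bit set: a Key field follows the base header
PROTO_TRANS_ETH_BRIDGING = 0x6558   # payload is a complete Ethernet frame

def nvgre_header(tni: int, flow_id: int = 0) -> bytes:
    """Build the 8-byte GRE header NVGRE prepends to the tenant frame."""
    if not 0 <= tni < 2**24:
        raise ValueError("TNI must fit in 24 bits")
    key = (tni << 8) | (flow_id & 0xFF)  # 24-bit tenant ID + 8-bit flow ID
    return struct.pack("!HHI", GRE_FLAGS_KEY_PRESENT, PROTO_TRANS_ETH_BRIDGING, key)
```

The outer Ethernet/IP headers and the original tenant frame would wrap around this header on the wire.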

image

While NVGRE provides optional support for broadcast via IP multicast, it does not rely on it for address learning as VXLAN does.  It instead leaves that to an as-yet-undefined control plane protocol.  This control plane protocol will handle the mappings between the “provider” address used in the outer header to designate the remote NVGRE end-point and the “customer” address of the destination.  The lack of reliance on flood-and-learn behavior replicated over IP multicast potentially makes NVGRE a more scalable solution.  This will be dependent on implementation and underlying hardware.

Another difference between VXLAN and NVGRE is multi-pathing capability.  In its current form NVGRE provides little ability to be properly load-balanced by ECMP.  To improve load balancing the draft suggests using multiple IP addresses per NVGRE host, which allows for more flows.  This is a common issue with tunneling mechanisms and is solved in VXLAN by using a hash of the inner frame as the UDP source port.  This provides for efficient load balancing by devices capable of 5-tuple balancing decisions.  There are other possible solutions proposed for NVGRE load balancing; we’ll have to wait and see how they pan out. 
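VXLAN's entropy trick is easy to illustrate. A sketch of deriving the UDP source port from a hash of the inner frame (CRC32 and the ephemeral port range here are illustrative choices; the draft only requires a hash over the inner headers):

```python
import zlib

def vxlan_udp_src_port(inner_frame: bytes) -> int:
    """Map a hash of the encapsulated frame into the ephemeral port range
    so that 5-tuple ECMP devices spread tunneled flows across paths."""
    port_min, port_max = 49152, 65535
    return port_min + zlib.crc32(inner_frame) % (port_max - port_min + 1)
```

Because every packet of one inner flow hashes identically, a flow stays on a single path while different flows spread across the ECMP fan-out.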

The last major difference between the two protocols is the use of jumbo frames.  VXLAN is intended to stay within a data center, where jumbo frame support is nearly ubiquitous, so it assumes that support is present and utilizes it.  NVGRE is intended to be usable between data centers and therefore includes provisions to avoid fragmentation.

Summary:

While NVGRE still needs much clarification, it is backed by some of the biggest companies in IT and has some potential benefits.  Even with the VXLAN-capable hardware ecosystem expanding quickly, you can expect to see support for NVGRE grow as well.  Layer 3 encapsulation techniques as a whole solve the scalability issues inherent in bridging.  Additionally, due to their routed nature they provide for loop-free, multi-pathed environments without the need for techniques such as TRILL and technologies based on it.  In order to reach the scale and performance required by tomorrow’s data centers our networks need to change; overlays such as these are one tool toward that goal.

Stateless Transport Tunneling (STT)

STT is another tunneling protocol along the lines of the VXLAN and NVGRE proposals.  As with both of those, the intent of STT is to provide a network overlay, or virtual network running on top of a physical network.  STT was proposed by Nicira and is therefore, not surprisingly, written from a software-centric view rather than the network-centric view of the other proposals.  The main advantage of the STT proposal is its ability to be implemented in a software switch while still benefiting from NIC hardware acceleration.  The other difference is its use of a 64-bit network ID rather than the 24-bit IDs used by NVGRE and VXLAN.

The hardware offload STT grants relieves the server CPU of a significant workload in high-bandwidth (10G+) systems.  This separates it from its peers, whose IP encapsulation in the soft switch negates the NIC’s LSO and LRO functions.  STT accomplishes this by having the software switch insert header information that makes the packet look like a TCP packet, along with the required network virtualization fields.  This allows the guest OS to send frames of up to 64K to the hypervisor, which are encapsulated and handed to the NIC for segmentation.  While this does allow the hardware offload to be utilized, STT’s use of a TCP-lookalike header that does not behave like TCP causes issues for many network appliances or “middle boxes.” 
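The offload win is easy to quantify. A back-of-envelope sketch, assuming a standard 1500-byte MTU and the resulting 1460-byte TCP MSS:

```python
import math

def tso_segment_count(payload_len: int, mss: int = 1460) -> int:
    """How many wire segments the NIC's segmentation offload cuts
    one large send into."""
    return math.ceil(payload_len / mss)

# A single 64K STT frame handed down by the guest becomes ~45 wire
# segments, cut by the NIC rather than by the host CPU.
segments = tso_segment_count(64 * 1024)
```

Without the offload, the soft switch would perform that segmentation (and checksumming) in the host CPU for every large send.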

STT is not expected to be ratified and is considered by some to have been proposed for informational purposes, rather than with the end goal of a ratified standard.  With its misuse of a valid TCP header it would be hard pressed for ratification.  STT does bring up the interesting issue of hardware offload.  The IP tunneling protocols mentioned above create extra overhead on host CPUs due to their inability to benefit from NIC acceleration techniques.  VXLAN and NVGRE are intended to be implemented in hardware to solve this problem.  Both VXLAN and NVGRE use a 24-bit network ID because they are intended to be implemented in hardware; this space provides for 16 million tenants.  Hardware implementation is coming quickly in the case of VXLAN, with vendors announcing VXLAN-capable switches and NICs. 

VXLAN Deep Dive – Part II

In part one of this post (http://www.definethecloud.net/vxlan-deep-dive) I covered the basic theory of operations and functionality of VXLAN.  This post dives deeper into how VXLAN operates on the network.

Let’s start with the basic concept that VXLAN is an encapsulation technique: the Ethernet frame sent by a VXLAN-connected device is encapsulated in an IP/UDP packet.  The most important point is that the result can be carried by any IP-capable device.  The only place added intelligence is required is at the network bridges known as VXLAN Tunnel End-Points (VTEPs), which perform the encapsulation/de-encapsulation.  This is not to say that benefit can’t be gained by adding VXLAN functionality elsewhere, just that it’s not required.

image

Providing Ethernet Functionality on IP Networks:

As discussed in Part 1, the source and destination IP addresses used for VXLAN are those of the source and destination VTEPs.  This means the source VTEP must know the destination VTEP in order to encapsulate the frame.  One method for this would be a centralized controller/database.  That being said, VXLAN is implemented in a decentralized fashion, not requiring a controller.  There are advantages and drawbacks to this.  While a centralized controller would provide methods for address learning and sharing, it would also potentially increase latency, require large software-driven mapping tables and add network management points.  We will dig deeper into the current decentralized VXLAN deployment model.

VXLAN maintains backward compatibility with traditional Ethernet and therefore must maintain some key Ethernet capabilities.  One of these is flooding (broadcast) and ‘flood and learn’ behavior.  I cover some of this behavior here (http://www.definethecloud.net/data-center-101-local-area-network-switching) but the summary is that when a switch receives a frame for an unknown destination (a MAC not in its table) it will flood the frame to all ports except the one on which it was received.  Eventually the frame will reach the intended device, and the reply sent by that device will allow the switch to learn the MAC’s location.  When switches see source MACs that are not in their tables they ‘learn’ (add) them.

VXLAN encapsulates over IP, and IP networks are typically designed for unicast (one-to-one) traffic, so there is no inherent flood capability.  To mimic flood-and-learn on an IP network, VXLAN uses IP multicast, which provides a method for delivering a packet to a group.  This use of IP multicast can be a contentious point in VXLAN discussions because most networks aren’t designed for it, support can be limited, and multicast itself can be complex depending on the implementation.

Within VXLAN, each VXLAN segment ID is subscribed to an IP multicast group.  Multiple VXLAN segments can share the same multicast group; this minimizes the number of groups required but increases unneeded network traffic.  When a device attaches to a VXLAN segment on a VTEP where that segment was not previously in use, the VTEP joins the IP multicast group assigned to that segment and starts receiving messages.

image

In the diagram above we see the normal operation in which the destination MAC is known and the frame is encapsulated in IP using the source and destination VTEP address.  The frame is encapsulated by the source VTEP, de-encapsulated at the destination VTEP and forwarded based on bridging rules from that point.  In this operation only the destination VTEP will receive the frame (with the exception of any devices in the physical path, such as the core IP switch in this example.)

image

In the example above we see an unknown MAC address (the MAC-to-VTEP mapping does not exist in the table.)  In this case the source VTEP encapsulates the original frame in an IP multicast packet with the destination IP of the associated multicast group.  This packet will be delivered to all VTEPs participating in the group.  Ideally those will only be VTEPs with connected devices on that VXLAN segment, but because multiple VXLAN segments can share an IP multicast group this is not always the case.  The VTEP with the connected destination device de-encapsulates and forwards normally, adding the mapping from the source VTEP if required.  Any other VTEP that receives the packet can learn the source VTEP/MAC mapping if required and then discard it.  This process is the same for other traditionally flooded frames, such as ARP.  The diagram below shows the logical topologies for both traffic types discussed.
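The decentralized behavior described above can be modeled in a few lines (names and table shapes are illustrative):

```python
class Vtep:
    """Toy model of a VTEP's forwarding state: a VNI-to-multicast-group
    map plus a learned (VNI, MAC) -> remote VTEP IP table."""

    def __init__(self, groups):
        self.groups = groups        # VXLAN segment ID -> multicast group IP
        self.mac_table = {}         # (vni, mac) -> remote VTEP IP

    def outer_dst(self, vni, dst_mac):
        """Known MAC: unicast to its VTEP; unknown: flood to the group."""
        return self.mac_table.get((vni, dst_mac), self.groups[vni])

    def learn(self, vni, src_mac, src_vtep_ip):
        """On receiving a packet, record which VTEP the inner source MAC
        lives behind."""
        self.mac_table[(vni, src_mac)] = src_vtep_ip
```

A miss sends traffic to the multicast group; once the reply is seen and `learn` runs, subsequent frames go unicast to the remote VTEP.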

image

As discussed in Part 1 VTEP functionality can be placed in a traditional Ethernet bridge.  This is done by placing a logical VTEP construct within the bridge hardware/software.  With this in place VXLANs can bridge between virtual and physical devices.  This is necessary for physical server connectivity, as well as to add network services provided by physical appliances.  Putting it all together the diagram below shows physical servers communicating with virtual servers in a VXLAN environment.  The blue links are traditional IP links and the switch shown at the bottom is a standard L3 switch or router.  All traffic on these links is encapsulated as IP/UDP and broken out by the VTEPs.

image

Summary:

VXLAN provides backward compatibility with traditional VLANs by mimicking broadcast and multicast behavior through IP multicast groups.  This functionality provides for decentralized learning by the VTEPs and negates the need for a VXLAN controller.

VXLAN Deep Dive

I’ve been spending my free time digging into network virtualization and network overlays.  This is part 1 of a 2-part series; part 2 can be found here: http://www.definethecloud.net/vxlan-deep-divepart-2.  By far the most popular virtualization technique in the data center is VXLAN.  This has as much to do with Cisco and VMware backing the technology as the tech itself.  That being said, VXLAN is targeted specifically at the data center and is one of many similar solutions, such as NVGRE and STT.  VXLAN’s goal is to allow dynamic, large-scale, isolated virtual L2 networks to be created for virtualized and multi-tenant environments.  It does this by encapsulating frames in VXLAN packets.  The standard for VXLAN is under the scope of the IETF NVO3 working group.

 

VXLAN Frame

The VXLAN encapsulation method is IP-based and provides for a virtual L2 network.  With VXLAN, the full Ethernet frame (with the exception of the Frame Check Sequence: FCS) is carried as the payload of a UDP packet.  VXLAN uses a 24-bit network identifier, carried in the VXLAN header shown in the diagram, to identify virtual networks.  This provides for up to 16 million virtual L2 networks.
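The header itself is small. A sketch of packing it per the VXLAN draft: one flags byte with the I bit set (VNI valid), reserved fields zeroed, and the 24-bit VNI in the second word:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header that precedes the inner Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    flags_word = 0x08 << 24        # I bit set; other flag/reserved bits zero
    return struct.pack("!II", flags_word, vni << 8)
```

On the wire this sits inside the UDP payload, followed immediately by the original Ethernet frame (minus its FCS).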

Frame encapsulation is done by an entity known as a VXLAN Tunnel Endpoint (VTEP.)  A VTEP has two logical interfaces: an uplink and a downlink.  The uplink is responsible for receiving VXLAN frames and acts as a tunnel endpoint with an IP address used for routing VXLAN encapsulated frames.  These IP addresses are infrastructure addresses and are separate from the tenant IP addressing for the nodes using the VXLAN fabric.  VTEP functionality can be implemented in software such as a virtual switch, or in a physical switch.

VXLAN frames are sent to the IP address assigned to the destination VTEP; this IP is placed in the Outer IP DA.  The IP of the VTEP sending the frame resides in the Outer IP SA.  Packets received on the uplink are mapped from the VXLAN ID to a VLAN and the Ethernet frame payload is sent as an 802.1Q Ethernet frame on the downlink.  During this process the inner MAC SA and VXLAN ID are learned in a local table.  Packets received on the downlink are mapped to a VXLAN ID using the VLAN of the frame.  A lookup is then performed within the VTEP L2 table using the VXLAN ID and destination MAC; this lookup provides the IP address of the destination VTEP.  The frame is then encapsulated and sent out the uplink interface.

image

Using the diagram above for reference a frame entering the downlink on VLAN 100 with a destination MAC of 11:11:11:11:11:11 will be encapsulated in a VXLAN packet with an outer destination address of 10.1.1.1.  The outer source address will be the IP of this VTEP (not shown) and the VXLAN ID will be 1001.
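This example reduces to two table lookups. A sketch using the example's values (the table contents are illustrative):

```python
def downlink_lookup(vlan, dst_mac, vlan_to_vni, l2_table):
    """Map the frame's VLAN to a VXLAN ID, then look up (VNI, dst MAC)
    to find the destination VTEP's IP. Returns (vni, outer_dst_ip);
    outer_dst_ip is None on a miss, which triggers the flood path."""
    vni = vlan_to_vni[vlan]
    return vni, l2_table.get((vni, dst_mac))

# The example from the text:
vlan_to_vni = {100: 1001}
l2_table = {(1001, "11:11:11:11:11:11"): "10.1.1.1"}
```

With these tables, a frame on VLAN 100 for 11:11:11:11:11:11 resolves to VNI 1001 and outer destination 10.1.1.1, matching the walk-through above.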

In a traditional L2 switch a behavior known as flood and learn is used for unknown destinations (i.e. a MAC not stored in the MAC table.)  This means that if there is a miss when looking up the MAC the frame is flooded out all ports except the one on which it was received.  When a response is sent the MAC is then learned and written to the table.  The next frame for the same MAC will not incur a miss because the table will reflect the port it exists on.  VXLAN preserves this behavior over an IP network using IP multicast groups.
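The classic behavior reduces to a dictionary and one branch; a minimal sketch:

```python
def switch_frame(mac_table, in_port, src_mac, dst_mac, all_ports):
    """One forwarding decision of a flood-and-learn bridge: learn the
    source MAC's port, then unicast on a table hit or flood on a miss."""
    mac_table[src_mac] = in_port                       # learn (or refresh)
    if dst_mac in mac_table:
        return [mac_table[dst_mac]]                    # known destination
    return [p for p in all_ports if p != in_port]      # miss: flood
```

The first frame floods; the reply populates the table, so the conversation settles into unicast.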

Each VXLAN ID has an assigned IP multicast group to use for traffic flooding (the same multicast group can be shared across VXLAN IDs.)  When a frame is received on the downlink bound for an unknown destination it is encapsulated using the IP of the assigned multicast group as the Outer DA; it’s then sent out the uplink.  Any VTEP with nodes on that VXLAN ID will have joined the multicast group and therefore receive the frame.  This maintains the traditional Ethernet flood and learn behavior.

VTEPs are designed to be implemented as a logical device on an L2 switch.  The L2 switch connects to the VTEP via a logical 802.1Q VLAN trunk.  This trunk contains a VXLAN infrastructure VLAN in addition to the production VLANs.  The infrastructure VLAN is used to carry VXLAN encapsulated traffic to the VXLAN fabric.  The only member interfaces of this VLAN are the VTEP’s logical connection to the bridge itself and the uplink to the VXLAN fabric.  This interface is the ‘uplink’ described above, while the logical 802.1Q trunk is the downlink.

image

Summary

VXLAN is a network overlay technology designed for data center networks.  It provides massively increased scalability over VLAN IDs alone while allowing for L2 adjacency over L3 networks.  The VXLAN VTEP can be implemented in both virtual and physical switches, allowing the virtual network to map to physical resources and network services.  VXLAN currently has both wide support and hardware adoption in switching ASICs and hardware NICs, as well as virtualization software.

Access Layer Network Virtualization: VN-Tag and VEPA

One of the highlights of my trip to lovely San Francisco for VMworld was getting to join Scott Lowe and Brad Hedlund for an off the cuff whiteboard session.  I use the term join loosely because I contributed nothing other than a set of ears.  We discussed a few things, all revolving around virtualization (imagine that at VMworld.)  One of the things we discussed was virtual switching, and Scott mentioned a total lack of good documentation on VEPA, VN-Tag and the differences between the two.  I’ve also found this to be true; the documentation that is readily available is sparse.

This blog is my attempt to demystify VEPA and VN-tag and place them both alongside their applicable standards, and by that I mean contribute to the extensive garbage info revolving around them both.  Before we get into them both we’ll need to understand some history and the problems they are trying to solve.

First let’s get physical.  Looking at a traditional physical access layer we have two traditional options for LAN connectivity: Top-of-Rack (ToR) and End-of-Row (EoR) switching topologies.  Both have advantages and disadvantages.

EoR:

EoR topologies rely on larger switches placed on the end of each row for server connectivity.

Pros:

Cons:

ToR:

ToR utilizes a switch at the top of each rack (or close to it.)

Pros:

Cons:

Now let’s virtualize.  In a virtual server environment the most common way to provide Virtual Machine (VM) switching connectivity is a Virtual Ethernet Bridge (VEB); in VMware this is called a vSwitch.  A VEB is basically software that acts like a Layer 2 hardware switch, providing inbound/outbound and inter-VM communication.  A VEB works well to aggregate multiple VMs’ traffic across a set of links, as well as to provide frame delivery between VMs based on MAC address.  Where a VEB is lacking is network management, monitoring and security.  Typically a VEB is invisible to, and not configurable by, the network team.  Additionally, any traffic handled internally by the VEB cannot be monitored or secured by the network team.

Pros:

Cons:

These are the two issues that VEPA and VN-Tag look to address in some way.  Now let’s look at the two individually and what each tries to solve.

Virtual Ethernet Port Aggregator (VEPA):

VEPA is a standard being led by HP to provide consistent network control and monitoring for Virtual Machines (of any type.)  VEPA has been used by the IEEE as the basis for 802.1Qbg ‘Edge Virtual Bridging.’  VEPA comes in two major forms: a standard mode, which requires minor software updates to the VEB functionality as well as upstream switch firmware updates, and a multi-channel mode, which requires additional intelligence on the upstream switch.

Standard Mode:

The beauty of VEPA in its standard mode is its simplicity; if you’ve worked with me you know I hate complex designs and systems, they just lead to problems.  In standard mode the software upgrade to the VEB in the hypervisor simply forces each VM frame out to the external switch regardless of destination.  This causes no change for destination MAC addresses external to the host, but for destinations within the host (another VM in the same VLAN) it forces that traffic to the upstream switch, which forwards it back instead of the VEB handling it internally; this is called a hairpin turn.  It’s this hairpin turn that requires the upstream switch to have updated firmware: typical STP behavior prevents a switch from forwarding a frame back down the port it was received on (like the saying goes, don’t egress where you ingress.)  The firmware update allows the physical host and the upstream switch to negotiate a VEPA port, which then allows this hairpin turn.  Let’s step through some diagrams to visualize this.
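The behavioral change is tiny; a sketch contrasting a VEB with standard-mode VEPA (names are illustrative):

```python
UPLINK = "uplink"

def egress(dst_mac, local_table, mode):
    """Where the hypervisor switch sends a VM's frame. A VEB delivers
    local VM-to-VM traffic internally; standard-mode VEPA forces every
    frame out to the adjacent physical switch, which may hairpin it back."""
    if mode == "veb" and dst_mac in local_table:
        return local_table[dst_mac]   # internal delivery, invisible to the network
    return UPLINK                     # VEPA: the physical switch sees every frame
```

The whole mode is effectively that one removed branch: no internal shortcut, so every frame crosses a port the network team can see and police.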

image  image

Again, the beauty of this VEPA mode is its simplicity.  VEPA simply forces VM traffic to be handled by an external switch.  This allows each VM frame flow to be monitored, managed and secured with all of the tools available to the physical switch.  This does not provide any type of individual tunnel for the VM, or a configurable switchport, but it does allow for things like flow statistics gathering, ACL enforcement, etc.  Basically we’re just pushing the MAC forwarding decision to the physical switch and allowing that switch to perform whatever functions it has available on each transaction.  The drawback is that we now perform one ingress and one egress for each frame that was previously handled internally, so there are bandwidth and latency considerations to be made.  Functions like Single Root I/O Virtualization (SR-IOV) and Direct Path I/O can alleviate some of the latency issues when implementing this.  Like any technology there are trade-offs that must be weighed.  In this case the added control and functionality should outweigh the bandwidth and latency additions.

Multi-Channel VEPA:

Multi-Channel VEPA is an optional enhancement to VEPA that also comes with additional requirements.  Multi-Channel VEPA allows a single Ethernet connection (switchport/NIC port) to be divided into multiple independent channels or tunnels, each of which acts as a unique connection to the network.  Within the virtual host these channels or tunnels can be assigned to a VM, a VEB, or a VEB operating in standard VEPA mode.  To achieve this, Multi-Channel VEPA utilizes a tagging mechanism commonly known as Q-in-Q (defined in 802.1ad) which adds a service tag, or ‘S-Tag’, in addition to the standard 802.1Q VLAN tag.  This provides the tunneling within a single pipe without affecting the 802.1Q VLAN.  This method requires Q-in-Q capability within both the NICs and upstream switches, which may require hardware changes.
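The tag stacking is straightforward to show. A sketch of building the S-Tag + C-Tag pair inserted after the source MAC (802.1ad TPID 0x88A8 for the service tag, 0x8100 for the inner customer tag):

```python
import struct

S_TAG_TPID = 0x88A8   # 802.1ad service tag
C_TAG_TPID = 0x8100   # standard 802.1Q customer tag

def qinq_tags(s_vid: int, c_vid: int, pcp: int = 0) -> bytes:
    """Build the 8 bytes of stacked VLAN tags: each tag is a 2-byte TPID
    followed by 2 bytes of PCP/DEI/VID."""
    for vid in (s_vid, c_vid):
        if not 0 <= vid < 4096:
            raise ValueError("VLAN IDs are 12 bits")

    def tci(vid: int) -> int:
        return (pcp << 13) | vid   # priority in the top 3 bits, VID in the low 12

    return struct.pack("!HHHH", S_TAG_TPID, tci(s_vid), C_TAG_TPID, tci(c_vid))
```

The outer S-Tag identifies the channel; the inner C-Tag is the tenant's ordinary VLAN, untouched by the channelization.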

image

VN-Tag:

The VN-Tag standard was proposed by Cisco and others as a potential solution to both of the problems discussed above: network awareness and control of VMs, and access layer extension without extending management and STP domains.  VN-Tag is the basis of 802.1Qbh ‘Bridge Port Extension.’  With VN-Tag an additional header is added to the Ethernet frame, allowing individual identification of virtual interfaces (VIFs.)

image

The tag contents perform the following functions:

Ethertype: identifies the frame as carrying a VN-Tag.

D: direction; 1 indicates that the frame is traveling from the bridge to the interface virtualizer (IV.)

P: pointer; 1 indicates that a vif_list_id is included in the tag.

vif_list_id: a list of downlink ports to which this frame is to be forwarded (replicated) for multicast/broadcast operation.

dvif_id: destination vif_id of the port to which this frame is to be forwarded.

L: looped; 1 indicates that this is a multicast frame that was forwarded out the bridge port on which it was received.  In this case, the IV must check the svif_id and filter the frame from the corresponding port.

R: reserved.

VER: version of the tag.

svif_id: the vif_id of the source of the frame.

The most important components of the tag are the source and destination VIF IDs which allow a VN-Tag aware device to identify multiple individual virtual interfaces on a single physical port.
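The fields above can be collected into a simple model. This deliberately captures meaning rather than the wire layout (bit widths and packing are omitted); the filtering helper follows the looped-frame rule described for the L bit:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VNTag:
    """Field-by-field model of the VN-Tag contents listed above."""
    d: int                             # direction: 1 = bridge -> interface virtualizer
    p: int                             # pointer: 1 = vif_list_id present
    dvif_id: int                       # destination virtual interface
    l: int                             # looped: frame returned on its arrival port
    ver: int                           # tag version
    svif_id: int                       # source virtual interface
    vif_list_id: Optional[int] = None  # replication list for multicast/broadcast

    def must_filter(self, vif: int) -> bool:
        """Per the L bit rule: the IV drops a looped multicast frame on
        the port it originally came from."""
        return bool(self.l) and self.svif_id == vif
```

The source/destination VIF IDs are what let one physical port carry many independently switched virtual interfaces.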

VN-Tag can be used to uniquely identify and provide frame forwarding for any type of virtual interface (VIF.)  A VIF is any individual interface that should be treated independently on the network but shares a physical port with other interfaces.  Using a VN-Tag capable NIC or software driver these interfaces could potentially be individual virtual servers.  These interfaces can also be virtualized interfaces on an I/O card (i.e. 10 virtual 10G ports on a single 10G NIC), or a switch/bridge extension device that aggregates multiple physical interfaces onto a set of uplinks and relies on an upstream VN-tag aware device for management and switching.

image

Because of VN-Tag’s versatility it’s possible to utilize it for both bridge extension and virtual networking awareness.  It also has the advantage of allowing individual configuration of each virtual interface as if it were a physical port.  The disadvantage of VN-Tag is that because it adds fields to the Ethernet frame, the hardware itself must typically be modified to work with it.  VN-Tag-aware switch devices are still fully compatible with traditional Ethernet switching devices because the VN-Tag is only used within the local system.  For instance, in the diagram above VN-Tags would be used between the VN-Tag-aware switch at the top of the diagram and the VIFs, but that switch could be attached to any standard Ethernet switch.  VN-Tags are written on ingress to the VN-Tag-aware switch for frames destined for a VIF, and stripped on egress for frames destined for the traditional network. 

Where does that leave us?

We are still very early in the standards process for both 802.1Qbh and 802.1Qbg, and things are subject to change.  From what it looks like right now the standards bodies will utilize VEPA as the basis for providing physical-type network controls to virtual machines, and VN-Tag for bridge extension.  Because of the way each is handled they will be compatible with one another, meaning a VN-Tag-based bridge extender would be able to support VEPA-aware hypervisor switches.

Equally important is what this means for today and today’s hardware.  There is plenty of Fear, Uncertainty and Doubt (FUD) material out there intended to prevent product purchase because the standards process isn’t complete.  The question becomes what’s true and what isn’t; let’s take care of the answers FAQ style:

Will I need new hardware to utilize VEPA for VM networking?

No; for standard VEPA mode only a software change will be required on the switch and within the hypervisor.  For Multi-Channel VEPA you may require new hardware, as it utilizes Q-in-Q tagging, which is not typically an access layer switch feature.

Will I need new hardware to utilize VN-Tag for bridge extension?

Yes; VN-Tag bridge extension will typically be implemented in hardware, so you will require a VN-Tag-aware switch as well as VN-Tag-based port extenders.

Will hardware I buy today support the standards?

That question really depends on how much change occurs with the standards before finalization and which tool you’re looking to use:

Are there products available today that use VEPA or VN-Tag?

Yes, Cisco has several products that utilize VN-Tag: the Virtual Interface Card (VIC), the Nexus 2000, and the UCS I/O Module (IOM.)  Additionally, HP’s FlexConnect technology is the basis for Multi-Channel VEPA.

Summary:

VEPA and VN-Tag both look to address common access layer network concerns and both are well on their way to standardization.  VEPA looks to be the chosen method for VM-aware networking and VN-Tag for bridge extension.  Devices purchased today that rely on pre-standards versions of either protocol should maintain compatibility as the standards progress, but it’s not guaranteed.  That being said, standards are not required for operation and effectiveness, and most begin as unique features which are then submitted to a standards body.

Fibre Channel over Ethernet

Fibre Channel over Ethernet (FCoE) is a protocol standard ratified in June of 2009.  FCoE provides the tools for encapsulation of Fibre Channel (FC) in 10 Gigabit Ethernet frames.  The purpose of FCoE is to allow consolidation of low-latency, high performance FC networks onto 10GE infrastructures.  This allows for a single network/cable infrastructure which greatly reduces switch and cable count, lowering the power, cooling, and administrative requirements for server I/O.

FCoE is designed to be fully interoperable with current FC networks and require little to no additional training for storage and IP administrators.  FCoE operates by encapsulating native FC into Ethernet frames.  Native FC is considered a 'lossless' protocol, meaning frames are not dropped during periods of congestion.  This is by design in order to ensure the behavior expected by the SCSI payloads.  Traditional Ethernet does not provide the tools for lossless delivery on shared networks, so enhancements were defined by the IEEE to provide appropriate transport of encapsulated Fibre Channel on Ethernet networks.  These standards are known as Data Center Bridging (DCB), which I've discussed in a previous post (http://www.definethecloud.net/?p=31.)  These Ethernet enhancements are fully backward compatible with traditional Ethernet devices, meaning DCB-capable devices can exchange standard Ethernet frames seamlessly with legacy devices.  The full 2148-byte FC frame is encapsulated in an Ethernet jumbo frame, avoiding any modification/fragmentation of the FC frame.
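The jumbo-frame requirement falls straight out of the numbers. A sketch, treating the per-frame FCoE encapsulation overhead inside the Ethernet payload as an illustrative ~18 bytes:

```python
FC_MAX_FRAME = 2148     # bytes: the full FC frame, carried unmodified
FCOE_OVERHEAD = 18      # illustrative: FCoE header/trailer bytes around the FC frame

def fits_in_one_frame(mtu: int) -> bool:
    """Can a maximum-size FC frame plus encapsulation ride in a single
    Ethernet payload of the given MTU, with no fragmentation?"""
    return mtu >= FC_MAX_FRAME + FCOE_OVERHEAD
```

A standard 1500-byte MTU fails the check, which is why FCoE links run with 'baby jumbo' or larger MTUs.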

FCoE itself takes FC layers 2-4 and maps them onto Ethernet layers 1-2, replacing the FC-0 physical layer and FC-1 encoding layer.  This mapping between Ethernet and Fibre Channel is done through a Logical End-Point (LEP), which can be thought of as a translator between the two protocols.  The LEP is responsible for providing the appropriate encoding and physical access for frames traveling from FC nodes to Ethernet nodes and vice versa.  There are two devices that typically act as FCoE LEPs: Fibre Channel Forwarders (FCF), which are switches capable of both Ethernet and Fibre Channel, and Converged Network Adapters (CNA), which provide the server-side connection for a FCoE network.  Additionally, the LEP operation can be done using a software initiator and traditional 10GE NICs, but this places extra workload on the server processor rather than offloading it to adapter hardware.

One of the major advantages of replacing FC layers 0-1 when mapping onto 10GE is the encoding overhead.  8G Fibre Channel uses 8b/10b encoding, which adds 25% protocol overhead; 10GE uses 64b/66b encoding, which has roughly 3% overhead, dramatically reducing the protocol overhead and increasing throughput.  The second major advantage is that FCoE maintains FC layers 2-4, which allows seamless integration with existing FC devices and preserves the Fibre Channel tool set such as zoning, LUN masking, etc.  In order to provide FC login capabilities, multi-hop FCoE networks, and FC zoning enforcement on 10GE networks, FCoE relies on another standard known as the FCoE Initialization Protocol (FIP), which I will discuss in a later post.
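The arithmetic behind those overhead figures:

```python
def encoding_efficiency(data_bits: int, line_bits: int) -> float:
    """Fraction of raw line bits that carry payload for a block code."""
    return data_bits / line_bits

fc8_eff = encoding_efficiency(8, 10)    # 8b/10b: 80% efficient (25% overhead on the data)
ge10_eff = encoding_efficiency(64, 66)  # 64b/66b: ~97% efficient (~3% overhead)
```

Put differently, 8b/10b spends 2 of every 10 line bits on encoding, while 64b/66b spends only 2 of every 66, which is most of where 10GE's throughput advantage per raw baud comes from.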

Overall FCoE is one protocol to choose from when designing converged networks, or cable-once architectures.  The most important thing to remember is that a true cable-once architecture doesn't make you choose your Upper Layer Protocol (ULP) such as FCoE, only your underlying transport infrastructure.  If you choose 10GE the tools are now in place to layer any protocol of your choice on top, when and if you require it.

Thanks to my colleagues who recently provided a great discussion on protocol overhead and frame encoding...