Data Center 101: Local Area Network Switching

Interestingly enough 2 years ago I couldn’t even begin to post an intelligent blog on Local Area Networking 101, funny how things change.  That being said I make no guarantees that this post will be intelligent in any way.  Without further ado let’s get into the second part of the Data Center 101 series and discuss the LAN.

I find the best way to understand a technology is to have a grasp on its history and the problems it solves, so let’s take a minute to dive into the history of the LAN.  For the sake of simplicity and real world applicability I’m going to stick to Ethernet as it is the predominant LAN technology in today’s data center environments.  Before we even go into the history we’ll define Ethernet and where it fits on the OSI model.

Ethernet:

Ethernet is a frame-based networking technology comprising a set of standards for Layers 1 and 2 of the OSI model.  Ethernet devices use an address called a Media Access Control (MAC) address for communication.  MAC addresses form a flat address space that is not routable (they can only be used on a flat Layer 2 network).  Each address is composed of several components, most importantly a vendor ID known as an Organizationally Unique Identifier (OUI) and a unique address for the individual port.
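To make the address layout concrete, here is a minimal Python sketch that splits a 48-bit MAC address into its OUI (the first three bytes) and the port-specific remainder (the last three bytes).  The function name `split_mac` and the sample address are made up for illustration and are not tied to any real vendor.

```python
from typing import Tuple

HEX_DIGITS = set("0123456789abcdef")


def split_mac(mac: str) -> Tuple[str, str]:
    """Split a 48-bit MAC address into (OUI, port-specific part)."""
    # Accept either ':' or '-' between bytes and normalise to lowercase.
    octets = mac.replace("-", ":").lower().split(":")
    if len(octets) != 6 or any(
        len(o) != 2 or not set(o) <= HEX_DIGITS for o in octets
    ):
        raise ValueError("not a 48-bit MAC address: %r" % mac)
    # First three bytes identify the vendor, last three the individual port.
    return ":".join(octets[:3]), ":".join(octets[3:])


# Hypothetical address, chosen only to show the split.
oui, device = split_mac("00:11:22:AA:BB:CC")
print("OUI:", oui)
print("port-specific part:", device)
```

Nothing in either half says where the device sits in the network, which is why the address space is flat and not routable.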

OSI Model:

The Open-Systems Interconnection (OSI) model is a sub-division of the components of communication that is used as a tool to create interoperable network systems and is a fantastic model for learning networks.  The OSI model breaks into 7-Layers much like my favorite taco dip.

image

Understanding the OSI model and where protocols and hardware fit into it will not only help you learn but also help with understanding new technologies and how they fit together.  I often go back to placing concepts in terms of the OSI model when having highly technical discussions about new concepts and technology.  The beauty of the model is that it allows for easy interoperability and flexibility.  For instance, Ethernet is still Ethernet whether you use fiber cables or copper cables because only Layer 1 is changing.

Ethernet LAN History:

As the LANs we use today evolved, they typically started with individual groups within an organization.  For instance, a particular group would have a requirement for a database server and would purchase a device to connect that group.  That device was commonly a hub.

Hub:

A network hub is a device with multiple ports used to connect several devices for the purposes of network communication.  When an Ethernet hub receives a frame it replicates it to all connected ports except the one it received it on in a process called flooding.  All connected devices receive a copy of the frame and will typically only process the frame if the destination MAC address is their own (there are exceptions to this which are beyond the scope of this discussion.)

image

In the diagram above you see a single device sending a frame and that frame being flooded to all other active ports.  This works quite well for small networks consisting of a single hub and low port count, but you can easily see where problems start to arise as the network grows.

image

Once multiple hubs are connected and the network grows each hub will flood every frame, and all devices will receive these frames regardless of whether they are the intended recipient.  This causes major overhead in the network due to the unneeded frames consuming bandwidth.
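The flooding behavior described in this section can be sketched in a few lines of Python.  This is only an illustration: the function names and the shortened `mac-a` style addresses are placeholders I made up, not anything a real hub exposes.

```python
def hub_flood(attached, ingress_port):
    """Return {port: mac} for every port the hub copies the frame to."""
    # A hub has no MAC table: it repeats the frame on all ports but the ingress.
    return {port: mac for port, mac in attached.items() if port != ingress_port}


def accepting_ports(attached, ingress_port, dst_mac):
    """Ports whose device processes the frame (destination MAC is its own)."""
    flooded = hub_flood(attached, ingress_port)
    return [port for port, mac in flooded.items() if mac == dst_mac]


# Hypothetical four-port hub; MAC values are shortened placeholders.
attached = {1: "mac-a", 2: "mac-b", 3: "mac-c", 4: "mac-d"}

print(sorted(hub_flood(attached, ingress_port=1)))   # ports that get a copy
print(accepting_ports(attached, 1, "mac-c"))         # ports that process it
```

Three ports receive a copy but only one device does anything with it.  The other two copies are the wasted bandwidth, and that waste grows with every hub added.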

Bridge:

The next step in the network evolution is called bridging and was designed to alleviate this problem and decrease the overhead of forwarding unneeded frames.  A bridge is a device that makes an intelligent decision on when and where to flood frames based on MAC addresses stored in a table.  These MAC addresses can be static (manually input) or dynamic (learned on the fly.)  Because it is more common we will focus on dynamic.  The original bridges were typically 2 or more ports (low port counts) and could separate MAC addresses using the table for those ports.

image

In the above diagram you see a hub operating normally on the left flooding the frame to all active ports.  When the frame is received by the bridge a MAC address lookup is done on the MAC table and the bridge makes a decision whether or not to flood to the other side of the network.  Because the frame in this example is destined for a MAC address existing on the left side of the network the bridge does not flood the frame.  These addresses will be learned dynamically as devices send frames.  If the destination MAC address had been a device on the right side of the network the bridge would have sent the frame to that side to be flooded by the hub.
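The bridge's decision in this example boils down to one comparison.  Here is a rough Python sketch of it, assuming a two-sided bridge whose table maps each learned MAC to `left` or `right`; the function name and the placeholder addresses are mine.

```python
def bridge_forwards(mac_table, dst_mac, arrived_on):
    """Return True if the bridge passes the frame to the other side."""
    known_side = mac_table.get(dst_mac)  # None if the MAC has not been learned
    # Filter only when the destination is known to live on the arrival side.
    # An unknown destination must be forwarded so it can still be reached.
    return known_side != arrived_on


# Hypothetical table: two devices on the left segment, one on the right.
mac_table = {"mac-a": "left", "mac-b": "left", "mac-c": "right"}

print(bridge_forwards(mac_table, "mac-b", arrived_on="left"))  # same side
print(bridge_forwards(mac_table, "mac-c", arrived_on="left"))  # other side
print(bridge_forwards(mac_table, "mac-x", arrived_on="left"))  # not learned
```

Only the first frame is kept off the right-hand network; the other two cross the bridge.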

Bridges reduced unnecessary network traffic between groups or departments while allowing resource sharing when needed.  The limitation of original bridges came from the low port counts and changing data patterns.  Because the bridges were typically only separating 2-4 networks there was still quite a bit of flooding, especially when more and more resources were shared across groups.

Switches:

Switches are the next evolution of bridges and the operation they perform is still considered bridging.  In very basic terms a switch is a high port-count bridge that is able to make decisions on a port-by-port basis.  A switch maintains a MAC table and only forwards frames to the appropriate port based on the destination MAC.  If the switch has not yet learned the destination MAC it will flood the frame.  Switches and bridges will also flood multi-cast (traffic destined for multiple recipients) and broadcast (traffic destined for all recipients) frames which are beyond the scope of this discussion.

image

In the diagram above I have added several components to clarify switching operations now that we are familiar with basic bridging.  Starting in the top left of the diagram you see some of the information that is contained in the header of an Ethernet frame.  In this case it is the source and destination MAC addresses of two of the devices connected to the switch.  Each end-point in the above diagram is labeled with a MAC address starting with AF:AF:AF:AF:AF.  In the top right we see a representation of a MAC table which is stored on the switch and learned dynamically.  The MAC table contains a listing of which MAC addresses are known to be on each port.  Because the MAC table in this example is fully populated we can assume that the switch has previously seen a frame from each device in order to populate the table.  That auto-population is the ‘dynamic learning’ and it is done by recording the source MAC address of incoming frames.  Lastly we see that the frame being sent by the device on port 1 is only being forwarded to the device on port 2.  In the event port 2’s MAC address had not yet been learned the switch would be forced to flood the frame to all ports except the one it received it on in order to ensure it was received by the destination device.
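Learning and forwarding can be sketched as a small Python class.  Treat it as a toy model under simple assumptions: the class name `LearningSwitch` is mine, the last byte of each address is invented to complete the AF:AF:AF:AF:AF prefix from the diagram, and real switches also age entries out and handle broadcast and multicast, which this ignores.

```python
class LearningSwitch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, src_mac, dst_mac, in_port):
        """Process one frame and return the list of ports it is sent out of."""
        # Dynamic learning: the source MAC is reachable via the ingress port.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the port the frame came from: nothing to send.
            return []
        return [out_port]


sw = LearningSwitch(ports=[1, 2, 3, 4])
dev1, dev2 = "AF:AF:AF:AF:AF:01", "AF:AF:AF:AF:AF:02"  # assumed last bytes

first = sw.receive(dev1, dst_mac=dev2, in_port=1)  # dev2 not yet learned
reply = sw.receive(dev2, dst_mac=dev1, in_port=2)  # dev1 already learned
again = sw.receive(dev1, dst_mac=dev2, in_port=1)  # dev2 now learned
print(first, reply, again)
```

The first frame is flooded because the table is empty.  Once device 2 has sent a single frame of its own, traffic between the two devices touches only ports 1 and 2.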

So far we’ve learned that bridges improved upon hubs, and switches improved upon basic bridging.  The next kink in the evolution of Ethernet LANs came as our networks grew beyond single switches and we began adding in redundancy.

The three issues that arose can all be grouped as problems with network loops (specifically Layer 2 Ethernet loops.)  These issues are:

Multiple Frame Copies:

When a device receives the same frame more than once due to replication or loop issues it is a multiple frame copy.  This can cause issues for some hardware and software and also consumes additional unnecessary bandwidth.

MAC Address Instability:

When a switch must repeatedly change its MAC table entry for a given device this is considered MAC address instability.

Broadcast Storms:

Broadcast storms are the most serious of the three issues as they can literally bring all traffic to a halt.  If you ask someone who has been doing networking for quite some time how they troubleshoot a broadcast storm you are quite likely to hear ‘Unplug everything and plug things back in one at a time until you find the offending device.’  The reason for this is that in the past the storm itself would soak up all available bandwidth leaving no means to access switching equipment in order to troubleshoot the issue.  Most major vendors now provide protection against this level of problem but storms are still a serious problem that can have a major performance impact on production data.  Broadcast storms are caused when a broadcast, multi-cast or flooded frame is repeatedly forwarded and replicated by one or more switches.

image

In the diagram above we can see a switched loop.  We can also observe several stages of frame forwarding, starting with device 1 in the top left sending a frame to device 2 in the top right.

  1. Device 1 forwards a frame to device 2.  This one-to-one communication is known as unicast.
  2. The switch on the top left does not yet have device 2 in its MAC table therefore it is forced to flood the frame, meaning replicate the frame to all ports except the one where it was received. 
  3. In stage three we see two separate things occur:
    1. The switch in the top right delivers the frame to the intended device (for simplicity’s sake we are assuming the switch in the top right already has a MAC table entry for the device.)
    2. The bottom switch having received the frame forwards the frame to the switch in the top right.
  4. The switch in the top right receives the second copy and forwards it based on MAC table delivering the second copy of the same frame to device 2.

 

image

The above example has a little more going on and can become confusing quickly.  For the purposes of this example assume all three switches have blank MAC address tables with no devices known.  Also remember that they are building the MAC table dynamically based on the source MAC address they see in a frame.  To aid in understanding I will fill out the MAC tables at each step.

1. Our first stage is the easy one.  Device 1 forwards a unicast frame to device 2.  Switch A receives this frame on the top port.

image

2. When switch A receives the frame it checks its MAC table for the correct port to forward frames to device 2.  Because its MAC table is currently blank it must flood the frame (replicate it to all ports except the one where it was received.)  As it floods the frame it also records the MAC address and attached port of device 1 because it has seen this MAC as the source in the frame.

image

3. In stage 3 two switches receive the frame and must make decisions. 

  1. Switch C, having a blank MAC table, must flood the frame.  Because there is only one port other than the one it received the frame on, switch C floods the frame to that port.  At the same time it records the source MAC address as having been received on its port 1.
  2. Switch B also receives the frame from switch A, and must make a decision.  Like switch C, switch B has no records in its MAC table and must flood the frame.  It floods the frame down to switch C, and up to device 2.  At the same time switch B records the source MAC in its MAC table.

image

4. In the fourth stage we again have several things happening. 

  1. Switch C has received the same frame for the second time, this time from port 2.  Because it still has not seen the destination device it must flood the frame.  Additionally, because this is the exact same frame, switch C sees the MAC address of device 1 coming from its right port, port 2, and assumes the device has moved.  This forces switch C to change its MAC table.
  2. At the same time Switch B receives another copy of the frame.  Switch B seeing the same source address must change its MAC table and because it still does not have the destination MAC in the table it must flood the frame again.

image

In the above diagram pay close attention to the fact that the MAC tables have been changed for switch B and C.  Because they saw the same frame come from a different port they must assume the device has moved and change the table.  Additionally because the cycle has not been completed the loop will continue and this is one way broadcast storms begin.  More and more of these endless loops hit the network until there is no bandwidth left to serve data frames. 
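The walkthrough above can be replayed with a short Python simulation.  This is a sketch under stated assumptions: the three switches are wired in a triangle with device 1 on switch A and device 2 on switch B, ports are named after whatever is plugged into them, device 2 never sends a frame (so its MAC is never learned), and all names such as `TOPOLOGY` and `simulate` are mine.

```python
# Ports are named after whatever is plugged into them.
TOPOLOGY = {
    "A": ["dev1", "B", "C"],
    "B": ["dev2", "A", "C"],
    "C": ["A", "B"],
}


def simulate(rounds):
    """Follow one unicast frame from dev1 to dev2 through blank MAC tables."""
    tables = {name: {} for name in TOPOLOGY}  # per-switch MAC table
    moves = {name: 0 for name in TOPOLOGY}    # times dev1's entry changed port
    copies_at_dev2 = 0
    in_flight = [("A", "dev1")]               # (receiving switch, ingress port)

    for _ in range(rounds):
        next_hop = []
        for switch, in_port in in_flight:
            # Learn (or re-learn) the source MAC on the ingress port.
            previous = tables[switch].get("dev1")
            if previous is not None and previous != in_port:
                moves[switch] += 1
            tables[switch]["dev1"] = in_port
            # dev2 is never learned in this run, so every switch floods.
            for out_port in TOPOLOGY[switch]:
                if out_port == in_port:
                    continue
                if out_port in TOPOLOGY:
                    next_hop.append((out_port, switch))
                elif out_port == "dev2":
                    copies_at_dev2 += 1
        in_flight = next_hop
    return tables, moves, copies_at_dev2, len(in_flight)


tables, moves, copies_at_dev2, still_looping = simulate(rounds=5)
print("copies delivered to dev2:", copies_at_dev2)
print("copies still circulating:", still_looping)
print("MAC table changes per switch:", moves)
```

After five rounds all three issues are visible: device 2 has received the same frame several times, every switch has rewritten its entry for device 1 more than once, and two copies are still circling the triangle in opposite directions.  In this small topology a single frame does not multiply, but it never drains either, and every additional flooded frame adds more copies that circulate forever.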

In this simple example it may seem that the easy solution is to not build loops like the triangle in my diagram.  This is actually the premise of the next Ethernet evolution we’ll discuss, but first let’s look at how easy it is to create loops just by adding redundancy.

image

In the diagram above we start with a non-redundant switch link.  This link is a single point of failure, and in the event a component fails devices on separate switches will be unable to communicate.  The simple solution is adding a second port for redundancy, with the assumed added benefit of having more bandwidth.  In reality, without another mechanism in place, adding the second link turns the physical view on the bottom left into the logical view on the bottom right, which is a loop.  This is where the next evolution comes into play.

Spanning-Tree Protocol (STP):

STP is defined in IEEE 802.1D and provides an automated method for building loop-free topologies based on a very simple algorithm.  The premise is to allow the switches to automatically configure a loop-free topology by placing redundant links in a blocked state.  Like a tree, this loop-free topology is built up from the root (root bridge) and branches out (switches) to the leaves (end-nodes) with only one path to get to each end-node.
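The tree idea can be illustrated with a few lines of Python.  To be clear, this is not the 802.1D algorithm: real STP elects the root and picks paths by exchanging messages between switches and comparing bridge IDs and path costs.  The sketch below simply assumes the root is already known and walks outward from it breadth-first, keeping the first link that reaches each switch and marking every other link as blocked.  The function name and the switch labels are mine.

```python
from collections import deque


def spanning_tree(links, root):
    """Return (active, blocked) link sets for a loop-free tree from root."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    visited = {root}
    active = set()
    queue = deque([root])
    while queue:
        switch = queue.popleft()
        for neighbor in sorted(neighbors[switch]):
            if neighbor not in visited:
                # First link to reach a switch becomes its only path to root.
                visited.add(neighbor)
                active.add(frozenset((switch, neighbor)))
                queue.append(neighbor)

    # Every link that is not part of the tree would close a loop.
    blocked = {frozenset(link) for link in links} - active
    return active, blocked


# The triangle of switches used earlier, with A assumed to be the root.
links = [("A", "B"), ("A", "C"), ("B", "C")]
active, blocked = spanning_tree(links, root="A")
print("active: ", sorted(tuple(sorted(l)) for l in active))
print("blocked:", sorted(tuple(sorted(l)) for l in blocked))
```

With the B to C link blocked there is exactly one path between any two switches, so a flooded frame has nowhere to circle back to.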

image

The way Spanning-Tree does this is by detecting redundant links and placing them in a ‘blocked’ state.  This means that the ports do not send or receive data frames.  In the event of a primary link failure (designated port) the blocked port is brought online.  The issue with Spanning-Tree is twofold:

  • Because it blocks ports to prevent loops, potential bandwidth is wasted.
  • In failure events Spanning-Tree can take up to 50 seconds to bring the blocked port into an active state, which means up to 50 seconds of potential downtime for the link.
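The 50-second figure is the sum of the classic 802.1D timers, assuming they are left at their default values: up to 20 seconds for the stale topology information to age out (max age), followed by 15 seconds in the listening state and 15 seconds in the learning state (the forward delay, applied twice).  The variable names below are mine.

```python
# Default classic 802.1D timer values, in seconds (assumed unchanged).
MAX_AGE = 20        # time before stale topology information is discarded
FORWARD_DELAY = 15  # time spent in each of the listening and learning states

worst_case_seconds = MAX_AGE + 2 * FORWARD_DELAY
print(worst_case_seconds)  # 50
```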

Multiple versions of STP have been implemented and standardized to improve upon the original 802.1d specification.  These include:

Per-VLAN Spanning-Tree Protocol (PVSTP):

Performs the blocking algorithm independently for each VLAN, allowing greater bandwidth utilization because a link blocked for one VLAN can still forward traffic for another.

Rapid Spanning-Tree Protocol (RSTP):

Uses additional port-types not in the original STP specification to allow faster convergence during failure events.

Per-VLAN Rapid Spanning-Tree (PVRSTP):

Provides rapid spanning-tree functionality on a per VLAN basis.

Other STP implementations exist, and the details of STP operation in each of its flavors are beyond the scope of what I intend to cover with the 101 series.  If there is demand, these concepts may be covered in a more in-depth 202 series once this series is completed.

Summary:

Ethernet networking has evolved quite a bit over the years and is still a work in progress.  Understanding the hows and whys of where we are today will help in understanding the advancements that continue to come.  If you have any comments, questions, or corrections please leave them in the comments or contact me in any of the ways listed on the about page.
