Data Center 101: Server Virtualization

Virtualization is a key piece of modern data center design.  Virtualization occurs on many devices within the data center; conceptually, it is the ability to create multiple logical devices from one physical device.  We’ve been virtualizing hardware for years: VLANs and VRFs on the network, volumes and LUNs on storage, and even servers as far back as the 1970s with LPARs.  Server virtualization hit the mainstream in the data center when VMware began effectively partitioning clock cycles on x86 hardware, allowing virtualization to move from big iron to commodity servers.

This post is the next segment of my Data Center 101 series and will focus on server virtualization, specifically virtualizing x86/x64 server architectures.  If you’re not familiar with the basics of server hardware, take a look at ‘Data Center 101: Server Architecture’ (http://www.definethecloud.net/?p=376) before diving in here.

What is server virtualization:

Server virtualization is the ability to take a single physical server system and carve it up like a pie (mmmm pie) into multiple virtual hardware subsets. 

Each Virtual Machine (VM), once created, or carved out, operates in a similar fashion to an independent physical server.  Typically each VM is provided with a set of virtual hardware on which an operating system and a set of applications can be installed as if it were a physical server.

Why virtualize servers:

Virtualization has several benefits when done correctly, including:

- Increased hardware utilization, reducing the number of physical servers required
- Reduced power, cooling, and space requirements
- Depending on the platform, increased uptime, distributed resource utilization, and simplified management

How does virtualization work?

Typically within an enterprise data center servers are virtualized using a bare-metal hypervisor: a virtualization operating system that installs directly on the server without the need for a supporting operating system.  In this model the hypervisor is the operating system and the virtual machine is the application.


Each virtual machine is presented a set of virtual hardware upon which an operating system can be installed.  The fact that the hardware is virtual is transparent to the operating system.  The key components of a physical server that are virtualized are:

- CPU cycles
- Memory capacity
- Disk capacity
- I/O bandwidth (network and storage)

At a very basic level memory and disk capacity, I/O bandwidth, and CPU cycles are shared amongst the virtual machines.  This allows multiple virtual servers to utilize a single physical server’s capacity while maintaining a traditional OS-to-application relationship.  The reason this does such a good job of increasing utilization is that you’re spreading several applications across one set of hardware.  Applications typically peak at different times, allowing for a more constant state of utilization.

For example, imagine an email server: typically it will peak at 9am, possibly again after lunch, and once more before quitting time.  The rest of the day it’s greatly underutilized (that’s why marketing email is typically sent late at night.)  Now picture a traditional backup server; these historically run at night when other servers are idle to prevent performance degradation.  In a physical model each of these servers would have been architected for peak capacity to support the max load, but most of the day they would be underutilized.  In a virtual model they can both be run on the same physical server and complement one another due to varying peak times.
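
A quick back-of-the-envelope illustration in Python (the hourly numbers are hypothetical, just to show the complementary-peak effect):

```python
# Hypothetical hourly CPU demand (% of one physical server) for two workloads.
email  = [5]*8 + [70, 40, 30, 25, 60, 35, 30, 25, 45, 20] + [5]*6   # peaks during the workday
backup = [60]*5 + [10]*13 + [50]*6                                  # runs overnight

combined = [e + b for e, b in zip(email, backup)]
print(max(email), max(backup))  # sized separately: one box for 70%, another for 60%
print(max(combined))            # one virtualized host peaks at 80%, not 130%
```

Two physical servers each sized for its own peak add up to 130% of one server’s capacity, yet a single host running both VMs never exceeds 80% in this example.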

Another example of the uses of virtualization is hardware refresh.  DHCP servers are a great example: they provide an automatic IP addressing system by leasing IP addresses to requesting hosts, with leases typically held for 30 days.  DHCP is not an intensive workload.  In a physical server environment it wouldn’t be uncommon to have two or more physical DHCP servers for redundancy.  Because of the light workload these servers would be built on the most minimal hardware available at the time.

If this physical server were 3-5 years old, replacement parts and service contracts would be hard to come by; additionally, because of hardware advancements the server may be more expensive to keep than to replace.  When looking for a refresh, the same hardware would not be available today, and even a minimal modern server would far exceed the application’s requirements.

The application requirements haven’t changed but hardware has moved on.  Therefore refreshing the same DHCP server with new hardware results in even greater underutilization than before.  Virtualization solves this by placing the same DHCP server on a virtualized host and tuning the virtual hardware to the application’s requirements while sharing the physical resources with other applications.

Summary:

Server virtualization has a great many benefits in the data center, and as such companies are adopting more virtualization every day.  The overall reduction in overhead costs such as power, cooling, and space, coupled with the increased hardware utilization, makes virtualization a no-brainer for most workloads.  Depending on the virtualization platform that’s chosen there are additional benefits such as increased uptime, distributed resource utilization, and improved manageability.

SMT, Matrix and Vblock: Architectures for Private Cloud

Cloud computing environments provide enhanced scalability and flexibility to IT organizations.  Many options exist for building cloud strategies: public, private, etc.  For many companies private cloud is an attractive option because it allows them to maintain full visibility and control of their IT systems.  Private clouds can also be further enhanced by merging private cloud systems with public cloud systems in a hybrid cloud.  This allows some systems to gain the economies of scale offered by public cloud while others are maintained internally.

Many more options exist and any combination of them is possible.  If private cloud is part of a company’s cloud strategy, there is a common set of building blocks required to design the computing environment.

[Diagram: the private cloud stack, consolidated hardware at the bottom, then virtualization, automation and monitoring, and a provisioning portal on top, with security spanning every layer]

In the diagram above we see that each layer builds upon the one below it.  Starting at the bottom we utilize consolidated hardware to minimize power, cooling, and space as well as the underlying managed components.  At the second tier of the private cloud model we layer on virtualization to maximize utilization of the underlying hardware while providing logical separation for individual applications.

If we stop at this point we have what most of today’s data centers are using to some extent, or moving to: a virtualized data center.  Without the next two layers we do not have a cloud/utility computing model.  The next two layers provide the real operational flexibility and organizational benefits of a cloud model.

To move our virtualized data center to a cloud architecture we next layer on automation and monitoring.  This layer provides the management and reporting functionality for the underlying architecture.  It could include monitoring systems, troubleshooting tools, chargeback software, hardware provisioning components, etc.  Next we add a provisioning portal to allow end-users or IT staff to provision new applications, decommission systems no longer in use, and add/remove capacity from a single tool.  Depending on the level of automation in place below, some things like capacity management may be handled without user/staff intervention.
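
As a minimal sketch of what that portal-to-automation hand-off might look like (the names and capacity figures here are hypothetical, not from any specific product):

```python
# Hypothetical automation-layer check behind a self-service provisioning portal.
CAPACITY  = {"cpu_ghz": 128.0, "ram_gb": 512, "disk_gb": 8000}
ALLOCATED = {"cpu_ghz": 0.0,   "ram_gb": 0,   "disk_gb": 0}

def provision_vm(name, **requested):
    # Verify capacity for every requested resource before touching the hypervisor.
    for resource, amount in requested.items():
        if ALLOCATED[resource] + amount > CAPACITY[resource]:
            raise RuntimeError(f"insufficient {resource} for {name}")
    for resource, amount in requested.items():
        ALLOCATED[resource] += amount
    print(f"provisioned {name}: {requested}")   # hand off to the hypervisor API here

provision_vm("mail01", cpu_ghz=8.0, ram_gb=16, disk_gb=200)
```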

The last piece of the diagram above is security.  While many private cloud discussions leave security out or minimize its importance, it is actually a key component of any cloud design.  When moving to private cloud, customers are typically building a new compute environment or totally redesigning an existing one.  This is the key time to design robust security in from end-to-end, because you’re not tied to previous mistakes (we all make them) or legacy design.  Security should be part of the initial discussion for each layer of the private cloud architecture and the solution as a whole.

Private cloud systems can be built with many different tools from various vendors.  Many of the software tools exist in both open source and licensed versions.  Additionally several vendors offer an end-to-end stack upon which to design and build a private cloud system.  The remainder of this post will cover three of the leading private cloud offerings: Secure Multi-Tenancy (SMT), HP Matrix, and Vblock.

Scope: This post is an overview of three excellent solutions for private cloud.  It is not a pro/con discussion or a feature comparison.  I would personally position any of the three architectures for a given customer dependent on customer requirements, existing environment, cloud strategy, business objectives, and comfort level.  As always please feel free to leave comments, concerns or corrections using the comment form at the bottom of the post.

Secure Multi-Tenancy (SMT):

Vendor positioning:  ‘This includes the industry’s first end-to-end secure multi-tenancy solution that helps transform IT silos into shared infrastructure.’

[Diagram: the SMT stack, VMware vSphere on Cisco Nexus, UCS, and MDS with NetApp storage]

SMT is a pairing of: VMware vSphere, Cisco Nexus, UCS, MDS, and NetApp storage systems.  SMT has been jointly validated and tested by the three companies, and a Cisco Validated Design (CVD) exists as a reference architecture.  Additionally a joint support network exists for customers building or using SMT solutions.

Unlike the other two systems SMT is a reference architecture a customer can build internally or along with a trusted partner.  This provides one of the two unique benefits of this solution.

Unique Benefits:

HP Matrix:

Vendor positioning:  ‘The industry’s first integrated infrastructure platform that enables you to reduce capital costs and energy consumption and more efficiently utilize the talent of your server administration teams for business innovation rather than operations and maintenance.’


Matrix is an integration of HP blades, HP storage, HP networking, and HP provisioning/management software.  HP has tested the interoperability of the proven components and software and integrated them into a single offering.

Unique benefits:

Vblock:

Vendor positioning:  ‘The industry's first completely integrated IT offering that combines best-in-class virtualization, networking, computing, storage, security, and management technologies with end-to-end vendor accountability.’


Vblocks are a combination of EMC software and storage, Cisco UCS, MDS and Nexus, and VMware virtualization.  Vblocks are complete infrastructure packages sold in one of three sizes based on the number of virtual machines.  Vblocks offer a thoroughly tested and jointly supported infrastructure with proven performance levels based on a maximum number of VMs.

Unique Benefits:

Summary:

Private cloud can provide a great many benefits when implemented properly, but like any major IT project the benefits are greatly reduced by mistakes and improper design.  Pre-designed and tested infrastructure solutions such as the ones above provide customers a proven platform on which to build a private cloud.

Why You’re Ready to Create a Private Cloud

I’m catching up on my reading and ran into David Linthicum’s ‘Why you're not ready to create a private cloud’ (http://www.infoworld.com/d/cloud-computing/why-youre-not-ready-create-private-cloud-458.)  It’s a great article and points out a major issue with private-cloud adoption – internal expertise.  The majority of data center teams don’t have the internal expertise required to execute effectively on private-cloud architectures.  This isn’t a knock on these teams; it’s a near impossibility to have and maintain that internal expertise.  Remember back when VMware was gaining adoption: nobody had virtualization knowledge, so they learned it on the fly.  As people became experts, many times they left the nest where they learned in search of bigger, better worms.  More importantly, because it was a learn-as-you-go process the environments were inherently problematic and were typically redesigned several times to maximize performance and benefit.

Looking at the flip side of that coin, what is the value to the average enterprise or federal data center in retaining a private cloud architect?  If they’re good at their job they only do it once.  Yes there will be optimization and performance assessments to maintain it, but that’s not typically a full time job. The question becomes:  Because you don’t have the internal expertise to build a private cloud should you ignore the idea or concept?  I would answer a firm no.

The company I work for has the ability, reseller agreements, service offerings, and expertise to execute on private clouds.  We’re capable of designing and deploying these solutions from the data center walls to the provisioning portals, with experts on hand that have experience in each aspect and enough overlap to tie it all together.  To put our internal capabilities in perspective, one of my company’s offerings is private cloud containers and ruggedized deployable private cloud racks.  These aren’t ‘throw some stuff in a box’ solutions; they are custom designed containers outfitted with shock absorption, right-sized power/cooling, custom rack rails providing full equipment serviceability, and private cloud IT architectures built on several industry leading integrated platforms.  That’s a very unique home grown offering for a systems integrator (typically DC containers are the space of IBM, Sun, etc.)  I accepted this position for these reasons, among others.

This is not an advertisement for my company but instead an example of why you’re ready to build private cloud infrastructures.  You should not expect to have the internal expertise to architect and build a private cloud infrastructure, you should utilize industry experts to assist with your transition.  There are two major methods of utilizing experts to assess, design, and deploy a private cloud: a capable reseller/solutions provider or a capable consultant/consulting firm.  Both methods have pros and cons.

Reseller/Systems Integrator:

Utilizing a reseller and systems integrator has some major advantages in the form of what is provided at no cost and having a one-stop shop for design, purchase, and deployment.  Typically when working with a reseller much of the upfront consulting and design is provided free, because it is considered pre-sales and the hardware sale is where they make their money.  With complex systems and architectural designs such as Virtual Desktop Infrastructure (VDI) and cloud architectures, don’t expect everything to be cost free, but good portions will be.  These types of deployments require in-depth assessment and planning sessions, some of which will be paid engagements but are typically low overall cost and vital to success.  For example, you won’t deploy VDI successfully without first understanding your applications in depth.  Application assessments are extended technical engagements.

Another advantage of using a reseller is that the hardware design, purchase, and installation can all be provided by the same company.  This simplifies the overall process and provides the ever so important ‘single-neck-to-choke.’  If something isn’t right before, during, or after the deployment, a good reseller will be able to help you coordinate actions to repair the issue without you having to call 10 separate vendors.

Lastly, a reseller of sufficient size to handle private cloud design and migration will have an extensive pool of technical resources to draw upon during the process, both internally and through vendor relationships, which means the team you’re working with has back-end support in several disciplines and product areas.

There are also some potential downsides to using a reseller that you’ll want to fully understand.  First, a reseller typically partners with a select group of vendors that they’ve chosen, which means the architectural design will tend to revolve around those vendors.  This is not necessarily a bad thing as long as those vendors are the right fit for your requirements and environment.

Obviously a reseller is in the business of making a sale, but a good reseller will leverage their industry knowledge and vendor relationships to build the right solution for the customer.  Note also that even if your reseller doesn’t partner with a specific vendor, they should be able to make appropriate arrangements to include anything you choose in your design.

Consultant/Consulting Firm:

Utilizing a consultant is another good option for designing and deploying a private cloud.  A good consultant can help assess the existing environment/organization and begin to map out an architecture and road map to build the private cloud.  One advantage of a consultant is the vendor independence you’ll have with an independent consultant or firm.  Once they’ve helped you map out the architecture and roadmap, they can typically work with you during the purchase process through vendors or resellers.

Some potential drawbacks to independent consultants will be identifying a reliable individual or team with the proper capabilities to outline a cloud strategy.  The best bet to minimize risk here is to use references from colleagues that have made the transition, trusted vendors, etc.  Excellent cloud architecture consultants exist; you’ll just need to find the right fit.

Hybrid Strategy:

These two options are not mutually exclusive.  In many cases I’d recommend working with a trusted reseller and utilizing an independent consultant as well.  There are benefits to this approach: the consultant can help ‘keep the reseller honest’ and should be able to provide alternative opinions and design considerations.

Summary:

Migrating to cloud is not an overnight process and most likely not something that can be planned for, designed, and implemented using all internal resources.  When making the decision to move to cloud, utilize the external resources available to you.  As one last word of caution, don’t even bother looking at cloud architectures until you’re ready to align your organization to the flexibility provided by cloud; a cloud architecture is of no value to a silo-driven organization (see my post ‘The Organizational Challenge’ for more detail: http://www.definethecloud.net/?p=122)

FlexFabric – Small Step, Right Direction

Note: I've added a couple of corrections below thanks to Stuart Miniman at Wikibon (http://wikibon.org/wiki/v/FCoE_Standards).  See the comments for more.

I’ve been digging a little more into the HP FlexFabric announcements in order to wrap my head around the benefits and positioning.  I’m a big endorser of a single network for all applications (LAN, SAN, IPT, HPC, etc.) and FCoE is my tool of choice for that right now.  While I don’t see FCoE as the end goal, mainly due to limitations on any network use of SCSI, which is the heart of FC, FCoE and iSCSI, I do see FCoE as the way to go for convergence today.  FCoE provides a seamless migration path for customers with an investment in Fibre Channel infrastructure and runs alongside other current converged models such as iSCSI, NFS, HTTP, you name it.  As such any vendor support for FCoE devices is a step in the right direction and provides options to customers looking to reduce infrastructure and cost.

FCoE is quickly moving beyond the access layer, where it has been available for two years now.  That being said, the access layer (server connections) is where it provides the strongest benefits for infrastructure consolidation, cabling reduction, and reduced power/cooling.  A properly designed FCoE architecture provides a large reduction in the overall components required for server I/O.  Let’s take a look at a very simple example using standalone servers (rack mount or tower.)

[Diagram: traditional ToR cabling on the left vs. FCoE ToR cabling on the right]

In the diagram we see traditional Top-of-Rack (ToR) cabling on the left vs. FCoE ToR cabling on the right.  This is for the access layer connections only.  The infrastructure and cabling reduction is immediately apparent for server connectivity: 4 switches, 4 cables, and 2-4 I/O cards reduced to 2, 2, and 2.  This assumes only 2 networking ports are being used, which is not the case in many environments, including virtualized servers.  For servers connected using multiple 1GE ports the savings are even greater.

Two major vendor options exist for this type of cabling today:

Brocade:

Note: Both Brocade data sheets list support for CEE, a proprietary pre-standard implementation of DCB; DCB is in the process of being standardized, with some parts ratified by the IEEE and some pending.  The terms do get used interchangeably, so whether this is a typo or an actual implementation will be something to discuss with your Brocade account team during the design phase.  Additionally Brocade specifically states use for Tier 3 and ‘some Tier 2’ applications, which suggests a lack of confidence in the protocol and may suggest a lack of commitment to support and future products.  (This is what I would read from it based on the data sheets and Brocade’s overall positioning on FCoE from the start.)

Cisco:

Note: The Nexus 7000 currently only supports the DCB standard, not FCoE.  FCoE support is planned for Q3CY10 and will allow for multi-hop consolidated fabrics.

Taking the noted considerations into account any of the above options will provide the infrastructure reduction shown in the diagram above for stand alone server solutions.

When we move into blade servers the options are different.  This is because blade chassis have built-in I/O components, which are typically switches.  Let’s look at the options for IBM and Dell, then take a look at what HP and FlexFabric bring to the table for HP C-Class systems.

IBM:

Note: Versions of the Nexus 4000 also exist for HP and Dell blades but have not been certified by those vendors; currently only IBM supports the device.  Additionally the Nexus 4000 is a standards-compliant DCB switch without FCF capabilities.  This means that it provides the lossless delivery and bandwidth management required for FCoE frames, along with FIP snooping for FC security on Ethernet networks, but does not handle functions such as encapsulation and de-encapsulation.  This means the Nexus 4000 can be used with any vendor’s FCoE forwarder (Nexus or Brocade currently) pending joint support from both companies.

Dell:

Both Dell and IBM offer Pass-Through technology which will allow blades to be directly connected as a rack mount server would.  IBM additionally offers two other options: using the Qlogic and BNT switches to provide FCoE capability to blades, and using the Nexus 4000 to provide FCoE to blades. 

Let’s take a look at the HP options for FCoE capability and how they fit into the blade ecosystem.

HP:

On the surface FlexFabric sounds like the way to go with HP blades, and it very well may be, but let’s take a look at what it’s doing for our infrastructure/cable consolidation.

[Diagram: FlexFabric blade chassis, FCoE within the chassis split into native FC and Ethernet uplinks toward the access/aggregation layer]

With the FlexFabric solution, FCoE exists only within the chassis and is split into native FC and Ethernet moving up to the access or aggregation layer switches.  This means that while the number of required chassis switch components and blade I/O cards is reduced from four to two, there has been no reduction in cabling.  Additionally HP has no announced roadmap for a multi-hop FCoE device, and their current offerings for ToR multi-hop are OEM Cisco or Brocade switches.  Because the HP FlexFabric switch is a Qlogic switch, any FC or FCoE implementation using FlexFabric connected to an existing SAN will be a mixed-vendor SAN, which can pose challenges with compatibility, feature/firmware disparity, and separate management models.

HP’s announcement to utilize the Emulex OneConnect adapter as the LAN on motherboard (LOM) adapter makes FlexFabric more attractive, but the benefits of that LOM would also be realized using the 10GE Pass-Through connected to a 3rd party FCoE switch, or a native Nexus 4000 in the chassis if HP were to approve and begin to OEM the product.

Summary:

As the title states, FlexFabric is definitely a step in the right direction, but it’s only a small one.  It definitely shows FCoE commitment, which is fantastic and should reduce the FCoE FUD flinging.  The main limitation is the lack of cable reduction and the overall FCoE portfolio.  For customers using, or planning to use, VirtualConnect to reduce the management overhead of the traditional blade architecture this is a great solution to reduce chassis infrastructure.  For other customers it would be prudent to seriously consider the benefits and drawbacks of the pass-through module connected to one of the HP OEM ToR FCoE switches.

Data Center 101: Local Area Network Switching

Interestingly enough, 2 years ago I couldn’t even begin to post an intelligent blog on Local Area Networking 101; funny how things change.  That being said, I make no guarantees that this post will be intelligent in any way.  Without further ado let’s get into the second part of the Data Center 101 series and discuss the LAN.

I find the best way to understand a technology is to have a grasp on its history and the problems it solves, so let’s take a minute to dive into the history of the LAN.  For the sake of simplicity and real world applicability I’m going to stick to Ethernet as it is the predominant LAN technology in today’s data center environments.  Before we even go into the history we’ll define Ethernet and where it fits on the OSI model.

Ethernet:

Ethernet is a frame-based networking technology comprising a set of standards for Layers 1 and 2 of the OSI model.  Ethernet devices use an address called a Media Access Control (MAC) address for communication.  MAC addresses are a flat address space which is not routable (can only be used on a flat Layer 2 network) and are composed of several components, most importantly a vendor ID known as an Organizationally Unique Identifier (OUI) and a unique address for the individual port.
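
As a quick illustration, the two components are easy to pull apart in Python (the address below is made up):

```python
# Split a 48-bit MAC address into its two major components: the first three
# octets are the vendor's Organizationally Unique Identifier (OUI), and the
# last three are assigned by that vendor to the individual port.
def split_mac(mac: str):
    octets = mac.lower().split(":")
    assert len(octets) == 6, "expected six octets"
    return ":".join(octets[:3]), ":".join(octets[3:])

oui, port_id = split_mac("00:1B:54:AA:BB:CC")
print(oui)      # 00:1b:54  (vendor OUI)
print(port_id)  # aa:bb:cc  (vendor-assigned port address)
```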

OSI Model:

The Open Systems Interconnection (OSI) model is a subdivision of the components of communication that is used as a tool to create interoperable network systems, and it is a fantastic model for learning networks.  The OSI model breaks into 7 layers, much like my favorite taco dip.

Understanding the OSI model and where protocols and hardware fit into it will not only help you learn but also help with understanding new technologies and how they fit together.  I often revert to placing concepts in terms of the OSI model when having highly technical discussions about new concepts and technology.  The beauty of the model is that it allows for easy interoperability and flexibility.  For instance, Ethernet is still Ethernet whether you use fiber cables or copper cables, because only Layer 1 is changing.
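
For quick reference, here are the seven layers as a small Python structure (the example protocols are illustrative):

```python
# The seven OSI layers, bottom (1) to top (7). Ethernet spans Layers 1 and 2.
OSI_LAYERS = {
    1: ("Physical",     "cabling and signaling, e.g. copper or fiber"),
    2: ("Data Link",    "framing and MAC addressing, where Ethernet switching lives"),
    3: ("Network",      "logical addressing and routing, e.g. IP"),
    4: ("Transport",    "end-to-end delivery, e.g. TCP/UDP"),
    5: ("Session",      "dialog setup and teardown"),
    6: ("Presentation", "data representation and encryption"),
    7: ("Application",  "network-facing application services, e.g. HTTP, SMTP"),
}

for number in sorted(OSI_LAYERS, reverse=True):   # print top-down
    name, role = OSI_LAYERS[number]
    print(f"Layer {number}: {name:13} {role}")
```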

Ethernet LAN History:

As the LAN networks we use today evolved, they typically started with individual groups within an organization.  For instance, a particular group would have a requirement for a database server and would purchase a device to connect that group.  That device was commonly a hub.

Hub:

A network hub is a device with multiple ports used to connect several devices for the purposes of network communication.  When an Ethernet hub receives a frame it replicates it to all connected ports except the one it received it on in a process called flooding.  All connected devices receive a copy of the frame and will typically only process the frame if the destination MAC address is their own (there are exceptions to this which are beyond the scope of this discussion.)
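
The hub’s behavior is simple enough to sketch in a few lines of Python (a toy model, obviously not real hardware):

```python
# Toy model of an Ethernet hub: every frame is repeated out all active
# ports except the one it arrived on, regardless of the destination MAC.
class Hub:
    def __init__(self, port_count):
        self.ports = range(port_count)

    def receive(self, frame, ingress_port):
        for port in self.ports:
            if port != ingress_port:          # flood everywhere else
                print(f"port {port}: {frame}")

hub = Hub(4)
hub.receive({"src": "A", "dst": "B"}, ingress_port=0)
# Every attached device sees the frame; only B should actually process it.
```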

[Diagram: a single device sends a frame and the hub floods it out all other active ports]

In the diagram above you see a single device sending a frame and that frame being flooded to all other active ports.  This works quite well for small networks consisting of a single hub and low port count, but you can easily see where problems start to arise as the network grows.

[Diagram: multiple interconnected hubs, each flooding every frame to all devices]

Once multiple hubs are connected and the network grows each hub will flood every frame, and all devices will receive these frames regardless of whether they are the intended recipient.  This causes major overhead in the network due to the unneeded frames consuming bandwidth.

Bridge:

The next step in the network evolution is called bridging, and it was designed to alleviate this problem and decrease the overhead of forwarding unneeded frames.  A bridge is a device that makes an intelligent decision on when and where to flood frames based on MAC addresses stored in a table.  These MAC addresses can be static (manually input) or dynamic (learned on the fly.)  Because it is more common, we will focus on dynamic.  The original bridges typically had 2 or more ports (low port counts) and could separate MAC addresses using the table for those ports.

[Diagram: a bridge connecting two hub-based network segments]

In the above diagram you see a hub operating normally on the left flooding the frame to all active ports.  When the frame is received by the bridge a MAC address lookup is done on the MAC table and the bridge makes a decision whether or not to flood to the other side of the network.  Because the frame in this example is destined for a MAC address existing on the left side of the network the bridge does not flood the frame.  These addresses will be learned dynamically as devices send frames.  If the destination MAC address had been a device on the right side of the network the bridge would have sent the frame to that side to be flooded by the hub.

Bridges reduced unnecessary network traffic between groups or departments while allowing resource sharing when needed.  The limitation of original bridges came from the low port counts and changing data patterns.  Because the bridges were typically only separating 2-4 networks there was still quite a bit of flooding, especially when more and more resources were shared across groups.

Switches:

Switches are the next evolution of bridges, and the operation they perform is still considered bridging.  In very basic terms a switch is a high port-count bridge that is able to make decisions on a port-by-port basis.  A switch maintains a MAC table and only forwards frames to the appropriate port based on the destination MAC.  If the switch has not yet learned the destination MAC it will flood the frame.  Switches and bridges will also flood multicast (traffic destined for multiple recipients) and broadcast (traffic destined for all recipients) frames, which are beyond the scope of this discussion.

[Diagram: a switch with a fully populated MAC table forwarding a frame from port 1 out port 2 only]

In the diagram above I have added several components to clarify switching operations now that we are familiar with basic bridging.  Starting in the top left of the diagram you see some of the information that is contained in the header of an Ethernet frame, in this case the source and destination MAC addresses of two of the devices connected to the switch.  Each end-point in the diagram is labeled with a MAC address starting with AF:AF:AF:AF:AF.  In the top right we see a representation of a MAC table which is stored on the switch and learned dynamically.  The MAC table contains a listing of which MAC addresses are known to be on each port.  Because the MAC table in this example is fully populated, we can assume that the switch has previously seen a frame from each device in order to populate the table.  That auto-population is the ‘dynamic learning’ and it is done by recording the source MAC address of incoming frames.  Lastly we see that the frame being sent by the device on port 1 is only being forwarded to the device on port 2.  In the event port 2’s MAC address had not yet been learned, the switch would be forced to flood the frame to all ports except the one it received it on in order to ensure it was received by the destination device.
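
The learn-and-forward logic extends the toy hub above by only a few lines (again a toy model):

```python
# Toy model of a learning switch: record the source MAC per port, forward
# known destinations out a single port, and flood unknown destinations.
class Switch:
    def __init__(self, port_count):
        self.port_count = port_count
        self.mac_table = {}                       # MAC address -> port

    def receive(self, src, dst, ingress_port):
        self.mac_table[src] = ingress_port        # dynamic learning
        if dst in self.mac_table:
            print(f"{dst}: forward out port {self.mac_table[dst]}")
        else:                                     # unknown unicast: flood
            egress = [p for p in range(self.port_count) if p != ingress_port]
            print(f"{dst}: flood out ports {egress}")

sw = Switch(4)
sw.receive("AF:AF:AF:AF:AF:01", "AF:AF:AF:AF:AF:02", ingress_port=1)  # flooded
sw.receive("AF:AF:AF:AF:AF:02", "AF:AF:AF:AF:AF:01", ingress_port=2)  # forwarded out port 1
```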

So far we’ve learned that bridges improved upon hubs, and switches improved upon basic bridging.  The next kink in the evolution of Ethernet LANs came as our networks grew beyond single switches and we began adding in redundancy.

The three issues that arose can all be grouped as problems with network loops (specifically Layer 2 Ethernet loops.)  These issues are:

Multiple Frame Copies:

When a device receives the same frame more than once due to replication or loop issues it is a multiple frame copy.  This can cause issues for some hardware and software and also consumes additional unnecessary bandwidth.

MAC Address Instability:

When a switch must repeatedly change its MAC table entry for a given device this is considered MAC address instability.

Broadcast Storms:

Broadcast storms are the most serious of the three issues as they can literally bring all traffic to a halt.  If you ask someone who has been doing networking for quite some time how they troubleshoot a broadcast storm, you are quite likely to hear ‘Unplug everything and plug things back in one at a time until you find the offending device.’  The reason for this is that in the past the storm itself would soak up all available bandwidth, leaving no means to access switching equipment in order to troubleshoot the issue.  Most major vendors now provide protection against this level of problem, but storms are still a serious problem that can have a major performance impact on production data.  Broadcast storms are caused when a broadcast, multicast, or flooded frame is repeatedly forwarded and replicated by one or more switches.

[Diagram: three switches connected in a loop, with device 1 on the top-left switch and device 2 on the top-right switch]

In the diagram above we can see a switched loop.  We can also observe several stages of frame forwarding, starting with device 1 in the top left sending a frame to device 2 in the top right.

  1. Device 1 forwards a frame to device 2.  This one-to-one communication is known as unicast.
  2. The switch on the top left does not yet have device 2 in its MAC table, therefore it is forced to flood the frame, meaning replicate the frame to all ports except the one where it was received. 
  3. In stage three we see two separate things occur:
    1. The switch in the top right delivers the frame to the intended device (for simplicity’s sake we are assuming the switch in the top right already has a MAC table entry for the device.)
    2. The bottom switch, having received the frame, forwards it to the switch in the top right.
  4. The switch in the top right receives the second copy and forwards it based on its MAC table, delivering a second copy of the same frame to device 2.

[Diagram: switches A, B, and C connected in a triangle, with device 1 attached to switch A and device 2 attached to switch B]

The above example has a little more going on and can become confusing quickly.  For the purposes of this example assume all three switches have blank MAC address tables with no devices known.  Also remember that they are building the MAC table dynamically based on the source MAC address they see in a frame.  To aid in understanding I will fill out the MAC tables at each step.

1. Our first stage is the easy one.  Device 1 forwards a unicast frame to device 2.  Switch A receives this frame on the top port.

[Diagram: stage 1, device 1’s frame arrives at switch A]

2. When switch A receives the frame it checks its MAC table for the correct port to forward frames to device 2.  Because its MAC table is currently blank it must flood the frame (replicate it to all ports except the one where it was received.)  As it floods the frame it also records the MAC address and attached port of device 1 because it has seen this MAC as the source in the frame.

[Diagram: stage 2, switch A floods the frame and records device 1 in its MAC table]

3. In stage 3 two switches receive the frame and must make decisions. 

  1. Switch C, having a blank MAC table, must flood the frame.  Because there is only one port other than the one it received it on, switch C floods the frame to the only available port; at the same time it records the source MAC address as having been received on its port 1.
  2. Switch B also receives the frame from switch A, and must make a decision.  Like switch C, switch B has no records in its MAC table and must flood the frame.  It floods the frame down to switch C, and up to device 2.  At the same time switch B records the source MAC in its MAC table.

[Diagram: stage 3, switches B and C flood the frame and record device 1 in their MAC tables]

4. In the fourth stage we again have several things happening. 

  1. Switch C has received the same frame for the second time, this time from port 2.  Because it still has not seen the destination device it must flood the frame.  Additionally, because this is the exact same frame, switch C sees the MAC address of device 1 coming from its right port, port 2, and assumes the device has moved.  This forces switch C to change its MAC table. 
  2. At the same time switch B receives another copy of the frame.  Switch B, seeing the same source address, must change its MAC table, and because it still does not have the destination MAC in the table it must flood the frame again.

[Diagram: stage 4, switches B and C update their MAC tables and flood the frame again]

In the above diagram pay close attention to the fact that the MAC tables have been changed for switches B and C.  Because they saw the same frame come from a different port they must assume the device has moved and change the table.  Additionally, because the cycle has not been completed, the loop will continue; this is one way broadcast storms begin.  More and more of these endless loops hit the network until there is no bandwidth left to serve data frames.

In this simple example it may seem that the easy solution is to not build loops like the triangle in my diagram.  This is actually the premise of the next Ethernet evolution we’ll discuss, but first let’s look at how easy it is to create loops just by adding redundancy.

[Diagram: a single inter-switch link on top; below, adding a second link turns the physical view on the bottom left into the logical loop on the bottom right]

In the diagram above we start with a non-redundant switch link.  This link is a single point of failure: in the event a component fails, devices on separate switches will be unable to communicate.  The simple solution is adding a second port for redundancy, with the assumed added benefit of having more bandwidth.  In reality, without another mechanism in place, adding the second link turns the physical view on the bottom left into the logical view on the bottom right, which is a loop.  This is where the next evolution comes into play.

Spanning-Tree Protocol (STP):

STP is defined in IEEE 802.1D and provides an automated method for building loop-free topologies based on a very simple algorithm.  The premise is to allow the switches to automatically configure a loop-free topology by placing redundant links in a blocked state.  Like a tree, this loop-free topology is built up from the root (root bridge) and branches out (switches) to the leaves (end-nodes), with only one path to get to each end-node.

The way Spanning-tree does this is by detecting redundant links and placing them in a ‘blocked’ state.  This means that the blocked ports do not send or receive frames; in the event of a primary link (designated port) failure, the blocked port is brought online.  The issue with spanning-tree is twofold:

- Blocked links sit completely idle, so redundant bandwidth is paid for but never used.
- Convergence after a failure event takes time, during which traffic is interrupted.
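
To make the loop-breaking idea concrete, here’s a deliberately simplified sketch (real STP elects the root and computes paths by exchanging BPDUs, and handles many more cases):

```python
# Simplified STP-like computation: elect the root bridge by lowest
# (priority, MAC), keep one shortest path from each switch toward the root,
# and logically block every remaining link.
from collections import deque

bridges = {"A": (32768, "00:00:00:00:00:0a"),
           "B": (32768, "00:00:00:00:00:0b"),
           "C": (32768, "00:00:00:00:00:0c")}
links = {("A", "B"), ("A", "C"), ("B", "C")}       # the triangle from the diagrams above

root = min(bridges, key=lambda b: bridges[b])      # lowest (priority, MAC) wins
tree, seen, queue = set(), {root}, deque([root])
while queue:                                       # breadth-first from the root
    node = queue.popleft()
    for link in links:
        if node in link:
            other = link[1] if node == link[0] else link[0]
            if other not in seen:
                seen.add(other)
                tree.add(link)                     # first path found: forwarding
                queue.append(other)

print("root bridge:", root)               # A (lowest MAC at equal priority)
print("forwarding:", tree)                # {('A', 'B'), ('A', 'C')}
print("blocked:", links - tree)           # {('B', 'C')}: the loop is broken
```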

Multiple versions of STP have been implemented and standardized to improve upon the original 802.1D specification.  These include:

Per-VLAN Spanning-Tree Protocol (PVSTP):

Performs the blocking algorithm independently for each VLAN allowing greater bandwidth utilization.

Rapid Spanning-Tree Protocol (RSTP):

Uses additional port-types not in the original STP specification to allow faster convergence during failure events.

Per-VLAN Rapid Spanning-Tree (PVRSTP):

Provides rapid spanning-tree functionality on a per VLAN basis.

Other STP implementations exist and the details of STP operation in each of its flavors is beyond the scope of what I intend to cover with the 101 series.  If there is a demand these concepts may be covered in a more in-depth 202 series once this series is completed.

Summary:

Ethernet networking has evolved quite a bit over the years and is still a work in progress.  Understanding the hows and whys of where we are today will help in understanding the advancements that continue to come.  If you have any comments, questions, or corrections please leave them in the comments or contact me in any of the ways listed on the about page.

The Art of Pre-Sales

On a recent customer call being led by a vendor account manager and engineer I witnessed some key mistakes by the engineer as he presented the technology to the customer.  None of the mistakes were glaring or show stopping but they definitely kept the conversation from having the value that was potentially there.  That conversation got me thinking about the skills and principles that need to be applied to pre-sales engineering and prompted this blog.

Pre-sales engineering in all of its many forms is truly an art.  There is definitely science and methodology behind its success, but practicing those methods and studying that science alone won’t get you far past good.  To be great you need to invest effort into the technology, the business, and most importantly your personal style.  If you’re already good at pre-sales and don’t care to be great, then the rest of this blog won’t help you.  If you’re an ‘end-user’ or customer that deals with pre-sales engineers, this blog may help you understand a little of what goes through the heads of the guys on the other side of the conference table.  If your job is post-sales, implementations, managed services, etc., this may give you an idea of what your counterparts are doing.  If you’re a pre-sales engineer who could use some new ideas or tools, this blog’s for you.

Joe’s 5 rules of Pre-Sales Engineering:

These are really rules of thumb that I use to get into the right mindset when engaging with customers in a pre-sales fashion.  They aren’t set in stone, all encompassing, or agreed upon by teams of experts, just tools I use.  Let’s start with a quick look into each rule:

You are a member of the sales team:

This one is key to remember because for a lot of very technical people that move into pre-sales roles it is tough to grasp.  There is not always love, drum circles, group hugs and special brownies between sales and engineering, and some engineers tend to resent sales people for various reasons (and vice versa.)  Whether or not there is resentment, it’s natural to be proud of your technical skill set, and thinking of yourself in a sales perspective may not be something you’re comfortable with.  Get over it or get out of pre-sales.  As a pre-sales engineer it’s your job to act as a member of the sales team assisting account managers in the sale of the products and services your company provides.  You are there to drive the sales that provide the blanket of revenue the rest of the company rises and sleeps under (if you missed that reference watch the video, it’s worth it: http://bit.ly/dqTzU7.)

You are not a salesman:

Now that you’ve swallowed the fact that you’re a member of the sales team, it’s time to reinforce the fact that you are not an account manager/sales representative.  This is vitally important; in fact, if you can apply only the first two rules you’ll be significantly better than some of your peers.  I’m going to use the term AM (Account Manager) for sales from here on out; allow this to encompass any non-technical sales title that fits your role.  An AM and a pre-sales SE are completely different roles with a common goal.  An AM is tightly tied to a target sales number and most likely spends hours on con calls talking about that number and why they are or aren’t at that number.  An AM’s core job is to maintain customer relationships and sell what the company sells.

A pre-sales engineer’s job, on the other hand, is a totally different beast.  While you do need to support your AM, it’s your job to make sure that the product, service, or solution you sell is relevant, effective, right-fit, and complete for the particular customer.  In the reseller world we talk about becoming a ‘Trusted Advisor,’ but that ‘Trusted Advisor’ is typically a two-person team consisting of an AM and an engineer who know the customer well, understand their environment, and maintain a mutually beneficial relationship.

As the engineer side of that perfect team it’s your job to have the IDEA:

Note: Before continuing I have to apologize for the fact that I just created one of those word acronym BS objects…

So what’s the bright IDEA?  As a pre-sales engineer you need to Identify customer requirements, Design a product set or solution to meet those requirements, Evangelize the proposed solution, and Adjust the solution as necessary with the customer.

You must be business relevant:

This is typically another tough thing to do from an engineer standpoint.  Understanding business requirements and applying the technology to those requirements does not come naturally for most engineers, but it is vital to success.  Great technology alone has no value; the data center landscape is littered with stories of great technology companies that failed because they couldn’t capitalize by making the technology business relevant.  The same lesson applies to pre-sales engineering.

To be a great pre-sales engineer you have to understand both business and technology enough to map the technical benefits to actual business requirements.  So what if your widget is faster than all other widgets before it; what does that mean to my business, and my job?  A great way to begin to understand the high-level business requirements, and what the executives of the companies you sell into are thinking, is to incorporate business books and magazines into your reading.  Next time you’re at the airport magazine rack looking at the latest trade rag, grab a copy of ‘The Harvard Business Review’ instead.

You must be technically knowledgeable:

This part should go without saying but unfortunately is not always adhered to.  Way too often I see engineers reading from the slides they present because they don’t know the products or material they are presenting.  Maintaining an appropriate level of technical knowledge becomes harder and harder as more products are thrown at you, but you must do it anyway.  If you can’t speak to the product or solution’s features and benefits without slides or data sheets, you shouldn’t be speaking about it.

Staying up-to-date is a daunting task but there are a plethora of resources out there for it.  Blogs and Twitter can be used as a constant stream of the latest and greatest technical information.  Add to that formal training and vendor documentation, and the tools to be technically relevant are there.  The best advice I can offer on staying technically knowledgeable is to not be afraid to ask, or to say you don’t know.  If you need training ask for it; if you need info find someone who knows it and talk to them.  As importantly, work to share your expertise with others, as it creates a collaborative environment that benefits everyone.

Know your audience:

This may be the most important of the five rules and boils down to doing your homework and being applicable.  Ensure you’ve researched your customer, their requirements, and their environment as much as possible.  Know what their interests and pain points are before walking into a meeting whenever possible.

Knowing your audience also applies during customer meetings.  As the customer provides more information it’s important to tailor the information you provide to that customer’s interests on the fly.  Any technical conversation should be a fluid entity, ebbing and flowing with the customer’s feedback.

Practicing the art:

Like any other art, pre-sales must be practiced.  You must study the products and services your company sells, develop your presentation skills, and constantly work on your communication.  From my perspective the best way to build all of these skills at once is white boarding.  White boards are the greatest tool in a pre-sales engineer’s arsenal.  They provide a clean canvas on which you can paint the picture of a solution and remain fluid in any given conversation.  Unlike slides, white board sessions are flexible and can easily stay focused on what the customer wants to hear.  I firmly believe that a pre-sales engineer should not discuss any technology they cannot confidently articulate via the whiteboard.  You cannot take this concept too far; I’ve instructed 5-day data center classes 100% on the white board, covering LAN, SAN, storage, servers and networking, because it was the right fit for the audience.  The white board is your friend.

If you don’t have a white board in your home, get one.  Use it to hone your skills, help visualize architecture, and practice before meetings.  Look through the slides you typically present and practice conveying the same messaging via the white board without cues.  As you become comfortable having technical discussions via the white board you’ll find you can convey a greater level of technical information, tailored to the customer’s needs, in a much faster fashion.  White boards also don’t require slides, projectors, or power, and they don’t suffer from technical difficulties.

As you white board in front of customers, think of painting a picture for them: start with broad strokes outlining the technology and add detail to areas where the customer shows interest.  Drill down into only the specifics that are relevant to that customer; this is where knowing your audience is key.

[Diagram: whiteboard conversation flow, starting from the big picture and drilling down into the areas where the customer shows interest]

In the diagram above you can see the way the conversation should go with a customer.  You begin at the top-level big picture and drill down into only the points that the customer shows an interest in or that are applicable to their data center and job role.  Don’t ever feel the need to discuss every feature of a product or solution, because they are not all relevant to every customer.  For instance, a server admin probably doesn’t care how fast switching occurs, but network and application teams probably do.  Maybe your product can help save a ton of cost; great, but that’s probably not very relevant to the administrators who aren’t responsible for budget.  Always ensure you’re maintaining relevance to the audience and the business.

Summary:

Pre-sales, like any other skill set, must be honed and practiced.  It doesn’t come overnight and, as with anything else, you’re never as good as you can be.  Build a style and methodology that work for you and don’t be afraid to change or modify them as you find areas for improvement.  The better you get at it, the more value you’re giving your customer, team, and company.

Data Center 101: Server Systems

As the industry moves deeper and deeper into virtualization, automation, and cloud architectures it forces us as engineers to break free of our traditional silos.  For years many of us were able to do quite well being experts in one discipline with little to no knowledge in another.  Cloud computing, virtualization, and other current technological and business initiatives are forcing us to branch out beyond our traditional knowledge set and understand more of the data center architecture as a whole.

It was this concept that gave me the idea to start a new series on the blog covering the foundation topics of each of the key areas of the data center.  These will be lessons designed from the ground up to give you a familiarity with a new subject or a refresher on an old one.  Depending on your background, some, none, or all of these may be useful to you.  As we get further through the series I will be looking for experts to post on subjects I’m not as familiar with; WAN and security are two that come to mind.  If you’re interested in writing a beginner’s lesson on one of those topics, or any other, please comment or contact me directly.

Server Systems:

As I’ve said in previous posts, the application is truly the heart of the data center.  Applications are the reason we build servers, networks, and storage systems.  Applications are the email systems, databases, web content, etc. that run our businesses.  Applications run within the confines of an operating system, which interfaces directly with server hardware and firmware (discussed later) and provides a platform to run the application.  Operating systems come in many types, commonly Unix, Linux, and Windows, with other variants used for specialized purposes such as mainframes and supercomputers.

Because the server sits closer to the application than any other hardware, understanding server hardware and functionality is key.  Server hardware breaks down into several major components and concepts.  For this discussion we will stick with the more common AMD/Intel architectures, known as the x86 architecture.

Server:

[Diagram: a two-socket server, internal disks at the bottom, two banks of memory and CPUs above, then I/O cards and power supplies]

The diagram above shows a two-socket server.  Starting at the bottom you can see the disks, in this case internal Hard Disk Drives (HDD.)  Moving up you can see two sets of memory and CPU, followed by the I/O cards and power supplies.  The power supplies convert A/C current to the appropriate D/C current levels for use in the system.  Additionally, not shown, would be fans to move air through the system for cooling.

The bus systems, which are not shown, would be a series of traces and chips on the system board allowing separate components to communicate.

A Quick Note About Processors:

Processors come in many shapes and sizes, and were traditionally rated by speed measured in hertz.  Over the last few years a new concept has been added to processors: ‘cores.’  Simply put, a core is a CPU placed on a chip beside other cores, where each shares certain components such as cache and the memory controller (both outside the scope of this discussion.)  If a processor has 2 cores it will operate as if it were 2 physically independent, identical processors and provide the advantages of such.

Another technology that has been around for quite some time is hyper threading.  A processor can traditionally only process one calculation per cycle (measured in hertz); this is known as a thread.  Many of these processes only use a small portion of the processor itself, leaving other portions idle.  Hyper threading allows a processor to schedule 2 processes in the same cycle as long as they don’t require overlapping portions of the processor.  For applications that are able to utilize multiple threads, hyper threading will provide an average increase of approximately 30%, whereas a second core would double performance.

Hyper threading and multiple cores can be used together, as they are not mutually exclusive.  For instance, in the diagram above, if both installed processors were 4-core processors that would provide 8 physical cores; with hyper threading enabled it would provide a total of 16 logical cores.
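
The arithmetic is simple enough to sketch (the 30% figure is the rough average mentioned above, not a guarantee):

```python
sockets, cores_per_socket, threads_per_core = 2, 4, 2

physical_cores = sockets * cores_per_socket           # 8
logical_cores  = physical_cores * threads_per_core    # 16 with hyper threading

# Illustrative relative throughput for a workload that scales perfectly:
one_core            = 1.0
add_a_second_core   = one_core * 2.0    # a real core can double performance
add_a_second_thread = one_core * 1.3    # hyper threading: ~30% on average

print(physical_cores, logical_cores)    # 8 16
```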

Not all applications and operating systems can take advantage of multiple processors and cores, therefore it is not always advantageous to have more cores or processors.  Proper application sizing and tuning is required to properly match the number of cores to the task at hand.


Server Startup:

When a server is first powered on, the BIOS is loaded from EEPROM (Electrically Erasable Programmable Read-Only Memory) located on the system board.  While the BIOS is in control it performs a series of Power On Self Tests (POST), ensuring the basic operability of the main system components.  From there it detects and initializes key components such as keyboard, video, mouse, etc.  Last, the BIOS searches through the available bootable media for a device containing a valid Master Boot Record (MBR.)  It then loads that code and allows it to take over, beginning the load of the operating system.

The order and devices the BIOS searches are configurable in the BIOS settings.  Typical boot devices are:

- Internal hard disk
- CD/DVD-ROM
- USB media
- Network boot (PXE)
- SAN-attached disk (iSCSI or Fibre Channel)

Boot order is very important when there is more than one available boot device, for instance when booting to a CD-ROM to perform recovery of an operating system that is installed.  It is also important to note that both iSCSI and Fibre Channel network connected disks are handled by the operating system as if they were internal Small Computer System Interface (SCSI) disks.  This becomes very important when configuring non-local boot devices.  SCSI as a whole will be covered during this series.
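
As a toy model of that search (the device names and the fake sector read are hypothetical; the 0x55 0xAA signature bytes at offset 510 of the first sector are what actually marks a valid MBR):

```python
# Toy model of the BIOS boot-device search: walk the configured boot order
# and hand control to the first device whose first 512-byte sector carries
# the MBR boot signature (0x55 0xAA at offset 510).
BOOT_ORDER = ["cdrom", "usb", "disk0"]

def read_first_sector(device):
    # Hypothetical stand-in for a real sector read; only disk0 is bootable here.
    media = {"disk0": b"\x00" * 510 + b"\x55\xaa"}
    return media.get(device)

def find_boot_device(order):
    for device in order:
        sector = read_first_sector(device)
        if sector and sector[510:512] == b"\x55\xaa":
            return device
    raise RuntimeError("no bootable device found")

print(find_boot_device(BOOT_ORDER))   # disk0: the CD-ROM and USB were empty
```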

Operating System:

Once the BIOS is done getting things ready and has transferred control to the bootable data in the MBR, that bootable data, the operating system (OS), takes over.  The OS is the interface between the user/administrator and the server hardware.  The OS provides a common platform for various applications to run on and handles the interface between those applications and the hardware.  In order to properly interface with hardware components the OS requires drivers for that hardware.  Essentially, drivers are OS-level software that allow any application running in the OS to properly interface with the firmware running on the hardware.

Applications:

Applications come in many different forms to provide a wide variety of services.  Applications are the core of the data center and are typically the most difficult piece to understand.  Each application whether commercially available or custom built has unique requirements.  Different applications have different considerations for processor, memory, disk, and I/O.  These considerations become very important when looking at new architectures because any change in the data center can have significant effect on application performance.

Summary:

The server architecture goes from the I/O inputs through the server hardware to the application stack.  Proper understanding of this architecture is vital to application performance, and applications are the purpose of the data center.  Servers consist of a set of major components: CPUs to process data, RAM to store data for fast access, I/O devices to get data in and out, and disk to store data permanently.  This system is put together for the purpose of serving an application.

This post is the first in a series intended to build the foundation of the data center.  If you’re starting from scratch they may all be useful; if you’re familiar with one or two aspects then pick and choose.  If this series becomes popular I may do a 202 series as a follow-on.  If I missed something here, or made a mistake, please comment.  Also, if you’re a subject matter expert in a data center area who would like to contribute a foundation blog in this series, please comment or contact me.

Have We Taken Data Redundancy too Far?

During a recent conversation about disk configuration and data redundancy on a storage array I began to think about everything we put into data redundancy.  The question that came to mind is the title of this post: ‘Have we taken data redundancy too far?’

Now don’t get me wrong, I love data as much as the next fellow, and I definitely understand its importance to the business and to compliance.  I’m not advocating tossing out redundancy or data protection.  My question is: when is enough enough, and/or is there a better way?

To put this in perspective let’s take a look at everything that stacks up to protect enterprise data:

Disks:

We start with the lowly disk, which by itself has no redundancy.  While disks tend to have one of the highest failure rates in the data center, they have definitely come a long way.  Many have the ability to self-protect and warn of impending failure at a low level, and they can last for years without issue.

A disk alone is a single point of failure: all data on the disk is lost if the drive fails.  Because of this we’ve worked to come up with better ways to apply redundancy to the data.  The simplest form of this is RAID.

RAID:

RAID stands for ‘Redundant Array of Inexpensive Disks’; it’s also correctly expanded as ‘Redundant Array of Independent Disks.’  No matter what you call it, RAID allows what would typically be a single disk on its own to act as part of a group of disks for the purposes of redundancy, performance, or both.  You can think of this like disk clustering.

Some common RAID types used for redundancy are RAID 1 (mirroring), RAID 5 (striping with single parity), RAID 6 (striping with dual parity), and RAID 10 (mirrored stripes).

In many cases ‘hot-spares’ will also be added to the RAID groups.  The purpose of a hot-spare is to have a drive online but not participating in the RAID group, reserved for failure events.  If a RAID disk fails, the hot-spare can immediately take its place until an administrator can swap out the bad drive.
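
To make the parity idea concrete, here’s a minimal sketch (Python; striping, controllers, and rebuild scheduling all omitted) of how single-parity protection in the RAID 5 style rebuilds a lost disk with XOR:

```python
# Minimal sketch of single-parity (RAID 5 style) protection.  The parity
# block is the XOR of the data blocks, so any one lost block can be rebuilt
# by XORing the survivors with the parity.

def xor_blocks(*blocks: bytes) -> bytes:
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"  # data blocks on three disks
parity = xor_blocks(d0, d1, d2)         # parity block on a fourth disk

# The disk holding d1 fails; rebuild its contents from the survivors:
rebuilt = xor_blocks(d0, d2, parity)
assert rebuilt == d1
print("rebuilt block:", rebuilt)
```

The same property is why a single-parity group tolerates exactly one failed disk: the XOR has only one unknown to solve for.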

Snapshots:

Another level of redundancy many enterprise storage arrays provide is snapshots.  Snapshots can be used to perform point-in-time recoveries.  Basically, when a snapshot is taken it locks the associated blocks of data, ensuring they are not modified.  If a block needs to be changed, the new data is written to a new location without affecting the original.  To revert to a snapshot, the changed data is simply discarded, leaving the original locked blocks.  While snapshots are not a backup or redundancy feature on their own, they can be used as part of other systems, and are excellent for development environments where testing is required on various data sets.  Snapshots consume additional space because two copies are kept of any locked block that is changed.
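
Conceptually it looks something like this minimal sketch (Python; a single snapshot, and real arrays track this in block maps rather than dictionaries):

```python
# Conceptual sketch of the snapshot behavior described above (a single
# snapshot; real arrays track this in block maps, not Python dictionaries).

class SnapshotVolume:
    def __init__(self, blocks):
        self.locked = dict(blocks)  # blocks frozen by the snapshot
        self.changes = {}           # post-snapshot writes, in new locations

    def write(self, block_id, data):
        self.changes[block_id] = data       # original block is untouched

    def read(self, block_id):
        return self.changes.get(block_id, self.locked[block_id])

    def revert(self):
        self.changes.clear()                # drop changes, keep locked blocks

vol = SnapshotVolume({0: b"orig-0", 1: b"orig-1"})
vol.write(1, b"new-1")                      # both copies now consume space
assert vol.read(1) == b"new-1"
vol.revert()                                # point-in-time recovery
assert vol.read(1) == b"orig-1"
```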

Primary/Secondary replication:

Another method for creating data redundancy is tiered storage.  In a tiered redundancy model the primary storage serving the applications is held on the highest-performing disk, and data is backed up or replicated to lower-performance, less expensive disk or disk arrays.

Virtual Tape Libraries (VTL):

Virtual tape libraries are storage arrays that present themselves as standard tape libraries for the purposes of backup and archiving.  A VTL is typically used between primary storage and actual tape backups as a means of decreasing the backup window.

Tape backups:

In most cases the last stop for backup and archiving is still tape.  This is because tape is cheap, high density, and ultra-portable.  Large amounts of data can be streamed to tape libraries which can store the data and allow tapes to be sent to off-site storage facilities.

Adding it up:

When you put these redundancy and recovery systems together and start layering them on top of one another, you end up with a high ratio of storage media purposed for redundancy and recovery compared to the actual data being served.  Ratios of 10:1, 20:1, 100:1 or more of archive/redundancy space to usable space are not uncommon.
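
The arithmetic stacks up quickly.  Every factor in this sketch is invented for illustration (Python), yet even a fairly conservative stack lands well above parity:

```python
# Illustrative arithmetic only; every factor below is invented, not measured.
# Media consumed per 1 TB of data actually served:

usable_tb = 1.0
layers_tb = {
    "RAID 5 parity (4+1 group)": 0.25,
    "snapshot reserve":          0.30,
    "secondary replica":         1.00,
    "VTL staging copy":          1.00,
    "tape (3 rotating fulls)":   3.00,
}

total_tb = usable_tb + sum(layers_tb.values())
print(f"~{total_tb:.2f} TB of media per {usable_tb:.0f} TB served")  # ~6.55:1
```

Add a second replica, a DR copy, or longer tape retention and the 10:1 and 20:1 ratios above arrive quickly.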

Summary:

My summary is more of a repeat of the same question.  Have we taken this too far?  Do we need protection built in at each level, and layered on top of one another?  Can we afford to continue down this path adding redundancy at the expense of performance and utilization?  Should we throw higher-parity RAID at our arrays and make up the performance hit with expensive cache?  Should we purchase 10TB of media for every 1TB we actually need to serve?  Is there a better way?

I don’t have the answer to this one, but would love to see a discussion on it.  The way I’m thinking now is bunches of dumb independent disks pooled and provisioned through software.  Drop the RAID and hot-spares, and use the software to maintain multiple local or global copies on different hardware.  When you move that disk thinking to cloud environments and start talking about petabytes or more of data, the current model starts unraveling quickly.
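
For what it’s worth, the software side of that idea can be sketched very simply.  Everything here is invented for illustration (Python): the node names, replica count, and hash-based placement are just one way to spread full copies across independent hardware:

```python
# Sketch of the "dumb disks plus software copies" idea: no RAID, no
# hot-spares; software places N full copies of each object on distinct
# nodes.  Node names, replica count, and placement scheme are invented.

import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]
REPLICAS = 3

def place(object_id, nodes=NODES, replicas=REPLICAS):
    """Deterministically pick `replicas` distinct nodes (rendezvous-hash style)."""
    ranked = sorted(
        nodes,
        key=lambda node: hashlib.sha256(f"{object_id}:{node}".encode()).hexdigest(),
    )
    return ranked[:replicas]

print(place("vm-disk-0042"))  # three distinct nodes each hold a full copy
```

If a node dies, the software simply re-copies whatever it held to the next-ranked nodes; there is no RAID rebuild window tied to a single disk.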

HP’s FlexFabric

There were quite a few announcements this week at the HP Technology Forum in Vegas.  Several of these announcements were extremely interesting; the ones that resonated the most with me are covered below.

Superdome 2:

I’m not familiar with the Superdome 1, nor am I in any way an expert on non-x86 architectures.  In fact, that’s exactly what struck me as excellent about this product announcement.  It allows the mission-critical servers that a company chooses to, or must, run on non-x86 hardware to run right alongside the more common x86 architecture in the same chassis.  This further consolidates the data center and reduces infrastructure for customers with mixed environments, of which there are many.  While some customers are currently pushing to migrate all data center applications onto x86-based platforms, this is not fast, cheap, or good for every use case.  Superdome 2 provides a common infrastructure for both the mission-critical applications and the x86-based applications.

For a more technical description see Kevin Houston's Superdome 2 blog: http://bladesmadesimple.com/2010/04/its-a-bird-its-a-plane-its-superdome-2-on-a-blade-server/.

Note: As stated, I’m no expert in this space and I have no technical knowledge of the Superdome platform, but conceptually it makes a lot of sense and seems like a move in the right direction.

Common Infrastructure:

There was a lot of talk in some of the keynotes about a common look, feel, and infrastructure across the separate HP systems (storage, servers, etc.)  At first I laughed this off as a ‘who cares,’ but then I started to think about it.  If HP takes this message seriously and standardizes rail kits, cable management, components (where possible), etc., this will have big benefits for administration and deployment of equipment.

If you’ve never done a good deal of racking/stacking of data center gear you may not see the value here, but I spent a lot of time on the integration side with this as part of my job.  Within a single vendor (or sometimes a single product line) rail kits for servers/storage, rack mounting hardware, etc. can all be different.  This adds time and complexity to integrating systems and can sometimes lead to less-than-ideal builds.  For example, the first vBlock I helped a partner configure (for demo purposes only) had the two UCS systems stacked on top of one another at the bottom of the rack with no mounting hardware.  The reason for this was that the EMC racks being used had different rail mounts than the UCS system was designed for.  Issues like this can cause problems and delays, especially when the people in charge of infrastructure aren’t properly engaged during purchasing (very common).

Overall I can see this as a very good thing for the end user.

HP FlexFabric

This is the piece that really grabbed my attention while watching the constant Twitter stream of HP announcements.  HP FlexFabric brings network consolidation to the HP blade chassis.  I specifically say network consolidation because HP got this piece right.  Yes, it does FCoE, but that doesn’t mean you have to.  FlexFabric provides the converged networking tools to carry any protocol you want over 10GE to the blades and split that out to separate networks at the chassis level.  Here’s a picture of the switch from Kevin Houston’s blog: http://bladesmadesimple.com/2010/06/first-look-hps-new-blade-servers-and-converged-switch-hptf/.

HP Virtual Connect FlexFabric 10Gb/24-Port Module

The first thing to note when looking at this device is that all the front-end uplink ports look the same, so how does it split out Fibre Channel and Ethernet?  The answer is that Qlogic (the manufacturer of the switch) has been doing some heavy lifting on the engineering side.  They’ve designed the front-end ports to support the optics for either Fibre Channel or 10GE.  This means you’ve got flexibility in how you use your bandwidth.  The ability to do this per-port is an industry first; the Cisco Nexus 5000 hardware ASIC has been capable of this since FCS, but there it was implemented on a per-module basis rather than the per-port basis of this switch.

The next piece that was quite interesting, and that really provides flexibility and choice in the HP FlexFabric concept, is the decision to use Emulex’s OneConnect adapter as the LAN on Motherboard (LOM).  This was a very smart decision by HP.  Emulex’s OneConnect is a product that has impressed me from square one; it shows a traditionally Fibre Channel company embracing the fact that Ethernet is the future of storage without locking the decision into an upper-layer protocol (ULP).  OneConnect provides 10GE connectivity, TCP offload, iSCSI offload/boot, and FCoE capability all on the same card; now that’s a converged network!  HP seems to have seen the value there as well and built it into the system board.

Take a step back and soak that in: LOM has been owned by Intel, Broadcom, and other traditional NIC vendors since the beginning, and until last year Emulex was looked at as one of two solid FC HBA vendors.  As of this week HP announced the ousting of a traditional NIC vendor for a traditional FC vendor on their system board.  That’s a big win for Emulex.  Kudos to Emulex for the technology (and the business decisions behind it) and to HP for recognizing that value.

Looking a little deeper, the next big piece of the overall architecture is that the whole FlexFabric system supports HP’s FlexConnect technology, which allows a server admin to carve up a single physical 10GE link into four logical links that are presented to the OS as individual NICs.
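
Conceptually the carve-up looks like the sketch below (Python; the function and its checks are invented for illustration, since the real allocation happens in Virtual Connect rather than the OS):

```python
# Conceptual model of carving a 10GE link into logical NICs the way
# FlexConnect does.  The function and its checks are invented for
# illustration; the real allocation lives in Virtual Connect, not the OS.

LINK_CAPACITY_GB = 10.0
MAX_FLEXNICS = 4

def partition_link(allocations_gb):
    if len(allocations_gb) > MAX_FLEXNICS:
        raise ValueError("at most four logical NICs per physical port")
    if sum(allocations_gb) > LINK_CAPACITY_GB:
        raise ValueError("allocations exceed the 10GE physical link")
    return [{"name": f"flexnic{i}", "gb": gb}
            for i, gb in enumerate(allocations_gb)]

# e.g. management, live migration, production LAN, and storage traffic:
for nic in partition_link([0.5, 2.0, 3.5, 4.0]):
    print(nic)
```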

The only drawback I see in the FlexFabric picture is that FCoE is only used within the chassis and split into separate networks from there.  This can definitely increase the required infrastructure depending on the architecture.  I’ll wait to go too deep into that until I hear a few good lines of thinking on why that direction was taken.

Summary:

HP had a strong week in Vegas.  These were only a few of the announcements; several others, including mind-blowing stuff from HP Labs (start protecting John Connor now), can be found on blogs and HP’s website.  Of all the announcements FlexFabric was the one that really caught my attention.  It embraces the idea of I/O consolidation without clinging to FCoE as the only way to do it, and it greatly increases competition in that market, which always benefits the end-user/customer.

Comments, corrections, bitches, moans, gripes, and complaints all welcome.

Collapsing Server Management Points with UCS

I was invited to post a blog on the WWT virtualization blog, and I chose to discuss Cisco Pass-Through switching with UCS.  See the blog there at: http://vblog.wwtlab.com/2010/06/22/collapsing-server-management-points-with-ucs/.

There are several other great posts there you should take a look at.