Technical Drivers for Cloud Computing

In a previous post I described the business drivers for Cloud Computing infrastructures (http://www.definethecloud.net/?p=27): basically the idea of transforming the data center from a cost center into a profit center.  In this post I'll look at the underlying technical challenges that cloud looks to address in order to reduce data center cost and increase data center flexibility.

There are several infrastructure challenges faced by most data centers globally: Power, Cooling, Space and Cabling.  In addition to these challenges data centers are constantly driven to adapt more rapidly and do more with less.  Let's take a look at the details of these challenges.

Power:

Power is a major data center consideration.  As data centers have grown and hardware has increased in capacity, power requirements have scaled rapidly.  This large power usage raises concerns about both cost and environmental impact.  Many power companies provide incentives for power reduction due to the limits on the power they can produce.  Additionally many governments provide either incentives for power reduction or mandates for reduced usage, typically in the form of 'green initiatives.'

Power issues within the data center come in two major forms: total power usage, and usage per square meter/foot.  Any given data center can experience either or both of these issues.  Solving one without addressing the other may lead to new problems.

Power problems within the data center as a whole come from a variety of issues such as equipment utilization and how effectively purchased power is used.  A common metric for identifying the latter is Power Usage Effectiveness (PUE.)  PUE is a measure of how much power drawn from the utility company is actually available for the computing infrastructure.  PUE is usually expressed as a ratio X:Y where X is total power draw and Y is power that reaches computing equipment such as switches, servers and storage.  The rest is lost to such things as power distribution, battery backup and cooling.  Typical PUE numbers for data centers average around 2.5:1, meaning 1.5 kW is lost for every 1 kW delivered to the compute infrastructure.  Moving to state-of-the-art designs has brought a few data centers to 1.2:1 or lower.
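To make the ratio concrete, here is a minimal sketch in Python of how PUE works out; the facility numbers are made up for illustration only.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total power drawn from the utility
    divided by the power that actually reaches compute, network and storage."""
    return total_facility_kw / it_equipment_kw

# Hypothetical facility: 2,500 kW drawn from the utility, 1,000 kW reaching IT gear.
ratio = pue(2500, 1000)            # 2.5 -> expressed above as 2.5:1
overhead_per_it_kw = ratio - 1.0   # 1.5 kW lost (cooling, UPS, distribution) per 1 kW delivered
print(ratio, overhead_per_it_kw)
```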

Power per square meter/foot is another major concern and increases in importance as compute density increases.  More powerful servers, switches, and storage require more power to run.  Many data centers were not designed to support modern high-density hardware such as blades and therefore cannot support full-density implementations of this type of equipment.  It's not uncommon to find data centers with either near-empty racks housing a single blade chassis, or extra empty floor space used to spread out sparsely placed fully populated racks.  The same can be said for cooling.

Cooling:

Data center cooling issues are closely tied to the issues with power.  Every watt of power used in the data center must also be cooled, and the coolers themselves in turn draw more power.  Cooling also follows the same two general areas of consideration: cooling as a whole and cooling per square meter/foot.

One of the most common traditional data center cooling methods uses forced-air cooling provided under raised floors.  This air is pushed up through the raised floor into 'cold aisles' with the intake side of equipment facing in.  The equipment draws the air through, cooling internal components, and exhausts it into 'hot aisles' from which it is vented back into the system.  As data center capacity has grown and equipment density has increased, traditional cooling methods have been pushed to or past their limits.

Many solutions exist to increase cooling capacity and/or reduce cooling cost.  Specialized rack and aisle enclosures prevent hot/cold air mixing, hot-spot fans alleviate trouble points, and ambient outside air can be used for cooling in some geographic locations.  Liquid cooling is another promising method of increasing cooling capacity and/or reducing costs.  Many liquids have a higher capacity for storing heat than air, allowing them to more efficiently pull heat away from equipment.  Liquid cooling systems for high-end devices have existed for years, but more and more solutions are being targeted at a broader market.  Solutions such as horizontal liquid racks allow off-the-shelf traditional servers to be fully immersed in mineral-oil-based solutions that have a high capacity for transferring heat and are less electrically conductive than dry wood.

Beyond investing in liquid cooling solutions or moving the data center to northern Washington, there are tools that can be used to reduce data center cooling requirements.  One method that works effectively is running equipment at higher temperatures, reducing cooling cost in exchange for an acceptable reduction in component mean-time-to-failure.  The most effective solution for reducing cooling is reducing infrastructure.  The 'greenest' equipment is the equipment you never bring into the data center; less power drawn equates directly to less cooling required.

Space:

Space is a very interesting issue because it's all about who you are and, more importantly, where you are.  For instance many companies started their data centers in locations like London, Tokyo and New York because that's where they were based.  Those data centers pay an extreme premium for the space they occupy.  Using New York as an example, many of those companies could save substantially every month by moving the data center across the Hudson with little to no loss in performance.

That being said, many data centers require high-dollar space because of location.  As an example, market data is all about latency (the time to receive or transmit data): every microsecond counts.  These data centers must be in financial hubs such as London and New York.  Other data centers may pay less per square meter/foot but could still reduce costs by reducing space.  In either event, reducing space reduces overhead/cost.

Cabling:

Cabling is often a pain point understood by administrators but forgotten by management.  Cabling nightmares have become an accepted norm of rapid change in a data center environment.  The reason cabling has such potential for neglect is that it's been an unmanaged and/or poorly understood problem.  Engineers tend to forget that a 'rat's nest' of cables behind the servers/switches or under the floor tiles hinders cooling efficiency.  To understand this, think of the back of the last real-world server rack you saw and the cables connecting those servers.  Take that thought one step further and think about the cables under the floor blocking what may be the primary cold air flow.

When thinking about cabling it's important to remember the key points: each cable has a purchase cost, each cable has a power cost, and each cable has a cooling cost.  Even without complex metrics to quantify those three in total, it's easy to see that reducing cables reduces cost.

Taking all four of those factors into account and producing a solution that provides benefits for each is the goal of cloud computing.  If you solve one problem by itself you will most likely aggravate another.  Cloud computing is a tool to reduce infrastructure and cabling everywhere from the Small-to-Medium Business (SMB) all the way up to the global enterprise.  At the same time cloud infrastructures support faster adoption times for business applications.  Say it how you will, but 'cloud' has the potential to reduce cost while improving 'time-to-market,' 'business agility,' 'data-center flexibility' or any other term you'd like to apply.  Cloud is simply the concept of rethinking the way we do IT today in order to meet the challenges of the way we do business today.  If right now you're asking 'why aren't we/they all doing it,' then stay tuned for my next post on the challenges of adopting cloud architectures.


Storage Protocols

Storage is a major consideration for cloud initiatives; what type of disk, which vendor, and as importantly which protocol?  Experts will tout one over the other based on cost, performance, throughput, etc.  Let’s take a look at the major storage protocols at play in the data center:

Small Computer System Interface (SCSI):

SCSI is the dominant block-level access method for disk in the data center.  Blocks are typically the smallest unit that can be read from or written to on a disk, and they exist in various sizes depending on disk type and usage.  Block-level access means that the server can directly access the disk blocks without the need for a file system on top of them; this is the opposite of the file-based storage discussed later.
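As a rough illustration of that difference, the sketch below reads a raw block straight from a disk device and then reads through a file system.  This is Python on a Unix-like host; the device path is an assumption and raw device access requires root privileges.

```python
import os

# Block-level access: read the first 512-byte block of a raw device.
# '/dev/sda' is an assumed device name; this needs root privileges.
fd = os.open("/dev/sda", os.O_RDONLY)
raw_block = os.pread(fd, 512, 0)   # arguments: (fd, length, offset)
os.close(fd)

# File-level access: the file system (ext4, NTFS, etc.) decides which blocks
# to touch; the application only sees the bytes of a named file.
with open("/etc/hostname", "rb") as f:
    file_bytes = f.read()

print(len(raw_block), len(file_bytes))
```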

SCSI has been in use since the early 1980s and was originally used to move data within a single server.  The operating system writes data using the SCSI protocol to a SCSI drive controller, which manages one or more devices on a SCSI cable within a system chassis.  The SCSI controller ensures that only one device is active on the cable at any time, which prevents contention on the SCSI bus.  Because SCSI was managed by a single controller and contained within a system, the chances of data loss or contention were minimal; this meant that SCSI did not require control mechanisms to handle data loss or contention as networked protocols do.  SCSI itself is still widely used in its native format, but it has also been encapsulated into other protocols for use within storage networks for consolidated storage.

Fibre Channel (FC):

Fibre Channel was designed to extend the functionality of SCSI into point-to-point, loop, and switched topologies.  This allows for longer distances as well as storage consolidation.  FC encapsulates SCSI data and Command Descriptor Blocks (CDBs) into the payload of Fibre Channel frames.  Fibre Channel networks provide the addressing, routing, and flow control required to support SCSI data.  Additionally Fibre Channel networks are designed to meet the needs of SCSI by providing 'lossless,' in-order delivery.  This means that in a stable network FC frames will not be dropped and are delivered in order, ensuring that the Upper Layer Protocols (ULPs) will not be forced to reorder or resend frames.

Fibre Channel networks are typically carried over fiber-optic links on dedicated infrastructures.  These infrastructures are traditionally built in pairs as exact mirrors of one another.  This provides complete physical redundancy end-to-end.  Additionally these networks provide high bandwidth and low latency.  FC networks come in 1/2/4/8 Gbps speeds with 16/32 Gbps in the works.  Additionally, 10 Gbps FC links are typically available on a proprietary basis for links between switches.

Internet Small Computer System Interface (iSCSI):

iSCSI takes SCSI data and CDBs and places them in the payload of IP packets.  This allows the SCSI protocol to be extended across existing IP infrastructures.  While IP is routable within the data center and across the WAN, iSCSI is not traditionally used/supported across routed boundaries (exceptions do exist.)  The draw of iSCSI has been that storage data can be extended across the existing infrastructure with minimal additional cost.

iSCSI has not gained the market share many have predicted over the years due to flaws in the protocol and limitations of traditional Ethernet-based data center networks.  Until the standardization of 10 Gigabit Ethernet most data centers relied on 1GE links which were typically saturated already.  This meant implementing iSCSI required new switching infrastructure.  10GE has changed the bandwidth limits but has still not catapulted iSCSI into the mainstream.  There are several reasons for this: one being the large existing investment in Fibre Channel, and two being the iSCSI protocol itself.

The problem with iSCSI from a protocol standpoint is that it takes the SCSI protocol, which expects lossless, in-order delivery, and places it in TCP/IP packets, which are designed to support heterogeneous WAN networks and frequently experience packet loss and out-of-order delivery.  This is done without providing any additional tools to either SCSI or TCP/IP for handling the SCSI payloads in the expected fashion.  This in no way means iSCSI is unusable or should be written off; it just means that additional considerations must be made when designing iSCSI, especially in Enterprise or larger environments.

In order to provide proper performance for iSCSI on shared networks Quality of Service (QoS), physical architecture, and jumbo frame support must be taken into account.  Because of these considerations many iSCSI networks have traditionally been placed on separate network hardware from the data center LAN (isolated iSCSI networks.)  This has minimized some of the benefits of consolidating on a single protocol.  With 10 Gigabit Ethernet and the standardization of Data Center Bridging (DCB) iSCSI looks more promising for a greater audience.  For more information on DCB see my previous post (http://www.definethecloud.net/?p=31.)

Fibre Channel over Ethernet (FCoE):

FCoE was ratified in 2009 and provides the functionality for moving native Fibre Channel across consolidated Ethernet networks.  FCoE relies on the DCB standards referenced above.  FCoE encapsulates full Fibre Channel frames inside Ethernet jumbo frame payloads.  Utilizing jumbo frames ensures that the FC frame is not fragmented or changed in any way.  The FCoE and DCB standards provide a robust tool set for consolidating existing Fibre Channel workloads on shared 10GE networks while providing the lossless, in-order delivery SCSI expects.  FCoE does not modify the existing Fibre Channel protocol suite and allows for the same management model including zoning, LUN masking, etc.  FCoE has started gaining ground over the last two years, pushed by several large hardware vendors in the storage, network, and server markets.  For more information on FCoE see my post (http://www.definethecloud.net/?p=80.)

Common Internet File System (CIFS):

CIFS is a file-based storage protocol based on Server Message Block (SMB.)  It is a shared storage protocol typically used in Microsoft environments for file sharing.  Windows-based file shares rely on CIFS as the transfer protocol for file-level data.  File-based storage relies on an underlying file system such as FAT32, XFS, or NTFS, which differs from block-based storage, which does not.  File-level storage is an excellent medium for some applications but is not traditionally effective for others.  When an application needs direct block access to disk, file-based storage is not appropriate.  Deployments that fall into this category include some databases and most operating systems.

Network File System (NFS):

NFS is another file based storage protocol.  NFS is traditionally used in Linux and Unix environments.  NFS is also a widely used protocol for VMware environments and can offer several benefits for virtual machine storage.  As a file based storage protocol NFS experiences many of the same limitations as stated for CIFS above.

Hyper Text Transfer Protocol (HTTP) and others:

When the cloud discussion leaves the data center (private/internal cloud) and moves up to the service provider level, such as Google, Amazon, or the telcos, the protocols listed above may not have the necessary scalability.  When you begin talking about supporting thousands of customers with multiple terabytes each, traditional storage protocols may not suffice.  It comes down to both the scalability of the systems and the administration of the disk.  iSCSI and FC both require a fair amount of management for the RAID sets, volumes, and LUNs, whereas CIFS and NFS require a fair amount for security and volumes.  HTTP-based storage protocols are being used to simplify storage configuration and increase its scalability.
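As a loose illustration of what HTTP-based (object) storage access looks like, here is a sketch using only Python's standard library against a hypothetical endpoint; the host, bucket and object key are assumptions, and real services layer authentication headers on top of this.

```python
import http.client

HOST = "storage.example.com"   # hypothetical object-storage endpoint
conn = http.client.HTTPSConnection(HOST)

# Store an object: no RAID groups, LUNs or volumes to manage up front,
# just a key and a payload.
conn.request("PUT", "/my-bucket/backups/db.dump", body=b"...payload...")
put_resp = conn.getresponse()
put_resp.read()                 # drain the response before reusing the connection
print("PUT status:", put_resp.status)

# Retrieve the object later by the same key.
conn.request("GET", "/my-bucket/backups/db.dump")
get_resp = conn.getresponse()
data = get_resp.read()
conn.close()
```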

Which is the right protocol to use when moving to the cloud?  Obviously there is only one answer!  As always in IT, 'it depends.'  Each protocol has its uses, benefits and drawbacks.  The most important thing to remember is that most environments can benefit from more than one, or all, of these protocols.  Every application is different and any given protocol may have advantages for a particular app.  The only universal truth in cloud storage is that protocol flexibility will be key.


Virtualization

While not a new concept, virtualization has hit the mainstream over the last few years and become an uncontrollable buzzword driven by VMware and other server virtualization platforms.  Virtualization has been around in many forms for much longer than some realize; things like Logical Partitions (LPARs) on IBM mainframes have been around since the 80s and have been extended to other non-mainframe platforms.  Networks have been virtualized by creating VLANs for years.  The virtualization term now gets used for all sorts of things in the data center.  Love it or hate it, the term doesn't look like it's going away anytime soon.

Virtualization in all of its forms is a pillar of Cloud Computing, especially in the private/internal cloud architecture.  To define it loosely for the purpose of this discussion let's use 'the ability to divide a single hardware device or infrastructure into separate logical components.'

Virtualization is key to building cloud-based architectures because it allows greater flexibility and utilization of the underlying equipment.  Rather than requiring separate physical equipment for each 'tenant,' multiple tenants can be separated logically on a single underlying infrastructure.  This concept is also known as 'multi-tenancy.'  Depending on the infrastructure being designed, a tenant can be an individual application, an internal team/department, or an external customer.  There are three areas to focus on when discussing a migration to cloud computing: servers, network, and storage.

Server Virtualization:

Within the x86 server platform (typically the Windows/Linux environment) VMware is the current server virtualization leader.  Many competitors exist, such as Microsoft's Hyper-V and Xen for Linux, and they are continually gaining market share.  The most common form of server virtualization allows a single physical server to be divided into logical subsets by creating virtual hardware; this virtual hardware can then have an operating system and application suite installed and will operate as if it were an independent server.  Server virtualization comes in two major flavors: bare-metal virtualization and OS-based virtualization.

Bare-metal virtualization means that a lightweight virtualization-capable operating system is installed directly on the server hardware and provides the functionality to create virtual servers.  OS-based virtualization operates as an application or service within an OS, such as Microsoft Windows, that provides the ability to create virtual servers.  While both methods are commonly used, bare-metal virtualization is typically preferred for production use due to its reduced overhead.

Server virtualization provides many benefits but the key benefits to cloud environments are: increased server utilization, and operational flexibility.  Increased utilization means that less hardware is required to perform the same computing tasks which reduces overall cost.  The increased flexibility of virtual environments is key to cloud architectures.  When a new application needs to be brought online it can be done without procuring new hardware, and equally as important when an application is decommissioned the physical resources are automatically available for use without server repurposing.  Physical servers can be added seamlessly when capacity requirements increase.
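To make that flexibility concrete, here is a rough sketch of managing virtual servers programmatically.  It assumes the libvirt Python bindings and a local KVM/QEMU host, which are my assumptions rather than anything named in the post; VMware, Hyper-V and Xen expose comparable management APIs.

```python
import libvirt  # pip install libvirt-python; assumes a local hypervisor is running

# Connect to the hypervisor on this physical host (KVM/QEMU shown here).
conn = libvirt.open("qemu:///system")

# Each 'domain' is a virtual server carved out of the same physical hardware.
for dom in conn.listAllDomains():
    state, max_mem_kib, mem_kib, vcpus, cpu_time_ns = dom.info()
    print(dom.name(), "vCPUs:", vcpus, "memory KiB:", mem_kib)

conn.close()
```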

Network Virtualization:

Network virtualization comes in many forms.  VLANs, LSANs, and VSANs allow a single physical LAN or SAN architecture to be carved up into separate networks without dependence on the physical connections.  Virtual Routing and Forwarding (VRF) allows separate routing tables to be used on a single piece of hardware to support different routes for different purposes.  Additionally, technologies exist which allow single network hardware components to be virtualized in a similar fashion to what VMware does on servers.  All of these tools can be used together to provide the proper underlying architecture for cloud computing.  The benefits of network virtualization are very similar to those of server virtualization: increased utilization and flexibility.

Storage Virtualization:

Storage virtualization encompasses a broad range of topics and features.  The term has been used to define anything from the underlying RAID configuration and partitioning of the disk to things like IBM's SVC and NetApp's V-Series, both used for managing heterogeneous storage.  Without getting into what's right and wrong when talking about storage virtualization, let's look at what is required for cloud.

First, consolidated storage itself is a big part of cloud infrastructures in most applications.  Having the data in one place to manage can simplify the infrastructure, and it also increases the feature set, especially when virtualizing servers.  At a top level, when looking at storage for cloud environments there are two major considerations: flexibility and cost.  The storage should have the right feature set and protocol options to support the initial design goals, and it should also offer the flexibility to adapt as business requirements change.  Several vendors offer great storage platforms for cloud environments depending on the design goals and requirements.  Features that are typically useful for the cloud (and sometimes lumped into virtualization) are:

De-Duplication – Maintaining a single copy of duplicate data, reducing overall disk usage.

Thin-provisioning – Optimizes disk usage by allowing disks to be assigned to servers/applications based on predicted growth while consuming only the used space.  Allows for applications to grow without pre-consuming disk.

Snapshots – A low-disk-usage point-in-time record of data which can be used in operations like point-in-time restores.
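As a toy illustration of the de-duplication idea above (not any vendor's implementation), the sketch below hashes fixed-size blocks and stores each unique block only once; the block size and data are made up for the example.

```python
import hashlib

BLOCK_SIZE = 4096          # assumed block size for the example
block_store = {}           # fingerprint -> actual block data (stored once)

def write_with_dedup(data: bytes) -> list:
    """Split data into blocks, keep one copy per unique block,
    and return the list of fingerprints that reference them."""
    refs = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()
        block_store.setdefault(fp, block)   # duplicate blocks are not stored again
        refs.append(fp)
    return refs

# Two "files" sharing most of their content consume little extra space.
a = write_with_dedup(b"A" * 8192 + b"unique-1")
b = write_with_dedup(b"A" * 8192 + b"unique-2")
print("blocks referenced:", len(a) + len(b), "blocks stored:", len(block_store))
```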

Overall, virtualization from end to end is the foundation of cloud environments, allowing for flexible, highly utilized infrastructures.


Data Center Bridging

Data Center Bridging (DCB) is a group of IEEE standard protocols designed to support I/O consolidation.  DCB enables multiple protocols with very different requirements to run over the same Layer 2 10 Gigabit Ethernet infrastructure.  Because DCB is currently discussed along with Fibre Channel over Ethernet (FCoE), it's not uncommon for people to think of DCB as part of FCoE.  This is not the case; while FCoE relies on DCB for proper treatment on a shared network, DCB enhancements can be applied to any protocol on the network.  DCB support is being built into data center hardware and software from multiple vendors and is fully backwards compatible with legacy systems (no forklift upgrades.)  For more information on FCoE see my post on the subject (http://www.definethecloud.net/?p=80.)

Network protocols typically have unique requirements in regards to latency, packet/frame loss, bandwidth, etc.  These differences have a large impact on the performance of the protocol in a shared environment.  Differences such as flow-control and frame loss are the reason Fibre Channel networks have traditionally been separate physical infrastructures from Ethernet networks.  DCB is the set of tools that allows us to converge these networks without sacrificing performance or reliability.

Let's take a look at the DCB suite:

Priority Flow Control (PFC) 802.1Qbb:

PFC is a flow-control mechanism designed to eliminate frame loss for specific traffic types on Ethernet networks.  Protocols such as the Small Computer System Interface (SCSI), which is used for block data storage, are very sensitive to data loss.  The SCSI protocol is the heart of Fibre Channel, which is a tool used to extend SCSI from internal disk to centralized storage across a network.  In its native form on dedicated networks, Fibre Channel has tools to ensure that frames are not lost as long as the network is stable.  In order to move Fibre Channel across Ethernet networks that same 'lossless' behavior must be guaranteed; PFC is the tool to do that.

PFC uses a pause mechanism to allow a receiving device to signal a pause to the directly connected sending device prior to buffer overflow and packet loss.  While Ethernet has had a tool to do this for some time (802.3x pause) it has always been at the link level.  This means that all traffic on the link would be paused, rather than just a selected traffic type.  Pausing a link carrying various I/O types would be a bad thing, especially for traffic such as IP Telephony and streaming video.  Rather than pause an entire link PFC sends a pause signal for a single Class of Service (CoS) which is part of an 802.1Q Ethernet header.  This allows up to 8 classes to be defined and paused independent of one another.
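For reference, the Class of Service referred to here is the 3-bit Priority Code Point (PCP) field in the 802.1Q tag.  The minimal Python sketch below packs such a tag just to show where those 8 classes live; the CoS and VLAN values are illustrative only.

```python
import struct

def dot1q_tag(pcp: int, dei: int, vlan_id: int) -> bytes:
    """Build the 4-byte 802.1Q tag: a 16-bit TPID (0x8100) followed by the
    Tag Control Information: 3-bit PCP (the CoS that PFC can pause),
    1-bit DEI and a 12-bit VLAN ID."""
    assert 0 <= pcp <= 7 and dei in (0, 1) and 0 <= vlan_id <= 4095
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", 0x8100, tci)

# e.g. storage traffic marked with CoS 3 on VLAN 100 can be paused by PFC
# without touching voice or bulk-data classes sharing the same 10GE link.
print(dot1q_tag(pcp=3, dei=0, vlan_id=100).hex())
```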

Congestion Management (802.1Qau):

When we begin pausing traffic in a network we have the potential to spread network congestion by creating choke points.  Imagine trying to drive past a football stadium (football or American football, pick your flavor) when the game is about to start.  You're stuck in gridlock even though you're not going to the game; if you've got that image you're on the right track.  Congestion management is a set of signaling tools used to push that congestion out of the network core to the network edge (if you're thinking old-school FECN and BECN you're not far off.)

Bandwidth Management (802.1Qaz):

Bandwidth management is a tool for simple, consistent application of bandwidth controls at Layer 2 on a DCB network.  It allows a specific traffic type to be guaranteed a percentage of available bandwidth based on its CoS.  For instance, on a 10GE network access port utilizing FCoE you could guarantee 40% of the bandwidth to FCoE.  This provides a 4Gb tunnel for FCoE when needed but allows other traffic types to utilize that bandwidth when it is not in use for FCoE.
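A small worked example of that allocation arithmetic on a 10GE link; the class names and percentages are illustrative, not a recommendation.

```python
LINK_GBPS = 10
# Guaranteed minimums per traffic class; unused share can be borrowed by other classes.
allocation = {"FCoE": 40, "IP data": 40, "iSCSI/backup": 20}

assert sum(allocation.values()) == 100
for traffic_class, pct in allocation.items():
    print(f"{traffic_class}: guaranteed {LINK_GBPS * pct / 100:.0f} Gbps")
```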

Data Center Bridging Exchange (DCBX):

DCBX is a Layer 2 communication protocol that allows DCB capable devices to communicate and discover the edge of the DCB network, i.e. legacy devices.  DCBX not only allows passing of information but provides tools for passing configuration.  This is key to the consistent configuration of DCB networks.  For instance a DCB switch acting as a Fibre Channel over Ethernet Forwarder (FCF) can let an attached Converged Network Adapter (CNA) on a server know to tag FCoE frames with a specific CoS and enable pause for that traffic type.

All in all the DCB features are key enablers for true consolidated I/O.  They provide a tool set for each traffic type to be handled properly independent of other protocols on the wire.  For more information on Consolidated I/O see my previous post Consolidated IO (http://www.definethecloud.net/?p=67.)


Business Drivers for Cloud Infrastructures

There are several business challenges that drive the cloud discussion and the cloud infrastructure market.  These business challenges are very different from the technical challenges that are more commonly discussed along with cloud.  It's key to differentiate between the two because typically only one or the other is relevant to any given audience.  If you're talking to an engineer, something like hardware redundancy is quite relevant, but that same concept isn't relevant to an end-user or CxO.

For this discussion we'll focus on business drivers for cloud and save technical demands for a later time.  While thinking about business demands you'll want to put the data center as a whole in perspective from a business standpoint.  Put on a CxO hat for a minute and decide what the data center means to you.  If you're thinking like many CxOs, you're thinking of the data center as a cost center, not much different from the cost of paying the lease on a building or paying taxes.  It's a necessary expense of doing business.

Recently this has been very true.  For instance, the business needs a way to communicate more quickly than the typed memo, so it invests in an email system; the email system is a cost no different from the paper and ink required for the memos.  This wasn't always the case.  Originally Information Technology (IT) was a competitive advantage; remember way back when not everybody had a data center infrastructure?  Back then building a server or network for a business application gave you an edge; lately it's more about keeping up with the Joneses, who by the way are very hard to keep up with.  That brings us to our first business driver for cloud:

Competitive Advantage: The ability to do something better, faster, or at lower cost than the competition.

Applying that to the cloud: If my competition is thinking/building their IT infrastructure in the traditional methods and paying the price for it what can I do to improve on that?

Now let's look for some other business drivers, and let's grab the easy ones ('low-hanging fruit.')  Nearly every business on earth has one common goal: 'grow the business.'  There are few if any businesses that hit a certain size and say 'This is just right, let's stop right here!'  That only works for Goldilocks.  So to put this in simple terms let's assume all businesses want the ability to 'scale.'  Now that seems easy enough, but let's take the idea one step further: in a good economy I may want to scale out (grow), in a bad economy I may want to scale in (focus on core competencies.)  With that in mind let's move on to our next business objective:

Ability to scale the business (out and in):  Being able to deploy business applications on demand and retire them when needs change.

Applying that to the cloud: I need to bring new business initiatives online quickly and decommission non-profitable initiatives on-demand.

So now we have two business drivers, and while there are many more we don't have time for a comprehensive list.  Let's look at one more that is nearly ubiquitous.  In most companies globally, privately held or publicly traded, there is one major focus and that is profit.  Profit is what can go into the owners' pockets or increase the share value.  Profit is what's left over after all of the business costs.  What's an easy way to increase profit?  Reduce cost.

Reduce Costs:  Reducing the amount spent to run the business.  If the goal is increasing profits then costs must be reduced without sacrificing revenue (total amount of money received by a company for goods or services sold.)

Applying that to the cloud: I need to reduce IT overhead without sacrificing business revenue.

So three of the major business drivers that push the various cloud initiatives are: Competitive Advantage, Ability to scale, and Reduction in cost.  These are the real reasons people are looking to cloud architectures of all shapes and sizes in order to redesign the way IT is done.

The most important concept is that cloud is retooling the way we think of IT.  If you think in terms of ‘How can I improve upon the way I run IT now’ you’ll miss the mark.  In order to gain the maximum benefits from cloud infrastructures you need to think ‘What am I trying to do and what’s the best way to do that.’


Consolidated I/O

Consolidated I/O (input/output) is a hot topic and has been for the last two years, but it's not a new concept.  We've already consolidated I/O once in the data center and forgotten about it; remember those phone PBXs before we replaced them with IP telephony?  The next step in consolidating I/O comes in the form of getting management traffic, backup traffic, and storage traffic from centralized storage arrays to the servers onto the same network that carries our IP data.  In the most general terms the concept is 'one wire.'  'Cable once' or 'one wire' allows a flexible I/O infrastructure with a greatly reduced cable count and a single network to power, cool and administer.

Solutions have existed and been used for years to do this; iSCSI (SCSI storage data over IP networks) is one tool that has been commonly used.  The reason the topic has hit the mainstream over the last two years is that 10 Gigabit Ethernet was ratified and we now have a common protocol with the proper bandwidth to support this type of consolidation.  Prior to 10GE we simply didn't have the bandwidth to effectively put everything down the same pipe.

The first thing to remember when discussing I/O consolidation is that, contrary to popular belief, I/O consolidation does not mean Fibre Channel over Ethernet (FCoE.)  I/O consolidation is all about using a single infrastructure and underlying protocol to carry any and all traffic types required in the data center.  The underlying protocol of choice is 10G Ethernet because it's lightweight, high bandwidth, and Ethernet itself is the most widely used data center protocol today.  Using 10GE and the IEEE standards for Data Center Bridging (DCB) as the underlying data center network, any and all protocols can be layered on top as needed on a per-application basis.  See my post on DCB for more information (http://www.definethecloud.net/?p=31.)  These protocols can be FCoE, iSCSI, UDP, TCP, NFS, CIFS, etc., or any combination of them all.

If you look at the data center today most are already using a combination of these protocols, but typically have 2 or more separate infrastructures to support them.  A data center that uses Fibre Channel heavily has two Fibre Channel networks (for redundancy) and one or more LAN networks. These ‘Fibre Channel shops’ are typically still using additional storage protocols such as NFS/CIFS for file based storage.  The cost of administering, powering, cooling, and eventually upgrading/refreshing these separate networks continues to grow.

Consolidating onto a single infrastructure not only provides obvious cost benefits but also provides the flexibility required for a cloud infrastructure.  Having a ‘Cable Once’ infrastructure allows you to provide the right protocol at the right time on an application basis, without the need for hardware changes.

Call it what you will: I/O consolidation, network convergence, or network virtualization.  A cable-once topology that can support the right protocol at the right time is one of the pillars of cloud architectures in the data center.


What’s a cloud?

So to start things off I thought I'd take a stab at defining the cloud.  This is always an interesting subject because so many people have placed very different labels and definitions on the cloud.  YouTube is filled with videos of high-dollar IT talking heads spitting up nonsensical answers as to what cloud is, or in many cases diverting the question and discussing something they understand.  So before we get into what it is, let's talk about why it's so hard to define.

Part of the difficulty in defining cloud comes from the fact that the term gets its power from being loosely defined.  If you put a strict definition on it, the term becomes useless.  For example, put yourself in the shoes of an IT vendor account manager (sales rep); if you are an account manager, stay in your own shoes for the next exercise and forget I said sales rep.

Now from those shoes imagine yourself in a meeting with a CxO discussing the data center.  A question such as 'Have you looked into implementing a cloud strategy, and if so what are your goals?' can be quite powerful.  It's an open-ended question that leaves plenty of room for discussion.  Within that discussion there is a large opportunity to identify business challenges and begin to narrow down solutions, which equates to potential product sales to meet the customer's requirements.

If cloud had a strict definition such as ‘Providing an off-site hosted service at a subscription rate’ it would only be applicable to a handful of customers.  Any strict definition of cloud based on size, location, infrastructure requirements, etc. reduces the overall usability of the term.  This doesn’t imply that cloud is just a sales term and should be ignored, but it does make the definition more complicated.

From a sales perspective the value of cloud is the flexibility of the term, which is quite interestingly one of the technical values to the customer (we’ll discuss that later.)  From a customer perspective the real challenge is defining what it means to you.  Service providers such as BT and AT&T will have very different definitions of cloud from Amazon and Google.  Amazon and Google will define cloud differently than SalesForce.com, and they’ll all have totally different definitions than the average enterprise customer.  This is because the definition of cloud is in the eye of the beholder, it’s all about how the concept can be effectively applied to the individual business demands.

Part of the reason engineers tend to cringe when they hear the term cloud is that it has no real meaning to an engineer.  Engineers work with defined terms that are quantifiable: bandwidth, bus speed, port count, disk space.  If you use any of those terms in your definition you've already missed the point.  To make matters worse, the more cloud gets discussed the more confusing it becomes from an engineering standpoint; we started with just cloud and now have private cloud, internal cloud, public cloud, secure cloud, hybrid cloud, and semi-private cloud.  It's akin to LAN, MAN, WAN, CAN, PAN, GAN to describe various 'area networks,' except at least those can move data and cloud is just a term.

Cloud is not an engineering term, cloud is a business term.

Cloud does not solve IT problems, cloud solves business problems.

So to bring a definition to the term let's stay at 10,000 feet and think conceptually, because that's what it's really all about.  Think about the last time you saw a PDF or drew a whiteboard diagram that discussed moving data across the Internet.  Did you draw the complex series of routers and switches, optical muxes and demuxes that moved your packet from London to Beijing?  Of course not, but what did you draw instead, and why?  If you're like most of the world you drew a cloud, and you drew that cloud because you don't care about the underlying infrastructure or complexity.  You know that the web has already been built and it works.  You know that if you put a packet in one end, it will eventually come out the other.  You don't care how it gets there, only that it does.  The term cloud is no more complex than that; it's all about putting together an infrastructure that gets the job done without having to dwell on the underlying components.

The point of moving to a cloud architecture is a rethinking of what the data center really does and how it does it in order to alleviate current data center issues without causing new ones.  Start by asking what is the purpose of the data center, and really take some time to think about it.  The entire data center, from the input power through the distribution system and UPS, the cooling, the storage, the network, and the servers are all there to do one thing, run applications.  The application truly is the center of the data center universe.  If you’ve ever been on a help desk team and got a call from a user saying ‘the network is slow’ it wasn’t because they ran a throughput test and found a bottleneck, it was because their email wouldn’t load or it took them an extra 15 seconds to access a database.  The data center itself is there to support the applications that run the business.

That tends to be a hard pill to swallow for people who have spent their lives in networks, or storage, or even server hardware because in reality the only Oscar they could ever receive is best supporting actor, applications are the star of the show.  No company built a network and then decided they should find an application to use up some of that bandwidth.  We’ve built our infrastructures to support the applications we choose to run our business, and that’s where the problems came from.

We've built data center infrastructures one app at a time as our businesses grew, adding on where necessary and removing when/if possible.  What we've ended up with is a Frankenstein-style mess of siloed architecture and one-trick-pony infrastructure.  We've consolidated, scaled in or out, and virtualized, all to try to fix this, and we've failed.  We've failed because we've taken a view of the current problems while wearing blinders that prevented seeing the big picture.  The big picture is that businesses require apps, apps come and go, and apps require infrastructure.  The solution is building a flexible, available, high-performance platform that adapts to our immediate business needs.  A dynamic infrastructure that ties directly to the business needs without the need for procurement or decommissioning when an app comes up or hits its end-of-life.  That applies to any type of cloud you care to talk about.

In future blogs I’ll describe the technical and business drivers behind cloud solutions, the types of clouds, the risks of cloud, and the technology that enables a move to cloud.  I’ll do my best to remain vendor neutral and keep my opinions out of it as much as possible.  I welcome any and all feedback, comments and corrections.

The cloud doesn’t have to be a pig
