The Cloud Storage Argument

The argument over the right type of storage for data center applications is an ongoing battle, and it only gets louder when discussing cloud architectures, both private and public.  Part of the reason for this disparity in thinking is that there is no 'one size fits all' solution.  The other part of the problem is that there may not be a single right solution today at all.

When we discuss modern enterprise data center storage options there are typically five major choices:

  • Fibre Channel (FC)
  • Fibre Channel over Ethernet (FCoE)
  • Internet Small Computer System Interface (iSCSI)
  • Network File System (NFS)
  • Direct Attached Storage (DAS)

In a Windows server environment these will typically be coupled with Common Internet File System (CIFS) for file sharing.  Behind these protocols sits a range of storage arrays and disk types that can be used to meet the application's I/O requirements.

As people move from traditional server architectures to virtualized servers, and from static physical silos to cloud-based architectures, they will typically move away from DAS to one of the other protocols listed above to gain the advantages, features, and savings associated with shared storage.  For the purpose of this discussion we will focus on the other four: FC, FCoE, iSCSI, and NFS.

The issue then becomes which storage protocol to use to transport your data from the server to the disk.  I've discussed the protocol differences in a previous post (http://www.definethecloud.net/?p=43) so I won't go into the details here.  Depending on who you're talking to, it's not uncommon to find extremely passionate opinions.  There are quite a few consultants and engineers who are hard coded to one protocol or another.  That being said, most end-users just want something that works, performs adequately, and isn't a headache to manage.

Most environments currently run on a combination of these protocols: plenty of FC data centers rely on DAS to boot the operating system and on NFS/CIFS for file sharing, and the same can be said for iSCSI shops.  With current options a combination of these protocols is probably always going to be best; iSCSI, FCoE, and NFS/CIFS can be used side by side to provide the right performance at the right price on an application-by-application basis.

The one definite fact in all of the opinions is that running separate parallel networks for FC and Ethernet, as we do today, is not the way to move forward; it adds cost, complexity, management, power, cooling, and infrastructure that isn't needed.  Combining protocols onto one wire is key to the flexibility and cost savings promised by end-to-end virtualization and cloud architectures.  If that's the case, which wire do we choose, and which protocol rides directly on top to transport the rest?

10 Gigabit Ethernet is currently the industry's push for a single wire, and with good reason:

  • It currently has enough bandwidth/throughput to do it (10 gigabits using 64b/66b encoding, as opposed to FC/InfiniBand which currently use 8b/10b with 20% overhead; see the quick calculation after this list)
  • It's scaling fast: 40GE and 100GE are well on their way to standardization (as opposed to 16G and 32G FC)
  • Everyone already knows and uses it, and yes, that includes you.
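
As a quick back-of-the-envelope comparison of the two line encodings (signalling rate only, ignoring framing and upper-layer protocol overhead):

    # Back-of-the-envelope comparison of line-encoding efficiency.
    # Signalling rate only; framing and upper-layer overhead are ignored.

    def usable_gbps(signalling_rate_gbps, data_bits, coded_bits):
        """Data throughput left after line encoding."""
        return signalling_rate_gbps * data_bits / coded_bits

    # 8b/10b: every 8 data bits are sent as 10 bits on the wire (20% overhead)
    eight_ten   = usable_gbps(10, 8, 10)    # 8.0 Gbps usable

    # 64b/66b: every 64 data bits are sent as 66 bits (~3% overhead)
    sixty_four  = usable_gbps(10, 64, 66)   # ~9.7 Gbps usable

    print(f"8b/10b  at a 10G signalling rate: {eight_ten:.1f} Gbps usable")
    print(f"64b/66b at a 10G signalling rate: {sixty_four:.2f} Gbps usable")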

For the sake of argument let's assume we all agree on 10GE as the right wire/protocol to carry all of our traffic.  What do we layer on top: FCoE, iSCSI, NFS, something else?  Well, that is a tough question.  The first part of the answer is that you don't have to decide, which is very important because none of these protocols is mutually exclusive.  The second part of the answer is that maybe none of these is the end-all-be-all long-term solution.  Each current protocol has benefits and drawbacks, so let's take a quick look:

  • iSCSI: Block level protocol carrying SCSI over TCP/IP.  It works with standard Ethernet but can have performance issues on congested networks and also incurs IP protocol overhead.  iSCSI is great on standard Ethernet networks until congestion occurs; once the network becomes fully utilized, iSCSI performance will tend to drop.
  • FCoE: Block level protocol which maintains Fibre Channel reliability and security while using underlying Ethernet.  Requires 10GE or above and DCB-capable switches (http://www.definethecloud.net/?p=31).  FCoE is currently well proven and reliable at the access layer and a fantastic option there, but no current solutions exist to move it further up into the network.  Products are on the roadmap to push FCoE deeper into the network, but that may not necessarily be the best way forward.
  • NFS: File level protocol which runs on top of TCP or UDP and IP.  Lower management overhead than the block options, but it cannot provide the direct block access that some applications and OS boot require.

And a quick look at comparative performance:

[Image: Protocol Performance comparison]

While the above performance model is subjective, and network tuning and specific equipment will play a big role, the general idea holds.

One of the biggest factors that needs to be considered when choosing between these protocols is block vs. file.  Some applications require direct block access to disk; many databases fall into this category.  Just as importantly, if you want to boot an operating system from disk, a block level protocol (iSCSI, FCoE) is required.  This means that for most diskless configurations you'll need to make a choice between FCoE and iSCSI (still within the assumption of consolidating on 10GE).  Diskless configurations have major benefits in large scale deployments, including power, cooling, administration, and flexibility, so you should at least be considering them.

If you've chosen a diskless configuration and settled on iSCSI or FCoE for your boot disks, you still need to figure out what to do about file shares.  CIFS or NFS is your next decision: CIFS is typically the choice for Windows, and NFS for Linux/UNIX environments.  Now you've wound up with 2-3 protocols running just to get your storage settled, and you're stacking those alongside the rest of your typical LAN data.

Now, to look at management, step back and take a look at block data as a whole.  If you're using enterprise class storage you've got several steps of management to configure the disk in that array.  It varies with vendor, but it is typically something to the effect of the steps below (a hypothetical sketch of the workflow follows the list):

  1. Configure the RAID for groups of disks
  2. Pool multiple RAID groups
  3. Logically sub divide the pool
  4. Assign the logical disks to the initiators/servers
  5. Configure required network security (FC zoning/ IP security/ACL, etc)

While this is easy stuff for storage and SAN administrators, it's time consuming, especially when you start talking about cloud infrastructures with lots and lots of moves, adds, and changes.  It becomes way too cumbersome to scale into petabytes with hundreds or thousands of customers.  NFS has more streamlined management but it can't be used to boot an OS.  This makes for extremely tough decisions when looking to scale into large virtualized data center architectures or cloud infrastructure.

There is a current option that allows you to consolidate on 10GE, reduce storage protocols, and still get diskless servers.  It's definitely not the solution for every use case (there isn't one), and it's only a great option because there aren't a whole lot of other great options.

In a fully virtualized environment NFS is a great low-management-overhead protocol for virtual machine disks.  Because it can't boot the server, we need another way to get the operating system into server memory.  That's where PXE boot comes in.  Preboot eXecution Environment (PXE) is a network OS boot method that works well for small operating systems, typically terminal clients or Linux images.  It allows a single instance of the operating system to be stored on a PXE server attached to the network, and a diskless server to retrieve that OS at boot time.  Because some virtualization operating systems (hypervisors) are lightweight, they are great candidates for PXE boot.  This allows the architecture below.

[Image: PXE/NFS 100% Virtualized Environment]
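
As a rough conceptual sketch of that flow (the addresses, paths, and function below are placeholders, not a real PXE or hypervisor API):

    # Conceptual sketch of the PXE + NFS architecture above; names and
    # addresses are placeholders only.

    def pxe_boot_diskless_host(mac_address):
        # 1. The NIC broadcasts a DHCP request at power-on and receives the
        #    PXE server ("next-server") and boot file name in the offer.
        offer = {"ip": "10.0.0.50",
                 "next_server": "10.0.0.10",
                 "filename": "hypervisor/pxeboot"}

        # 2. The host pulls the boot image over TFTP into memory; one golden
        #    image on the PXE server can serve every diskless host.
        boot_image = f"tftp://{offer['next_server']}/{offer['filename']}"

        # 3. Once the hypervisor is up, virtual machine disks live on an NFS
        #    export, so no local or block-attached boot disk is needed.
        datastore = "nfs://10.0.0.20/vol/vm_datastore"

        return {"mac": mac_address, "boot_image": boot_image,
                "vm_datastore": datastore}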

Summary:

While there are several options for data center storage, none of them solves every need.  Current options increase in complexity and management overhead as the scale of the implementation increases.  Looking to the future we need to be looking for better ways to handle storage.  Maybe block based storage has run its course, maybe SCSI has run its course; either way we need more scalable storage solutions available to the enterprise in order to meet the growing needs of the data center while maintaining manageability and flexibility.  New deployments should take all current options into account and never write off the advantages of using more than one, or all of them, where they fit.

Storage Protocols

Storage is a major consideration for cloud initiatives: what type of disk, which vendor, and just as importantly, which protocol?  Experts will tout one over another based on cost, performance, throughput, etc.  Let's take a look at the major storage protocols at play in the data center:

Small Computer System Interface (SCSI):

SCSI is the dominant block level access method for disk in the data center.  Blocks are typically the smallest unit that can be read from or written to on a disk, and they exist in various sizes depending on disk type and usage.  Block level access means that the server can directly access the disk blocks without a file system in place on top of them; this is the opposite of the file-based storage discussed later.
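
To make the distinction concrete, here is a minimal Python illustration, assuming a Linux host run as root with a raw device at /dev/sdb and a mounted file share (both paths are examples only).  Block access reads raw device offsets directly, while file access goes through a file system:

    # Block vs. file access from a Linux host (run as root);
    # /dev/sdb and the mount point are examples only.
    import os

    # Block access: read a raw 4 KB block straight off the device,
    # no file system involved.
    fd = os.open("/dev/sdb", os.O_RDONLY)
    first_block = os.pread(fd, 4096, 0)   # offset 0 = first block on the disk
    os.close(fd)

    # File access: the file system (NTFS, XFS, etc.) sits between you
    # and the underlying blocks.
    with open("/mnt/share/report.txt", "rb") as f:
        data = f.read()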

SCSI has been in use since the early 1980s and was originally used to move data within a single server.  The operating system writes data using the SCSI protocol to a SCSI drive controller, which manages one or more devices on a SCSI cable within a system chassis.  The SCSI controller ensures that only one device is active on the cable at any time, which prevents contention on the SCSI bus.  Because SCSI was managed by a single controller and contained within a system, the chance of data loss or contention was minimal; this meant that SCSI did not require control mechanisms to handle data loss or contention as networked protocols do.  SCSI itself is still widely used in its native format, but it has also been encapsulated into other protocols for use within storage networks for consolidated storage.

Fibre Channel (FC):

Fibre Channel was designed to extend the functionality of SCSI into point-to-point, loop, and switched topologies.  This allows for longer distances as well as storage consolidation.  FC encapsulates SCSI data and Command Descriptor Blocks (CDBs) into the payload of Fibre Channel frames.  Fibre Channel networks provide the addressing, routing, and flow control required to support SCSI data.  Additionally, Fibre Channel networks are designed to meet the needs of SCSI by providing lossless, in-order delivery.  This means that in a stable network FC frames will not be dropped and are delivered in order, ensuring that the Upper Layer Protocols (ULPs) will not be forced to reorder or resend frames.

Fibre Channel networks are typically carried over fiber-optic links on dedicated infrastructures.  These infrastructures are traditionally built in pairs as exact mirrors of one another.  This provides complete physical redundancy end-to-end.  Additionally, these networks provide high bandwidth and low latency.  FC networks come in 1/2/4/8 Gbps speeds with 16/32 Gbps in the works.  Additionally, 10 Gbps FC links are typically available on a proprietary basis for links between switches.

Internet Small Computer System Interface (iSCSI):

iSCSI takes SCSI data and CDBs and places them in the payload of TCP/IP packets.  This allows the SCSI protocol to be extended across existing IP infrastructures.  While IP is routable within the data center and across the WAN, iSCSI is not traditionally used or supported across routed boundaries (exceptions do exist).  The draw of iSCSI has been that storage data can be extended across the existing infrastructure with minimal additional cost.

iSCSI has not gained the market share many have predicted over the years due to flaws in the protocol and limitations of traditional Ethernet-based data center networks.  Until the standardization of 10 Gigabit Ethernet, most data centers relied on 1GE links which were typically already saturated.  This meant implementing iSCSI required new switching infrastructure.  10GE has changed the bandwidth limits but still not catapulted iSCSI into the mainstream.  There are several reasons for this: one is the large existing investment in Fibre Channel, and another is the iSCSI protocol itself.

The problem with iSCSI from a protocol standpoint is that it takes the SCSI protocol, which expects lossless, in-order delivery, and places it in TCP/IP packets, which are designed to support heterogeneous WAN networks and frequently experience packet loss and out-of-order delivery.  This is done without providing any additional tools to either SCSI or TCP/IP for handling the SCSI payloads in the expected fashion.  This in no way means iSCSI is unusable or should be written off; it just means that additional considerations must be made when designing iSCSI, especially in enterprise or larger environments.

In order to provide proper performance for iSCSI on shared networks, Quality of Service (QoS), physical architecture, and jumbo frame support must be taken into account.  Because of these considerations, many iSCSI networks have traditionally been placed on separate network hardware from the data center LAN (isolated iSCSI networks).  This has minimized some of the benefits of consolidating on a single protocol.  With 10 Gigabit Ethernet and the standardization of Data Center Bridging (DCB), iSCSI looks more promising for a greater audience.  For more information on DCB see my previous post (http://www.definethecloud.net/?p=31).
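
As a rough illustration of why jumbo frames matter, the per-frame arithmetic below shows how much of the wire actually carries SCSI payload at standard versus jumbo MTUs.  It ignores iSCSI PDU headers (which are amortized across frames), TCP options, and retransmissions:

    # Rough per-frame efficiency of SCSI payload carried in TCP/IP over
    # Ethernet; ignores iSCSI PDU headers, TCP options, and retransmissions.

    def payload_efficiency(mtu):
        ip_tcp_overhead = 20 + 20            # basic IPv4 + TCP headers
        on_wire = mtu + 14 + 4 + 8 + 12      # Ethernet header, FCS, preamble, inter-frame gap
        return (mtu - ip_tcp_overhead) / on_wire

    print(f"1500-byte MTU: {payload_efficiency(1500):.1%} of the wire carries data")
    print(f"9000-byte MTU: {payload_efficiency(9000):.1%} of the wire carries data")

Fewer, larger frames also mean less per-packet processing on the host and the switch, which is part of why jumbo frame support is a standard recommendation for iSCSI.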

Fibre Channel over Ethernet (FCoE):

FCoE was ratified in 2009 and provides the functionality for moving native Fibre Channel across consolidated Ethernet networks.  FCoE relies on the DCB standards referenced above.  FCoE encapsulates full Fibre Channel frames inside Ethernet jumbo frame payloads.  Utilizing jumbo frames ensures that the FC frame is not fragmented or changed in any way.  The FCoE and DCB standards provide a robust tool set for consolidating existing Fibre Channel workloads on shared 10GE networks while providing the lossless, in-order delivery SCSI expects.  FCoE does not modify the existing Fibre Channel protocol suite and allows for the same management model, including zoning, LUN masking, etc.  FCoE has started gaining ground over the last two years, pushed by several large hardware vendors in the storage, network, and server markets.  For more information on FCoE see my post (http://www.definethecloud.net/?p=80).
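
A simplified sketch of the encapsulation idea is below.  The MAC addresses are placeholders and the FCoE-specific SOF/EOF delimiters and padding are omitted for brevity; the point is that the FC frame itself rides untouched inside an Ethernet payload:

    # Simplified illustration of FCoE encapsulation: an unmodified Fibre
    # Channel frame is carried as the payload of an Ethernet frame using
    # EtherType 0x8906.  SOF/EOF delimiters and padding are omitted.
    import struct

    def fcoe_encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
        ETHERTYPE_FCOE = 0x8906
        eth_header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_FCOE)
        return eth_header + fc_frame      # FC frame is carried untouched

    # A full-size FC frame (up to 2148 bytes) plus headers needs a jumbo
    # capable Ethernet link, which is why FCoE requires DCB switches.
    fc_frame = bytes(2148)                # placeholder for a full-size FC frame
    frame = fcoe_encapsulate(b"\x0e\xfc\x00\x00\x00\x01",   # placeholder MACs
                             b"\x00\x25\xb5\x00\x00\x0f",
                             fc_frame)
    print(len(frame), "bytes on the wire (before FCS)")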

Common Internet File System (CIFS):

CIFS is a file based storage protocol based on Server Message Block (SMB).  It is a shared storage protocol typically used in Microsoft environments for file sharing.  Windows-based file shares rely on CIFS as the transfer protocol for file level data.  File based storage relies on an underlying file system such as FAT32, XFS, NTFS, or otherwise, which differs from block based storage, which does not.  File level storage is an excellent medium for some applications but is not traditionally effective for others.  When an application needs direct block access to disk, file based storage is not appropriate.  Deployments that fall into this category include some databases and most operating systems.

Network File System (NFS):

NFS is another file based storage protocol.  NFS is traditionally used in Linux and Unix environments.  NFS is also a widely used protocol for VMware environments and can offer several benefits for virtual machine storage.  As a file based storage protocol NFS experiences many of the same limitations as stated for CIFS above.

Hyper Text Transfer Protocol (HTTP) and others:

When the cloud discussion leaves the data center (private/internal cloud) and moves up to the service provider level, such as Google, Amazon, or the telcos, the protocols listed above may not have the necessary scalability.  When you begin talking about supporting thousands of customers with multiple terabytes each, traditional storage protocols may not suffice.  It has to do with both the scalability of the systems and the administration of the disk.  iSCSI and FC both require a fair amount of management for the RAID, volumes, and LUNs, whereas CIFS and NFS require a fair amount for security and volumes.  Protocols such as HTTP-based storage are being used to simplify storage configuration and increase its scalability.
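
As a hedged sketch of why HTTP-style object storage scales administratively (the endpoint, bucket, and auth header below are hypothetical, not any specific provider's API), storing and retrieving data is reduced to simple PUT and GET operations, with no RAID groups, LUNs, or zoning to configure:

    # Sketch of HTTP-style object storage access; the endpoint and auth
    # header are hypothetical placeholders.
    import requests

    ENDPOINT = "https://objects.example.com/bucket-1234"
    HEADERS  = {"Authorization": "Bearer <token>"}

    # Store an object by key
    with open("backup.tar.gz", "rb") as f:
        requests.put(f"{ENDPOINT}/backups/backup.tar.gz", data=f, headers=HEADERS)

    # Retrieve it later from anywhere with HTTP reachability
    resp = requests.get(f"{ENDPOINT}/backups/backup.tar.gz", headers=HEADERS)
    resp.raise_for_status()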

Which is the right protocol to use when moving to the cloud?  Obviously there is only one answer!  As always in IT: 'it depends.'  Each protocol has its uses, benefits, and drawbacks.  The most important thing to remember is that most environments can benefit from more than one, or all, of these protocols.  Every application is different and any given protocol may have advantages for a particular app.  The only universal truth in cloud storage is that protocol flexibility will be key.
