What Network Virtualization Isn’t

Brad Hedlund recently posted an excellent blog on Network Virtualization.  Network Virtualization is the label used by Brad’s employer VMware/Nicira for their implementation of SDN.  Brad’s article does a great job of outlining the need for changes in networking in order to support current and evolving application deployment models.  He also correctly points out that networking has lagged behind the rest of the data center as technical and operational advancements have been made. 

Network configuration today is laughably archaic when compared to storage, compute and even facilities.  It is still the domain of CLI wizards hacking away on keyboards to configure individual devices.  VMware brought advancements like resource utilization based automatic workload migration to the compute environment.  In order to support this behavior on the network an admin must ensure the appropriate configuration is manually defined on each port that workload may access and every port connecting the two.  This is time consuming, costly and error prone. Brad is right, this is broken.

Brad also correctly points out that network speeds, feeds and packet delivery are adequately evolving and that the friction lies in configuration, policy and service delivery.  These essential network components are still far too static to keep pace with application deployments.  The network needs to evolve, and rapidly, in order to catch up with the rest of the data center.

Brad and I do not differ on the problem(s), or better stated: we do not differ on the drivers for change.  We do however differ on the solution.  Let me preface this by noting that Brad and I both work for HW/SW vendors with differing solutions to the problem and differing visions of the future.  Feel free to write the rest of this off as mindless drivel or vendor Kool-Aid, I ain’t gonna hate you for it.

Brad makes the case that Network Virtualization is equivalent to server virtualization, and from this simple assumption he poses it as the solution to current network problems.

Let’s start at the top: don’t be fooled by emphatic statements such as Brad’s that network virtualization is analogous to server virtualization.  It is not; this is an apples and oranges discussion, with network virtualization being the orange where you must peel the rind to get to the fruit.  Don’t take my word for it: one of Brad’s colleagues, Scott Lowe, a man much smarter than I, says it best:

image

The issue is that these two concepts are implemented in a very different fashion.  Where server virtualization provides full visibility and partitioning of the underlying hardware, network virtualization simply provides a packet encapsulation technique for frames on the wire.  The diagram below better illustrates our two fruits: apples and oranges.

image

As the diagram illustrates we are not working with equivalent approaches.  Network virtualization would require partitioning of switch CPU, TCAM, ASIC forwarding, bandwidth etc. to be a true apples-to-apples comparison.  Instead it provides a simple wrapper to encapsulate traffic on the underlying Layer 3 infrastructure.  These are two very different virtualization approaches.
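To make the "simple wrapper" point concrete, here is a minimal Python sketch of overlay encapsulation.  It assumes a VXLAN-style 8-byte header (a flags byte, a 24-bit network ID and reserved bits); the function names are hypothetical and the outer Ethernet/IP/UDP headers are left out for brevity.  Notice that nothing in it partitions switch CPU, TCAM or bandwidth: the inner frame is simply carried as payload.

```python
import struct
from typing import Tuple

VXLAN_FLAG_ID_VALID = 0x08  # "I" flag: the network ID field is valid


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend a VXLAN-style 8-byte header to an untouched inner frame."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("network ID must fit in 24 bits")
    # Word 1: flags in the top byte, followed by 24 reserved bits.
    # Word 2: 24-bit network ID in the top three bytes, 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_ID_VALID << 24, vni << 8)
    return header + inner_frame


def decapsulate(packet: bytes) -> Tuple[int, bytes]:
    """Strip the header and return (network ID, original inner frame)."""
    _flags_word, vni_word = struct.unpack("!II", packet[:8])
    return vni_word >> 8, packet[8:]


# Stand-in Ethernet frame: two zeroed MACs, an IPv4 EtherType, a payload.
frame = b"\x00" * 12 + b"\x08\x00" + b"payload"
wrapped = encapsulate(frame, vni=5000)
vni, unwrapped = decapsulate(wrapped)
print(vni, unwrapped == frame, len(wrapped) - len(frame))
```

The frame comes out of the tunnel exactly as it went in; the only thing the wrapper adds is a label saying which virtual network it belongs to.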

Brad makes his next giant leap in the “What is the Network” section.  Here he makes the assumption that the network consists of only virtual workloads (“The ‘network’ we want to virtualize is the complete L2-L7 services viewed by the virtual machines”) and the rest of his blog focuses there.  This is fine for those data center environments that are 100% virtualized including servers, services and WAN connectivity and use server virtualization for all of those purposes.  Those environments must also lack PaaS and SaaS systems that aren’t built on virtual servers, as those are also non-applicable to the remaining discussion.  So anyone in the environments described will benefit from the discussion, anyone else <crickets>.

So Brad, and presumably VMware/Nicira (since network virtualization is their term), define the goal as taking “all of the network services, features, and configuration necessary to provision the application’s virtual network (VLANs, VRFs, Firewall rules, Load Balancer pools & VIPs, IPAM, Routing, isolation, multi-tenancy, etc.) – take all of those features, decouple it from the physical network, and move it into a virtualization software layer for the express purpose of automation.”  So if you’re looking to build 100% virtualized server environments with no plans to advance up the stack into PaaS, etc. it seems you have found your Huckleberry.

What we really need is not a virtualized network overlay running on top of an L3 infrastructure with no communication or correlation between the two.  What we really need is something another guy much smarter than me (Greg Ferro) described:

image

Abstraction, independence and isolation, that’s the key to moving the network forward.  This is not provided by network virtualization.  Network virtualization is a coat of paint on the existing building.  Furthermore, that coat of paint is applied without stripping, priming, or removing that floral wallpaper your grandmother loved.  The diagram below is how I think of it.

Network Virtualization

With a network virtualization solution you’re placing your applications on a house of cards built on a non-isolated infrastructure of legacy design and thinking.  Without modifying the underlying infrastructure, network virtualization solutions are only as good as the original foundation.  Of course you could replace the data center network with a non-blocking fabric and apply QoS consistently across that underlying fabric (most likely manually) as Brad Hedlund suggests below.

image

If this is the route you take, to rebuild the foundation before applying network virtualization paint, is network virtualization still the color you want?  If a refresh and reconfigure is required anyway, is this the best method for doing so? 

The network has become complex and unmanageable due to things far older than VMware and server virtualization.  We’ve clung to device centric CLI configuration and the realm of keyboard wizards.  Furthermore we’ve bastardized originally abstracted network constructs such as VLAN, VRF, addressing, routing, and security, tying them together and creating a Frankenstein of a data center network.  Are we surprised the villagers are coming with torches and pitchforks?

So overall I agree with Brad: the network needs to be fixed.  We just differ on the solution; I’d like to see more than a coat of paint.  Put lipstick on a pig and all you get is a pretty pig.

lipstick pig


CloudStack Graduates to Top-Level Apache Project

The Apache Software Foundation announced in late March that CloudStack is now a top-level project. This is a promotion from CloudStack’s incubator status, where it had lived after being released as open source by Citrix.

This promotion provides additional encouragement to companies and developers looking to contribute to the project, because it validates the CloudStack community and demonstrates ongoing support under the Apache Software Foundation. To read more visit the full article.


OpenStack Video Cage Match With Colin McNamara

This post is a little late, mainly because I’m both lazy and distracted.  That being said I hope you’ll enjoy this video of Colin McNamara (@colinmcnamara) and me debating the merits of OpenStack.  For more Engineers Unplugged goodness from Amy Lewis (@commsninja) visit: http://blogs.cisco.com/datacenter/.


The App on the Crap (An SDN Story)

I’m feeling Seussish again and looking to tackle SDN this time.  If you missed my first go it was on Hadoop: Horton Hears Hadoop.  Here’s another run:

 

The app could not flow

Net was too slow to change.

It sat on the server

Waiting on admin for change.

 

It sat there quite idly

Customers did too

The dev thought, “How I wish

They’d let my app through!”

 

Too slow to adapt

Too rigid and strict.

The business can’t move.

And that’s my verdict.

 

So all they could do was to

Sit!

   Sit!

      Sit!

         Sit!

The dev did not like it.

Not one little bit.

 

And then

Someone spoke UP!

How that speech gave us PUMP!

 

We listened!

And we heard it move into the hype!

We listened!

A network of SDN type!

The message quite clear,

“You’ve got no need to gripe.”

 

“I know it is slow

and the network is messy.

There is a fix

With software that’s dressy!”

 

“I know some good tricks we can use,”

SDN gal said.

“A header or two,”

Said the gal with the plan.

“Controllers as well.

I will show them to you.

Your CTO

Will not mind if I do.”

 

Then app and dev

Did not know what to say.

The CTO was out playing golf

For the day.

 

But the net admin said, “No!

Make that gal go away!

Tell the SDN gal

You do NOT want to play.

She should not be here.

She should not be about.

She should not be here

When the CTO is out!”

 

“Now! Now! Have no fear.

Have no fear!” Said the gal.

“My tricks are not bad,”

Said the SDN gal.

“Why you’ll have

So many options from me,

With some tricks that I call

Virtualization you see!”

 

“Stop this nonsense!” admin said.

“We don’t need to scale!

Stop this nonsense!” Admin said.

“The net cannot fail!”

 

“Have no fear!” said the gal.

“I will not let net fail.

I will make it dynamic

And people will hail.

Its changes are quick!

It grows very fast!

But there is much more it can do!”

 

“Look at it!

Look at it now,” said the gal.

“With a new overlay

And control from a pal!

It can adapt very fast!

It’s managed quite nicely!

The scale is much greater!

And admin less dicey!

And look!

You can change flows from here!

But there is more dear!

Oh, no.

There is more dear…

 

“Look at it!

Look at it!

Look at it now!

It’s better you see

But you have to know how.

How it can adapt

And respond to new apps!

How it grows to scale!

And helps those dev chaps!

Can grow past those VLANs

And direct traffic, see!

We wrap Layer two

In Layer three IP!

And we route the IP!

As we grow big from small!

But that is not all.

Oh, no.

That is not all….”

 

That’s what the gal said…

Then the net went dead!

The apps all went down

From out at the NOC.

The developers,

Watched with eyes open in shock!

 

And the admin cried out.

With a loud angry shot!

He said, “Do I like this?

Oh no! I do not.

This is not a good trick,”

Said the admin with grit.

“No, I don’t like it,

Not one little bit!”

 

“Now look what you did!”

Said admin to gal.

“Now look at this net!

Look at this mess now pal!

You brought down the apps,

Crashed services too

You cost us some sales

And caused lost revenue.

You SHOULD NOT be here

When the CTO’s not.

Get out of the data center!”

Admin said from his spot.

 

“But I like to be here.

Oh, I like it a lot”

Said the SDN girl

To the admin she shot.

“I will not go away.

I do not wish to go!

And so,” said the SDN girl,

“So

    So

       So…

I will show you

Another good trick that I know!”

 

And then she ran out.

And, then fast as a fox,

The SDN gal

Came back with a box.

 

A big green wood box.

It was shut with a hook.

“Now look at this trick,”

Said the gal.

“Take a look!”

 

Then she got up on top

And with no rationale.

“I call this game SDN-IN-A-BOX,”

Said the gal.

“In this box are four things

I will show to you now.

You will like these four things.”

Said the gal with a bow.

 

“I will pick up the hook.

You will see something new.

Four things. And I call them

The SDN glue.

These things will not harm you.

They want to move frames.”

Then, out of the box

Came her SDN claims!

And they came out quite fast.

They said, “Are you ready?

Now should we get started

Let’s get going already!”

 

The devs and the apps

Did not know what to do.

So they sat and they watched

Watched the SDN glue.

They stood in their shock

But the admin said “No!

Those things should not be

On this net! Make them go!”

 

“They should not be here

When the CTO’s not!

Put them out! Put them out!”

Admin yelled with a shot.

 

“Have no fear, Mr. admin,”

Said the SDN gal.

“These things are good things

And good for morale.”

“They’re great.  Oh so great!

They have come to fix things.

They will give back control

To the network today.”

 

“The first is an overlay,

Number two a vSwitch

But that’s only halfway.”

Was the gal’s latest pitch.

 

“We’ll next need control

For the flows as they go.

Something to manage

Those flows as they flow.

But there’s still one more piece

Of this SDN madness.

Device management system

To avoid admin sadness.”

 

Then the SDN gal

Said with conviction

“We aren’t quite done yet

There’s one more restriction.

We must tie these together

In a cohesive fashion,

If we do not

It’s all stormy weather.

We will organize things

With apps at the center

And let those developers

For once spread their wings.”

 

“You see in the past,”

Said the SDN gal.

“The net was restrictive

The apps were in hell.

Now we change things around

Put the apps back in focus.

Using these tricks,

And some good hocus pocus.

With a sprinkle of tears

From the unicorn clan,

And a dash of fine dust

A pixie put in this can.

We’ll accomplish the task.”

SDN gal said as she drank from her flask.

 

And lo and behold,

The network sprang back.

The packets were flowing,

TCP sent its ACK.

The admin stood shocked,

As he used the controller.

With this type of thing,

He would be the high roller!

He gaped in amazement

At the tenancy scale.

No longer 4000,

It was net holy grail.

 

The apps back online,

As CTO entered.

A disaster avoided, he was left with no sign.

Of the mess that had happened,

While he was out and about.

But the faint sound of snoring

SDN girl drunk and passed out.


Network Overlays: An Introduction

While network overlays are not a new concept, they have come back into the limelight, thanks to drivers brought on by large-scale virtualization. Several standards have been proposed to enable virtual networks to be layered over a physical network infrastructure: VXLAN, NVGRE, and STT. While each proposed standard uses different encapsulation techniques to solve current network limitations, they share some similarities. Let’s look at how network overlays work in general…

To see the full article visit: http://www.networkcomputing.com/next-gen-network-tech-center/network-overlays-an-introduction/240144228


Why We Need Network Abstraction

The move to highly virtualized data centers and cloud models is straining the network. While traditional data center networks were not designed to support the dynamic nature of today’s workloads, the fact is, the emergence of highly virtualized environments is merely exposing issues that have always existed within network constructs. VLANs, VRFs, subnets, routing, security, etc. have been stretched well beyond their original intent. The way these constructs are currently used limits scale, application expansion, contraction and mobility.  To read the full article visit: http://www.networkcomputing.com/next-gen-network-tech-center/why-we-need-network-abstraction/240142588


Data Center Overlays 101

I’ve been playing around with Show Me (www.showme.com) as a tool to add some white boarding to the blog.  Here’s my first crack at it covering Data Center Network overlays.


NVGRE

The most viable competitor to VXLAN is NVGRE, which was proposed by Microsoft, Intel, HP and Dell.  It is another encapsulation technique intended to allow virtual network overlays across the physical network.  Both techniques also remove the scalability issues with VLANs, which are capped at 4096 IDs.  NVGRE uses Generic Routing Encapsulation (GRE) as the encapsulation method.  It uses the lower 24 bits of the GRE key field to represent the Tenant Network Identifier (TNI).  Like VXLAN, this 24 bit space allows for 16 million virtual networks.
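The numbers above are easy to check.  The sketch below compares the 12-bit VLAN ID space with a 24-bit tenant ID space and shows one way a 24-bit ID could sit in the lower bits of GRE's 32-bit key.  The helper names are made up for illustration, and leaving the upper 8 bits of the key at zero is an assumption, not something taken from the draft.

```python
VLAN_ID_BITS = 12
TENANT_ID_BITS = 24

vlan_ids = 2 ** VLAN_ID_BITS      # size of the VLAN ID space
tenant_ids = 2 ** TENANT_ID_BITS  # size of the 24-bit overlay ID space

TENANT_MASK = tenant_ids - 1      # 0xFFFFFF, selects the lower 24 bits


def tni_to_gre_key(tni: int) -> int:
    """Place a 24-bit tenant ID in the lower 24 bits of a 32-bit key.

    The upper 8 bits are left as zero here (an assumption).
    """
    if not 0 <= tni < tenant_ids:
        raise ValueError("tenant ID must fit in 24 bits")
    return tni & TENANT_MASK


def gre_key_to_tni(key: int) -> int:
    """Recover the tenant ID by masking off the upper 8 bits of the key."""
    return key & TENANT_MASK


print(vlan_ids, tenant_ids, tenant_ids // vlan_ids)
print(gre_key_to_tni(tni_to_gre_key(123456)))
```

Twelve extra bits of ID means every one of the 4096 VLANs could be replaced by 4096 virtual networks, which is where the "16 million" figure comes from.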

image

While NVGRE provides optional support for broadcast via IP multicast, it does not rely on it for address learning as VXLAN does.  It instead leaves that up to an as-yet undefined control plane protocol.  This control plane protocol will handle the mappings between the “provider” address used in the outer header to designate the remote NVGRE end-point and the “customer” address of the destination.  The lack of reliance on flood-and-learn behavior replicated over IP multicast potentially makes NVGRE a more scalable solution.  This will be dependent on implementation and underlying hardware.

Another difference between VXLAN and NVGRE is in their multi-pathing capabilities.  In its current form NVGRE provides little ability to be properly load-balanced by ECMP.  In order to enhance load-balancing the draft suggests the use of multiple IP addresses per NVGRE host, which will allow for more flows.  This is a common issue with tunneling mechanisms and is solved in VXLAN by using a hash of the inner frame as the UDP source port.  This provides for efficient load balancing by devices capable of 5-tuple balancing decisions.  There are other possible solutions proposed for NVGRE load-balancing; we’ll have to wait and see how they pan out.
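Here is a rough sketch of the VXLAN source port trick.  The function name is hypothetical, and using CRC32 over the inner Ethernet header is purely an assumption for illustration; a real implementation picks its own hash and its own set of inner header fields.  The point is only that the same inner flow always maps to the same outer UDP source port, so an ECMP device hashing on the outer 5-tuple keeps a flow on one path while spreading different flows across links.

```python
import zlib

# Dynamic/private UDP port range used for the outer source port.
PORT_MIN, PORT_MAX = 49152, 65535


def entropy_source_port(inner_frame: bytes) -> int:
    """Derive an outer UDP source port from the inner frame's header.

    Hashes the first 14 bytes (destination MAC, source MAC, EtherType).
    CRC32 is a stand-in hash chosen for this sketch, not a requirement.
    """
    digest = zlib.crc32(inner_frame[:14])
    return PORT_MIN + digest % (PORT_MAX - PORT_MIN + 1)


# Two stand-in inner frames that differ only in source MAC.
flow_a = bytes.fromhex("0200000000010200000000aa0800") + b"data"
flow_b = bytes.fromhex("0200000000010200000000bb0800") + b"data"

port_a = entropy_source_port(flow_a)
port_b = entropy_source_port(flow_b)
print(port_a, port_b)
```

Without something like this, every tunneled packet between two end-points carries the same outer addresses and ports, and ECMP has nothing to balance on.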

The last major difference between the two protocols is the use of jumbo frames.  VXLAN is intended to stay within a data center where jumbo frame support is nearly ubiquitous; it therefore assumes that support is present and utilizes it.  NVGRE is intended to be usable between data centers and therefore includes provisions to avoid fragmentation.
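To see why frame size matters, here is a back-of-the-envelope calculation of how large the underlay IP MTU has to be to carry a full-size guest packet without fragmenting.  It assumes an IPv4 outer header with no options, untagged Ethernet, and a GRE header that carries only the key; the function name is made up for this sketch.

```python
INNER_ETHERNET = 14   # inner Ethernet header carried inside the tunnel
OUTER_IPV4 = 20       # outer IPv4 header, no options (assumption)
UDP_HEADER = 8
VXLAN_HEADER = 8
GRE_WITH_KEY = 8      # 4-byte base GRE header plus 4-byte key


def required_underlay_mtu(guest_ip_mtu: int, tunnel_headers: int) -> int:
    """IP MTU the physical network needs so the guest packet fits whole."""
    return guest_ip_mtu + INNER_ETHERNET + OUTER_IPV4 + tunnel_headers


vxlan_mtu = required_underlay_mtu(1500, UDP_HEADER + VXLAN_HEADER)
nvgre_mtu = required_underlay_mtu(1500, GRE_WITH_KEY)
print(vxlan_mtu, nvgre_mtu)
```

Either way a standard 1500-byte underlay is too small for a 1500-byte guest packet, which is why an overlay either leans on jumbo frames or needs some way to avoid fragmentation.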

Summary:

While NVGRE still needs much clarification it is backed by some of the biggest companies in IT and has some potential benefits.  With the VXLAN capable hardware world expanding quickly you can expect to see more support for NVGRE.  Layer 3 encapsulation techniques as a whole solve the issues of scalability inherent with bridging.  Additionally, due to their routed nature they also provide for loop free multi-pathed environments without the need for techniques such as TRILL and technologies based on it.  In order to reach the scale and performance required by tomorrow’s data centers our networks need to change, and overlays such as these are one tool towards that goal.


Stateless Transport Tunneling (STT)

STT is another tunneling protocol along the lines of the VXLAN and NVGRE proposals.  As with both of those the intent of STT is to provide a network overlay, or virtual network running on top of a physical network.  STT was proposed by Nicira and is therefore, not surprisingly, written from a software centric view, unlike the other proposals, which are written from a network centric view.  The main advantage of the STT proposal is its ability to be implemented in a software switch while still benefitting from NIC hardware acceleration.  The other advantage of STT is its use of a 64 bit network ID rather than the 24 bit IDs used by NVGRE and VXLAN.

The hardware offload STT enables relieves the server CPU of a significant workload in high bandwidth systems (10G+).  This separates it from its peers, which use an IP encapsulation in the soft switch that negates the NIC’s LSO and LRO functions.  STT goes about this by having the software switch insert header information into the packet to make it look like a TCP packet, along with the required network virtualization fields.  This allows the guest OS to send frames of up to 64k to the hypervisor, which are encapsulated and sent to the NIC for segmentation.  While this does allow the HW offload to be utilized, its use of valid TCP headers causes issues for many network appliances or “middle boxes.”
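The segmentation step is the part worth picturing.  The sketch below models, in software, the split that the NIC performs in hardware when it is handed one large frame.  It is an illustration only: the 1460-byte segment size is an assumed typical value, the function name is hypothetical, and the per-segment headers a real NIC would add are left out.

```python
from typing import List


def segment(payload: bytes, mss: int = 1460) -> List[bytes]:
    """Split one large payload into wire-sized segments.

    This is the work LSO moves from the host CPU to the NIC.
    The default segment size is an assumption for this sketch.
    """
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


big_frame = bytes(64000)          # one large frame from the guest
segments = segment(big_frame)
print(len(segments), len(segments[-1]))
```

With the offload, the hypervisor touches one big buffer; without it, the host CPU would have to build and send every one of those segments itself.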

STT is not expected to be ratified and is considered by some to have been proposed for informational purposes, rather than with the end goal of a ratified standard.  With its misuse of a valid TCP header it would be hard pressed for ratification.  STT does bring up the interesting issue of hardware offload.  The IP tunneling protocols mentioned above create extra overhead on host CPUs due to their inability to benefit from NIC acceleration techniques.  VXLAN and NVGRE are intended to be implemented in hardware to solve this problem.  Both VXLAN and NVGRE use a 24 bit network ID because they are intended to be implemented in hardware; this space provides for 16 million tenants.  Hardware implementation is coming quickly in the case of VXLAN, with vendors announcing VXLAN capable switches and NICs.


Something up Brocade’s Sleeve, and it looks Good

Brocade’s got some new tricks up their sleeve and they look good.  For far too long Brocade fought against convergence to protect its FC install base and catch up.  This bled over into their Ethernet messaging and hindered market growth and comfort levels there.  Overall they appeared as a company missing the next technology waves and clinging desperately to the remnants of a fading requirement: pure storage networks.  That has all changed: Brocade is embracing Ethernet and focusing on technology innovation that is relevant to today’s trends and business.

The Hardware:

Brocade’s VDX 8770 (http://www.brocade.com/downloads/documents/data_sheets/product_data_sheets/vdx-8770-ds.pdf) is their flagship modular switch for Brocade VCS fabrics.  While at first I scoffed at the idea of bigger chassis switches for fabrics, it turns out I was wrong (happens often).  I forgot about scale.  These fabrics will typically be built in core/edge or spine/leaf designs, often using End of Row (EoR) rather than Top of Rack (ToR) designs to reduce infrastructure.  This leaves max scalability bound by a combination of port count and switch count, dependent on several factors such as interconnect ports.  Switch count will typically be limited by fabric software limitations, either real or due to testing and certification processes.  Having high density modular fabric-capable switches helps solve scalability issues.

Some of the more interesting features:

  • Line-rate 40GE
  • “Auto-trunking” ISLs (multiple links between switches will bond automatically.)
  • Multi-pathing at layers 1, 2 and 3
  • Dynamic port-profile configuration and migration for VM mobility
  • 100GE ready
  • 4us latency with 4TB switching capacity
  • Support for 384,000 MAC addresses per fabric for massive L2 scalability
  • Support for up to 8000 ports in a VCS fabric
  • 4 and 8 slot chassis options
  • Multiple default gateways for load-balancing routing

The Software:

The real magic is Brocade’s fabric software.  Brocade looks at the fabric as the base on which to build an intelligent network, SDN or otherwise.  As such the fabric should be: resilient, scalable and easy to manage.  In several conversations with people at Brocade it was pointed out that SDN actually adds a management layer.  No matter how you slice it the SDN software overlays a physical network that must be managed.  Minimizing configuration requirements at this level simplifies the network overall.  Additionally the fabric should provide multi-pathing without link blocking for maximum network throughput. 

Brocade executes on this with VCS fabric.  VCS provides an easy to set up and manage fabric model.  Operations like adding a link for bandwidth are done with minimal configuration through tools like “auto-trunking.”  Basically ports identified as fabric ports will be built into the network topology automatically.  They also provide impressive scalability numbers with support for 384,000 MACs, 352,000 IPv4 routes, 88,000 IPv6 routes, and 8000 ports.

One surprise to me was that Brocade is doing this using custom silicon.  With companies like Arista and Nicira (now part of VMware) touting commodity hardware as the future, why is Brocade spending money on silicon?  The answer is in latency.  If you want to do something at line-rate it must be implemented in hardware.  Merchant silicon is adept at staying on the cutting edge of things like switching latency and buffering but is slow to implement new features.  This is due to addressable market.  Merchant silicon manufacturers want to ensure that the cost of hardware design and manufacturing will be recouped through bulk sale to multiple manufacturers.  This means features must have wide applicability and typically be standards driven before being implemented.

Brocade saw the ability to innovate with features while maintaining line-rate as an advantage worth the additional cost.  This allows Brocade to differentiate themselves, and their fabric, from vendors relying solely on merchant silicon.  Additionally they position their fabric as enough of an advantage to be worth the additional cost when implementing SDN, for the reasons listed above.

Summary:

Brocade is making some very smart moves and coming out from under the FC rock.  The technology is relevant and timely, but they will still have an uphill battle gaining the confidence of network teams.  They will have to rely on their FC data center heritage to build that confidence and expand their customer base.  The key now will be in execution.  It will be an exciting ride.
