I recently ran into some internal buzz about Oracle’s 72 port ‘top-of-rack’ switch announcement and it piqued my interest, so I started taking a look. Oracle selling a switch is definitely interesting on the surface, but then again they did just purchase Sun for a bargain-basement price, and Sun does make hardware, pretty good hardware at that. Here is a quick breakdown of the switch:
| Port Count | 72x 10GE or 16x 40GE |
| Oversubscription | None; fully non-blocking |
Two three-letter words came to mind when I saw this: wow, and why. Wow is definitely in order, I mean wow! Packing 72 non-blocking 10GE ports into a 1RU switch chassis is impressive, very impressive. I’m dying to get a look at the hardware. Now for the why:
Why does Oracle think they can call a 72 port switch a top-of-rack switch? A 1RU form factor doth not a ToR make. Do you have 72 10GE ports in a rack in your data center? This switch is really a middle-of-row or end-of-row switch. Once you move it into that position you’ve got some cabling to think about: $1,000.00 or so times 2 per link for optics, another couple hundred for that nice long cable, times 72 links, plus the cost of running and maintaining those cables… think ‘Holy shit Batman, my $79,200 ToR switch just became a $200,000+ EoR switch with a different management model from the rest of my shop.’
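To put rough numbers behind that sticker shock, here’s the back-of-envelope math. The per-unit prices are my ballpark assumptions for illustration, not vendor quotes:

```python
# Back-of-envelope EoR cost estimate for a fully cabled 72-port switch.
# All prices below are rough assumptions, not quotes.
SWITCH_PRICE = 79_200   # roughly $1,100/port x 72 ports
OPTIC_PRICE = 1_000     # per optic; each link needs 2 (one per end)
CABLE_PRICE = 200       # one long fiber run per link
LINKS = 72

optics_cost = LINKS * 2 * OPTIC_PRICE
cable_cost = LINKS * CABLE_PRICE
total = SWITCH_PRICE + optics_cost + cable_cost

print(f"optics: ${optics_cost:,}")  # optics: $144,000
print(f"cables: ${cable_cost:,}")   # cables: $14,400
print(f"total:  ${total:,}")        # total:  $237,600
```

Even with generous haggling on optics, the cabling plant dominates the switch itself once it moves to end-of-row.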
Why does Oracle think there is a need for full non-blocking bandwidth on every access layer port? Is anyone seriously driving sustained 10GE on multiple devices at once, anyone? You’ve got two options in switching, and only one actually makes sense. You either reduce cost and implement oversubscription in hardware, or you pay for full-rate hardware that is still oversubscribed in your network designs because you aren’t using 1:1 server-to-inter-switch links. Before deciding how much you really need line-rate bandwidth, do yourself a favor and take a look at your I/O profile across a few servers for a week or two. If you’re like the majority of data centers you’ll find that you’ll be quite fine with 8:1 or even higher oversubscription with 10GE at the access layer.
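If you want to sanity-check your own ratio, it’s just access-side bandwidth divided by uplink bandwidth. A quick sketch, with hypothetical rack numbers:

```python
def oversubscription(access_ports: int, access_gbps: int,
                     uplink_ports: int, uplink_gbps: int) -> float:
    """Ratio of total access-side bandwidth to total uplink bandwidth."""
    return (access_ports * access_gbps) / (uplink_ports * uplink_gbps)

# Hypothetical rack: 40 servers at 10GE, 5x 10GE uplinks -> 8:1
print(oversubscription(40, 10, 5, 10))  # 8.0
```

Run that against your real port counts before paying a premium for non-blocking hardware you’ll never saturate.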
Why would I want to buy a 10GE switch today that has no support for DCB or FCoE? Whether you like it or not, FCoE is here; both Cisco and HP are backing it strongly with products shipping and more on the way. Emulex and Qlogic are both in their second generation of Converged Network Adapters (CNAs); see my take on Emulex’s, known as the OneConnect adapter (http://www.definethecloud.net/?p=382). The standards are all ratified, and even TRILL is soon to be ratified to provide that beautiful Spanning-Tree-free utopian network you’ve dreamed of since childhood. If I’m an all-NFS or iSCSI shop maybe this doesn’t bother me, but if I’m running Fibre Channel there is no way I’m locking myself into 10GE at the access layer without IEEE-standard DCB and FCoE capabilities in the hardware.
What it really comes down to is that this switch is meaningless in the average enterprise data center. Where this switch fits and has purpose is in specialized multi-rack appliances and clusters. If you buy a multi-rack system or cluster from Oracle, this will be one option for connectivity. With any luck they won’t force you into this switch, because there are better options.
Thanks to my colleague for helping me out with some of this info.
Kudos: I do want to give Oracle kudos on the QSFP, which is the heart of how they were able to put 72 10GE ports in a 1RU design. The QSFP is a 40GE port that can optionally be split into 4 individual 10GE links. It’s definitely a very cool concept and will hopefully see greater industry adoption.
How to build the 10GE network of your dreams:
One of the things I love about the Oracle 10GE switch is that it highlights exactly what Cisco is working to fix in data center networking with the Nexus 5000 and 2000.
Note: Full disclosure and all that jazz, I work for a Cisco reseller and as part of my role I work closely with Cisco Nexus products. That being said I chose the role I’m in (and the role chose me) because I’m a big fan and endorser of those products not the other way around. To put it simply, I love the Nexus product line because I love the Nexus product line, I just so happen to be lucky enough to have a job doing what I love.
So now, stepping off my soapbox and out of disclosure mode, let’s get to the ‘what the hell is Joe talking about’ portion of this post.
In the diagram above I’m showing two Nexus 5020s in green at the top and 10 pairs of Nexus 2232s connected to them. What this creates is a redundant 320 port 10GE fabric with 2 points of management, because the Nexus 2000 is just a remote line card of the Nexus 5000. All of this comes with two other great features: latency under 5us and FCoE support. Additionally, this puts a 2K at the top of each rack, allowing ToR cabling while keeping all management and administration at the 5K in the middle-of-row. Because the system also supports Twinax cabling, there is a cost savings of thousands of dollars per rack over fiber cabling to ToR or EoR. There is not another solution on the market that comes close to this today. All of this at a 4:1 oversubscription rate at the access layer. If you’re willing to oversubscribe a little more, you could actually add 2 more redundant Nexus 2000s for another 64 ports, capping at 384 ports.
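The port and oversubscription math behind that design works out as follows. The 32 host ports and 8 fabric uplinks per 2232 are the shipping configuration; the rest is just arithmetic:

```python
# Nexus 5020 + 2232 fabric math.
HOST_PORTS_PER_FEX = 32  # 10GE host-facing ports on a Nexus 2232
UPLINKS_PER_FEX = 8      # 10GE fabric uplinks on a Nexus 2232

fex_pairs = 10
redundant_ports = fex_pairs * HOST_PORTS_PER_FEX  # 320 redundant 10GE ports

# Per-FEX oversubscription with all 8 uplinks in use: 320G host / 80G fabric
oversub = (HOST_PORTS_PER_FEX * 10) / (UPLINKS_PER_FEX * 10)
print(redundant_ports, f"{oversub:.0f}:1")  # 320 4:1

# Trade a little oversubscription for 2 more redundant 2232 pairs:
print((fex_pairs + 2) * HOST_PORTS_PER_FEX)  # 384
```

The key point is that the oversubscription lives between the FEX and its parent 5020, so you can tune it per rack just by adding or removing fabric uplinks.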
This entire solution comes in at or below the price of 2 of Oracle’s switches before considering the cost savings on cabling.
I don’t believe Oracle’s 72 port switch has a market in the average data center. It will have specialized use cases, and it is quite an interesting play. The best thing it has to offer is the QSFP, which hopefully will gain some buzz and vendor support thanks to Oracle.