10 Gigabit Ethernet technology: Building high-performance networks
Date: Apr 21, 2011
10 Gigabit Ethernet technology allows data center networks to handle storage and data convergence and the extreme traffic flow among virtual environments. But with those benefits come some challenges, including the need for new network architectures and traffic management strategies.
In this deep-dive webcast, founder and chief scientist of DeepStorage.net Howard Marks discusses upgrading to 10 Gigabit Ethernet technology in order to enable consolidation and virtualization. Marks explains how the transition to 10 GbE is different from the transition from 10 Mbps to 100 Mbps or 100 Mbps to 1 Gbps. For those considering that migration, Marks explains choices in basic cabling for 10 GbE networks, as well as the various choices in 10 GbE switches. Going deeper, Marks explains the demand that virtual servers place on the network and how 10 GbE technology and new kinds of I/O management can help address the issue. Specifically, Marks discusses various types of network interface cards (NICs) and converged network adapters (CNAs), as well as the importance of virtualized I/O in segmenting and managing multiple streams of traffic coming out of a virtualized environment.
Marks also discusses the role of converged storage and data center networks in high-performance virtualized and cloud environments. In moving toward network convergence, Marks considers whether to implement Fibre Channel over Ethernet (FCoE) or stick with iSCSI. He also talks about the role of adapters and virtualized I/O in converged data center networks. Finally, Marks addresses new types of data center network architectures, including the move away from spanning tree and the use of data center bridging and Transparent Interconnection of Lots of Links (TRILL).
About the expert: Howard Marks is the founder and chief scientist at DeepStorage.net, a networking consultancy. In over 25 years of consulting, he has designed and implemented networks, management systems and Internet strategies at organizations including American Express, JP Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide and Foxwoods Resort Casino. Marks is a frequent speaker at industry conferences and the author of more than 100 articles in technical publications. He has also written the book Networking Windows and co-authored Windows NT Unleashed.
Read the full transcript from this video below:
10 Gigabit Ethernet technology: Building high-performance networks
Kara Gattine: Hello and welcome to our advanced workshop
Gigabit Ethernet and Beyond. My name is Kara Gattine and it's great
to have you with us.
Joining me today is our workshop expert, Howard Marks, founder and chief
scientist of DeepStorage.net. With 25 years of consulting experience,
Howard has designed and implemented networks, management systems and
internet strategies at organizations including American Express, J.P.
Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide and Foxwoods
Resort Casino. Thanks so much for joining us today, Howard.
Howard Marks: It's my pleasure, Kara. Always nice to help bring people up to
speed on new technologies.
Kara Gattine: Okay. Now Howard is going to chat with us about upgrading
to 10 gigabit Ethernet to enable consolidation and virtualization. Howard,
take it away.
Howard Marks: As much as it might appear that 10 gigabit Ethernet is just
another generation of that old friend technology, Ethernet, the
transition to 10 gig is different than the transition from 10 to 100 meg
or 100 meg to 1 gig. Today, the problem in the data center is not just
that there's insufficient network bandwidth available, but
that we have different kinds of traffic and we need to keep them
organized better. Ten gig is a solution to that and we'll talk about
why 10 gig Ethernet can be the solution to your I/O problems, including
both data and storage problems as we can converge those now disparate
networks.
We'll talk about how the problem is more than speed and how new features
that are being implemented in today's 10 gig Ethernet products can
address that. We'll talk about the server end of the network, network
interface cards and converged network adaptors. And then we'll end
today talking about things you should be considering in building your
new data center network.
The basic problem today is as we implement technologies, especially
server virtualization, the I/O demands of individual physical servers
have become not only substantially greater, but substantially more
complex than they've been in the past. Today, on a typical virtual
server host, there is a network for users to access the server, a
network for the server to access its storage, and networks for management
functions and for moving virtual servers from location to location via
vMotion or live migration. And then there are probably some additional
networks for either sensitive servers or servers that are publicly
accessible in a so-called demilitarized zone.
If we implement those five networks and we use two connections for each
because, of course, things go wrong and we need some redundancy, that's
8 to 10 cables coming out of the back of each virtual server host. And
you end up with something that, at best, looks like the picture in the
lower right-hand corner of the slide, and more frequently looks
substantially less organized than that.
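As a rough illustration of the cable arithmetic above, here is a minimal Python sketch. The list of networks comes from the talk; the function name and the default of two links per network are illustrative assumptions, not anything shown on the slides.

```python
# Illustrative sketch only: the network list follows the talk, while the
# function name cables_per_host is hypothetical.
NETWORKS = ["user access", "storage", "management",
            "vMotion/live migration", "DMZ"]


def cables_per_host(networks, links_per_network=2):
    """Return how many cables one host needs if every network gets
    its own redundant set of physical links."""
    return len(networks) * links_per_network


print(cables_per_host(NETWORKS))      # five networks, two links each
print(cables_per_host(NETWORKS[:4]))  # four networks, two links each
```

With five networks this gives 10 cables, and with four it gives 8, which is the "8 to 10 cables" range mentioned above.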
Not only is this a problem when we need to do server maintenance or
implement something new, but that morass of cables interferes with
airflow. And just keeping things running becomes a problem. Even if
that was okay, we still only have 2 gigabits per second for each
function. And for user traffic, that can be a problem. Some experts
recommend that virtual server hosts have 8 gigabits per second per core
of bandwidth available for user access. With today's 16 and 24 core
servers, two 1 gig connections clearly isn't going to cut it.
Finally, providing that many connections means that we can't use 1U
servers and we have problems with blade servers, because they
simply don't have enough I/O slots for 8 or 10 connections. 10 gig
addresses these problems and 10 gig is ripe today. Frankly, 10 gig
technology has been available for 4 or 5 years but not at a cost that
the average organization could really implement.
10 gig Ethernet today has reached the magic number of around $1,000 a
port. And that's for all the implementation costs: the network
interface card, the port on a top of rack switch, the cables. It's
about what four 1 gig ports cost. It's about what 100 megabit per
second ports cost when we made the transition from 10 to 100 megabits
per second. It's about what 1 gigabit per second ports cost when we
made the transition from 100 megabits per second to 1 gigabit. And so,
we've now reached the point where 10 gig is cost effective.
We also now have technologies that allow us to virtualize our interface
cards and have a single physical 10 gig connection emulate multiple
network interface cards. So, we can segregate the management, the
vMotion, the user access and the storage traffic into their own virtual
channels even though they're all traveling across the same cable.
A typical server can now have two cables for redundancy, over which we can
converge storage traffic via iSCSI or NFS or even fibre channel storage
traffic using the new protocol, fibre channel over Ethernet. Today's 10
gig switches have very low latencies, typically somewhere
between 700 nanoseconds and 2 microseconds, which is nothing compared to
what the 1 gig switches present. And so, even applications that we
previously would have insisted run in the same machine or use high-speed
interconnects like InfiniBand, can move to 10 gig.
And just in the past months, vendors like Force10 have started
announcing that their top of rack 10 gig switches have 40 gigabit per
second uplinks to the data center core so that we won't bottleneck on
the top of rack to core connection.
Now, don't get me wrong. 10 gigabits per second, per se, isn't the
panacea. Yes, we now have 10 times as much bandwidth but we still need
isolation. Testing in the lab has shown that on today's Nehalem or
Westmere based servers, running vMotion and migrating one virtual server
from one virtual server host to another can generate as much as 8 gigabits
per second of traffic, which could flood that 10 gig connection and
make user access or storage traffic have to wait. And really, storage
traffic doesn't like to wait. Bad things happen.
We always have the problem that infected user machines could create
distributed denial of service attacks. And just from the sanity point
of view, when you install VMware or Hyper-V and you say that my
storage traffic and my user traffic and my management traffic are all
going to take this one network card, pop-ups come up and say, "That's
not really recommended. You should have separate connections."
And so, the solution to that is not to have separate physical
connections, but to have separate virtual connections. And so, the
image on this slide comes from HP, from their FlexFabric technology.
But similar technology is available from multiple vendors. With
virtual NICs, a single physical card talking to a single physical port
can emulate multiple network interface cards.
So, the hypervisor or host operating system sees drivers for multiple
network cards. One network card to take vMotion. Another network card
to take user traffic. Another one can take storage traffic. As the
data comes out the physical NIC, each of those virtual NICs is assigned
to a separate VLAN and the traffic is tagged with the VLAN information.
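To make the tagging step concrete, here is a minimal Python sketch of the standard 4-byte 802.1Q VLAN tag that carries the VLAN information. The mapping of virtual NICs to VLAN IDs is a made-up example, and the function name is hypothetical; real adapters do this in hardware or firmware.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag


def vlan_tag(vlan_id, priority=0):
    """Build the 4-byte 802.1Q tag: a 16-bit TPID followed by a 16-bit
    field holding 3 priority bits, 1 drop-eligible bit and a 12-bit VLAN ID."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1 through 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be 0 through 7")
    tci = (priority << 13) | vlan_id  # drop-eligible bit left at 0
    return struct.pack("!HH", TPID_8021Q, tci)


# Hypothetical assignment of virtual NICs to VLANs (assumed values).
VNIC_VLANS = {"vmotion": 10, "user": 20, "storage": 30}

for name, vid in VNIC_VLANS.items():
    print(name, vlan_tag(vid).hex())
```

Each virtual NIC's frames would carry its own tag, which is how a switch can tell the streams apart even though they share one cable.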
And so, now we've taken one physical connection and made it 4 or 16 or,
for some vendors, 300 virtual cards. Then we can apply bandwidth caps
and say that this virtual NIC gets 2 gigabits per second of the
available 10 gigabits per second. Or we can apply quality of service
metrics to the virtual NICs so that this network card can use up to 2
gigabits per second of bandwidth. The difference is that with bandwidth
caps, that 2 gigabits per second is allocated to that
channel whether it's using it or not. And that's the way HP and Emulex
do things today.
Cisco in their UCS solutions is a little bit more sophisticated and they
use QoS. So, if that channel isn't using the 2 gigabits per second that
you've assigned to it via QoS, that bandwidth is available for other
channels to use. This technology is available from NIC vendors in the
PCIe market. It's more mature in the blade market because, frankly, the
network interface card and the 10 gigabit switch have to cooperate to
manage the bandwidth caps or quality of service. In a blade
environment, the switch in the back of the blade center and the NIC on
each blade come from the same vendor, or from vendors that work very
closely together, so that coordination is simpler. But this technology
will be much more widely available very soon.
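The difference between fixed bandwidth caps and QoS-style sharing can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not any vendor's actual scheduler: the channel names, the numbers and both function names are hypothetical, and the sharing rule (spare capacity split in proportion to each channel's assigned share) is just one plausible policy.

```python
LINK_GBPS = 10.0


def hard_caps(demand, caps):
    """Fixed caps: a channel never exceeds its cap, and capacity it
    does not use is not lent to anyone else."""
    return {ch: min(demand[ch], caps[ch]) for ch in demand}


def work_conserving(demand, share, link=LINK_GBPS):
    """QoS-style sharing: each channel first gets up to its share, then
    spare link capacity is split among channels that still want more,
    in proportion to their shares. Shares must be positive."""
    alloc = {ch: min(demand[ch], share[ch]) for ch in demand}
    spare = link - sum(alloc.values())
    hungry = {ch for ch in demand if demand[ch] - alloc[ch] > 1e-9}
    while spare > 1e-9 and hungry:
        weight = sum(share[ch] for ch in hungry)
        granted = 0.0
        for ch in list(hungry):
            extra = min(spare * share[ch] / weight, demand[ch] - alloc[ch])
            alloc[ch] += extra
            granted += extra
            if demand[ch] - alloc[ch] <= 1e-9:
                hungry.discard(ch)
        spare -= granted
    return alloc


# Assumed example: a migration wants 8 Gbps while other channels are quiet.
shares = {"vmotion": 2, "user": 4, "storage": 4}
demand = {"vmotion": 8, "user": 1, "storage": 3}

print(hard_caps(demand, shares))
print(work_conserving(demand, shares))
```

In this example the capped migration channel stays at 2 Gbps while 4 Gbps of the link sits idle, whereas the sharing model lets it borrow the idle capacity without taking anything the other channels are actually using.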
The next issue, and the one that's driving a lot of people to 10 gig, is
converging the storage and data networks. Today, most organizations
have two separate networks. They have an Ethernet network for data and
a fibre channel network for storage. And they have two independent
teams of people that manage those networks. This is, of course,
inefficient. It means that we need to have a lot of cables. And with a
pair of 10 gigabit connections to the server for redundancy, there's
plenty of bandwidth for both. Reducing 8 or 10 cables to 2 is going to
reduce costs, both in terms of capital expenditure, because we need to
buy fewer switch ports and use less power to run those switches, and
operating expense, because we have fewer objects to manage.
So, our options for storage are, first of all, to run IP based storage
which we've always done, over Ethernet. NFS is making a comeback
because people have discovered that it's easier to manage a large number
of virtual server disks in an NFS file share, than it is to create a
large number of logical drives or LUNs in a SAN environment. Even if
you want to continue to run the SAN environment and do block as opposed
to file I/O, iSCSI, which has been very successful historically in the
mid-market, can provide performance comparable to the fibre channel
equipment you bought two or three years ago when it runs over a 10 gig
connection, especially a lossless 10 gig connection.
For larger organizations that have established fibre channel
environments and want to leverage the knowledge they've built managing
those fibre channel environments and the tools and methods that they use
for managing those fibre channel environments, the fibre channel
industry has developed a new standard protocol, fibre channel over
Ethernet, that embeds the fibre channel protocol in 10 gigabit per
second Ethernet at the physical layer.
It requires lossless data center bridging, which some vendors have
called converged enhanced Ethernet or data center Ethernet. And one of
the attributes of lossless DCB is a function called per-priority PAUSE.
And fibre channel traffic will typically have the highest priority. And
that combination of DCB and fibre channel over Ethernet means that fibre
channel traffic gets the lossless low latency network that fibre channel
applications assume is available.
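As a small illustration of how a device might single out fibre channel over Ethernet traffic for the highest priority, here is a hedged Python sketch. The EtherType values are the registered ones for IPv4 and FCoE; the priority numbers and function names are assumptions chosen for the example, and the frame is assumed to be untagged.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_FCOE = 0x8906  # registered EtherType for FCoE frames


def ethertype(frame):
    """Read the EtherType of an untagged Ethernet frame (bytes 12-13,
    right after the two 6-byte MAC addresses)."""
    return struct.unpack("!H", frame[12:14])[0]


def traffic_priority(frame, fcoe_priority=7, default_priority=0):
    """Give FCoE frames the highest priority; the numeric priorities
    here are illustrative assumptions, not values from a standard."""
    if ethertype(frame) == ETHERTYPE_FCOE:
        return fcoe_priority
    return default_priority


# Two dummy frames: zeroed MAC addresses, an EtherType, a placeholder payload.
fcoe_frame = bytes(12) + struct.pack("!H", ETHERTYPE_FCOE) + b"fc-payload"
ip_frame = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + b"ip-payload"

print(traffic_priority(fcoe_frame), traffic_priority(ip_frame))
```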
In terms of the standards process, the FCoE standards are approved and
ratified. The DCB standards, and this slide says, "Still under
discussion," but the truth is the i's are dotted. The t's are crossed.
It's at the printer waiting for the final rubber stamp of approval.
We're at the stage now where the DCB standards are finalized to the
point that vendors can implement products even though they haven't been
officially ratified. FCoE alone provides some isolation for storage
traffic but it's not really enough to manage the whole data center.
Now I've used the term "lossless." And let me be really clear. When I
say lossless, I mean lossless because of congestion. Traditionally,
Ethernet, when a port or a link gets congested, simply discards packets
and relies on an upper layer protocol, typically TCP, to re-transmit
packets that haven't made it to their end destination. By comparison,
fibre channel networks use a hop-by-hop buffer credit mechanism to make
sure that data isn't sent on the network unless you're sure the network
can handle it. Hop-by-hop buffer credits are more complicated, but TCP
relies on time-outs to determine that a packet hasn't made it to its
destination. That adds latency. TCP also throttles back when packets
are lost, which reduces the amount of available bandwidth that can
really be used.
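The buffer credit idea described above can be sketched as a toy model in Python. This is a simplification under stated assumptions: one hop, one credit per free receive buffer, and a hypothetical class name. Real fibre channel links negotiate credits at login and return them with dedicated signals.

```python
from collections import deque


class CreditedLink:
    """One hop with credit-based flow control: the sender may transmit
    only while it holds a credit, so the receiver is never overrun."""

    def __init__(self, receiver_buffers):
        self.credits = receiver_buffers  # one credit per free buffer
        self.rx_queue = deque()

    def send(self, frame):
        """Transmit if a credit is available; otherwise hold the frame
        at the sender (it is delayed, never dropped)."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.rx_queue.append(frame)
        return True

    def receiver_drain(self):
        """Receiver processes one frame and hands a credit back."""
        if not self.rx_queue:
            return None
        self.credits += 1
        return self.rx_queue.popleft()


link = CreditedLink(receiver_buffers=2)
print([link.send(f) for f in ("a", "b", "c")])  # third frame must wait
print(link.receiver_drain())                    # frees one buffer
print(link.send("c"))                           # now it can go
```

Contrast this with the traditional Ethernet behavior described above, where the third frame would simply be discarded and TCP would have to notice and retransmit it.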
And so, we've developed data center bridging. These are extensions to
the Ethernet standards that make Ethernet lossless. And, of course, by
lossless I mean lossless due to congestion. There are basically four
standards. Per-priority PAUSE divides an Ethernet channel into 8
priority slots and allows a switch or device to tell its partner,
"Stop sending all but priority one traffic because I'm running out of
buffer memory." Enhanced transmission selection allows devices
to negotiate the priority groups and priorities of protocols. And then
a pair of extensions, data center bridging exchange and congestion
notification, extends this flow control process from hop-to-hop to
end-to-end.
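Here is a minimal Python sketch of the per-priority PAUSE decision described above: when buffer memory runs low, ask the link partner to stop sending everything except the protected priority. The threshold, the choice of protected priority and the function names are assumptions for illustration; the per-priority bitmask mirrors the idea of one enable bit per priority but is not a full frame encoder.

```python
def priorities_to_pause(buffer_used, buffer_size, protected=(1,), threshold=0.8):
    """Return the set of priorities (0-7) to pause. Below the threshold
    nothing is paused; above it, everything except the protected
    priorities is paused. Threshold and protected set are assumed values."""
    if buffer_used / buffer_size < threshold:
        return set()
    return set(range(8)) - set(protected)


def enable_vector(paused):
    """Pack the paused priorities into an 8-bit mask, one bit per priority."""
    mask = 0
    for priority in paused:
        mask |= 1 << priority
    return mask


paused = priorities_to_pause(buffer_used=90, buffer_size=100)
print(sorted(paused), bin(enable_vector(paused)))
```

With the buffer 90 percent full, every priority except priority one is paused, which matches the "stop sending all but priority one traffic" message in the talk.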
Data center bridging is one of the technologies that is starting to
separate data center network technology from campus Ethernet technology.
And so, we're starting to see that switches with similar port counts
and similar prices are now being targeted either to the data center or
to the campus. A trend we haven't really seen in the past.
Beyond DCB, fibre channel switches are special. And today, there are a
limited number of fibre channel switches available. Cisco, Brocade, HP
and IBM in their blade chassis have fibre channel switches. Because
fibre channel, as opposed to Ethernet, is a smart network technology,
fibre channel puts functions like name servers and authentication in
the switches, where Ethernet networks typically implement that kind of
thing in servers distributed around the network.
You can't have a fibre channel over Ethernet direct connection from a
server to a disk array. You have to have a fibre channel switch. It
has to have a function called the fibre channel forwarder. And fibre
channel forwarders amongst fibre channel over Ethernet switches
implement fibre channel-like hop-by-hop congestion control via the PAUSE
and per-priority PAUSE mechanisms. So today, you can implement
multi-hop FCoE if every switch in the path between your server and your
storage array has this fibre channel forwarder function. You can also
use switches that implement data center bridging between your servers
and that FCoE switch. Essentially using them for fan out.
One of the more interesting market developments over the next couple of
years is going to be what happens in the server to implement 10 gigabit,
especially as we start talking about the server vendors implementing 10
gigabit as part of the built-in network interface card or LAN
on motherboard.
Today, we have basically two vendor camps. We have those vendors that
have traditionally made network interface cards that are designed for
data. And they can do storage either by using software initiators, like
Microsoft's iSCSI initiator that many of us have been using for years,
or by implementing some of that technology in the processor on the card
and off-loading that protocol to the card.
The other camp, basically the vendors that have made fibre channel HBAs,
argue that the advantage of fibre channel over Ethernet is that it looks
like fibre channel. And it should look like fibre channel all the way
into the server. So that the converged network adaptor converges a
fibre channel host adaptor and an Ethernet network card and looks to the
hypervisor or the host operating system like those two devices magically
welded together. And so, QLogic, Emulex and Brocade make CNAs. They're
somewhat more expensive. Typically about twice the cost of a network
interface card. But they run the same fibre channel drivers that the
fibre channel host adaptors have used.
And people who run large fibre channel networks typically manage their
host adaptors, while people who run large Ethernet implementations
typically don't manage their NICs. So, having CNAs that use the same
management model is advantageous for those folks.
The other decision that you're going to face as you start implementing
10 gigabit is what kind of cable to use. Traditionally, Ethernet
technologies have taken off when we've made it to BASE-T. So, with 100
megabit per second Ethernet or gigabit Ethernet, there was a limited
deployment over fibre. But, as the technology evolved to run that speed
over twisted pair cable, that's when the technology took off.
10 gig is a little different because we have a new technology, the SFP+
direct attach cable that you'll see in the left-hand column. This
cable plugs into the same sockets on network cards and switches
where, for fibre, you would plug in the optical module. And so, we
can use the same switch or NIC for short runs, up to 7 to 15 meters
depending on what cable you're buying, with direct attach cables. And
then, on other ports we can plug in optics and use fibre optic cables.
The advantage today of direct attach is it's much less expensive. About
$100 a port as opposed to about $1000 a port for optical. And when
compared to 10 gig BASE-T, it uses substantially less power. Each port
today for a 10 gig BASE-T switch or NIC uses somewhere between 3 and 5
watts. So, with a port at each end, each connection is up to 10 watts.
Connect hundreds of servers with 2 connections each and that power
difference between 5 watts for 10 gig BASE-T and 2 watts for direct
connect starts to add up and more than makes up for the difference
between a $2 patch cable and a $100 patch cable.
I'm generally recommending that my clients use SFP+ direct attach in a
top of rack model, where servers connect to a switch in the top of each
rack. And, of course, in the top of rack model, cables are never going to
be more than a few meters long anyway. So, the 7 meter limitation for
direct connect doesn't matter. And then use fibre for the connections
from the top of rack switch to the core.
Projections I've seen from vendors like Intel expect 10 gig to take off
in 2012 and 2013 when lower power devices become available. But, even
then, those projections show about a 50/50 split between direct connect
and 10 gig BASE-T.
The other question is do you want to use a top of rack model where you
have 24 or 48 port switches in every rack or every other rack, or an
end of row model with big, hulking switches with 100 plus ports at the
end of each row. End of row means fewer devices to manage but longer cable
runs. And right now, the top of rack model's winning and I expect that
to continue. I think top of rack is a model that scales in smaller
increments and larger sizes than end of row does.
So, what should I be looking at today? Well, first of all I'd be
looking at switches for my servers that implement DCB. DCB can speed up
not just fibre channel over Ethernet traffic but, because it smooths out
congestion, iSCSI, NFS and even user access traffic as well. You
could use that to front-end an FCoE switch if you decide you're going to
go to FCoE later. Or the top of rack FCoE switches from Cisco and
Brocade already implement DCB. And FCoE is a software add-on for them.
Because the DCB standards are just now reaching finality, vendors have
been building the ability to do DCB into their hardware and announcing
that a firmware upgrade may be coming later. I wouldn't really worry
about that very much. One trend I would also be looking at is stacking
or virtual switching, where multiple switches connect to be one device
to manage. This does lock you into a particular vendor but it means
that all of the switches are one spanning tree bridge, which maximizes
your uplink bandwidth.
Additional features are coming. TRILL and other layer 2 mesh
technologies will eliminate spanning tree within the data center.
Virtual machine aware security will eliminate the use of port profiles.
It'll allow you to say that this virtual machine, even though it moves
from virtual server host to virtual server host, should get this
security model.
Now is the time to start experimenting at the edge of your network.
This technology, that is, data center networking technology, is in
a lot of flux, and we're going to see more changes here in the next 18
months than we've seen in the past 10 years. And so, I'd be doing lab
and proof of concept testing. But, I don't think that today is yet the
time to make a big investment and say, "This is the technology we're
going to be using for the next 10 or 15 years." Unless you're the kind
of company that selects their vendor and then selects their technology.
That's about all I have to say today. Thanks for listening. It's been
my pleasure to pontificate for you.
Kara Gattine: Great presentation, Howard. Thank you. Thanks for tuning in
today and we hope you enjoyed this presentation. Be sure to also check
out Howard's introductory video and article that accompany this
advanced workshop. Then take the quiz. And if you answer the question
successfully, we'll email you a certificate of completion for attending
all the components in our advanced workshop on 10 gigabit Ethernet.
Thanks and have a great day.