Open vSwitch 1.4 installation from package on Ubuntu 12.04

In trying to get a more grounded feeling for OpenStack I’ve decided to build a home lab. One step involves configuring Open vSwitch to bridge with VMs. In this post I shall cover the Open vSwitch (OVS) installation process along with the KVM installation. Future posts shall cover more detailed configurations and scenarios along with videos.

While I am more familiar with the CentOS/RHEL flavors of Linux, there seems to be more support for OVS on the Debian/Ubuntu platform. So in this post I am covering Ubuntu 12.04 LTS. There are two ways to install OVS:

  • Use Ubuntu’s apt-get installer to install packages – easier
  • Build from source code – more difficult

This post goes for the low-hanging fruit of installing from the package. The drawback is that newer features are unavailable in the packaged version, which is OVS 1.4.0. The most stable Long Term Support release, as of this writing, is 1.4.3, while the latest release, 1.7.1, includes support for VXLAN and OpenFlow. I plan to document my findings with various builds and Linux flavors in future posts.
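
Before installing, you can confirm which OVS version the Ubuntu repositories will actually give you by querying the package cache. A quick sketch (the candidate version will vary with your release and mirror):

apt-cache policy openvswitch-switch
apt-cache policy openvswitch-datapath-source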

As I mentioned, I installed OVS 1.4.0 on Ubuntu 12.04 LTS (Long Term Support), which runs kernel version 3.2. The following steps are taken from various documents on the OVS site, while the outputs are excerpts from my lab.

root@pakdude-02:~# uname -a
Linux pakdude-02 3.2.0-34-generic #53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
root@pakdude-02:~# apt-get install build-essential fakeroot openvswitch-switch openvswitch-common openvswitch-datapath-source

Keep in mind that additional packages, such as dkms (Dynamic Kernel Module Support), were installed as well because they were prerequisites.
The following output indicates that the DKMS build of the kernel modules succeeded:

DKMS: build completed.

openvswitch_mod:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.2.0-34-generic/updates/dkms/

brcompat_mod.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.2.0-34-generic/updates/dkms/

depmod....

DKMS: install completed.
Setting up openvswitch-switch (1.4.0-1ubuntu1.3) ...
 * Inserting openvswitch module
 * /etc/openvswitch/conf.db does not exist
 * Creating empty database /etc/openvswitch/conf.db
 * Starting ovsdb-server
 * Configuring Open vSwitch system IDs
 * Starting ovs-vswitchd
 * Enabling gre with iptables
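
If you want to double-check afterwards that the kernel modules were registered and loaded, DKMS and lsmod can confirm it. A quick sanity check (the exact output will depend on your kernel version):

# List the modules DKMS has built for the running kernel
dkms status
# Confirm the Open vSwitch kernel module is loaded
lsmod | grep openvswitch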

OVS has now been installed; we will verify it shortly. But first, we need to install KVM, a full-blown virtualization solution for Linux, and libvirt-bin, a daemon that loads the KVM modules. libvirt-bin also provides virsh, a tool to manage (create, start, stop, etc.) virtual domains and networks. Remember, KVM requires libvirt-bin.

root@pakdude-02:~# apt-get install libvirt-bin

Note that this will install bridge-utils and ebtables as well. We will get to that shortly. First, we want to destroy the default network created by libvirt-bin, which is virbr0. OVS will supply the network instead.

root@pakdude-02:~# ifconfig virbr0
virbr0    Link encap:Ethernet  HWaddr 4e:c0:0d:41:e3:0c  
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

root@pakdude-02:~# virsh net-destroy default
Network default destroyed

root@pakdude-02:~# virsh net-autostart --disable default
Network default unmarked as autostarted

root@pakdude-02:~# ifconfig virbr0
virbr0: error fetching interface information: Device not found
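
(Should you ever want the default libvirt network back, virsh can restore it. A sketch for reference only, since we will not use it here:)

virsh net-start default
virsh net-autostart default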

Now we have to actually install KVM.

root@pakdude-02:~# apt-get install kvm

Some additional packages are installed in the process.
Keep in mind that ebtables is not needed, so remove it. OVS will play the role of the bridge.

root@pakdude-02:~# apt-get purge ebtables

The bridge module still showed up in lsmod | grep bridge, but there was no need to rmmod it (as many other guides on the web suggest); it was gone upon the next reboot. Remember, OVS will assume the bridging functionality. Some guides mention installing Bridge Compatibility; however, I do not see the need. Bridge Compatibility provides a way for applications that use the Linux bridge to gradually migrate to OVS. Programs that ordinarily control the Linux bridge module, such as brctl, instead control the OVS kernel-based switch. If you do not already depend on these programs, then you do not need bridge compatibility.
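
If you are used to brctl, the equivalent ovs-vsctl commands line up almost one-for-one, which is another reason bridge compatibility adds little value. A minimal sketch (br0 and eth0 are assumed names, not part of my lab setup):

ovs-vsctl add-br br0           # roughly: brctl addbr br0
ovs-vsctl add-port br0 eth0    # roughly: brctl addif br0 eth0
ovs-vsctl list-ports br0       # roughly: brctl show
ovs-vsctl del-br br0           # roughly: brctl delbr br0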

root@pakdude-02:~# service openvswitch-switch status
ovsdb-server is running with pid 1104
ovs-vswitchd is running with pid 1125
root@pakdude-02:~# ovs-vsctl show
ab15a0d5-7c66-4388-b921-5d4397a7608b
    ovs_version: "1.4.0+build0"

We’re good to go. Additionally, these are the relevant processes that are now running:

root@pakdude-02:~# ps -face | grep ovs
root      1103     1 TS   29 23:45 ?        00:00:00 ovsdb-server: monitoring pid 1104 (healthy)                                                                                                                                                                                                                                                                                                                                                                       
root      1104  1103 TS   29 23:45 ?        00:00:00 ovsdb-server /etc/openvswitch/conf.db -vANY:CONSOLE:EMER -vANY:SYSLOG:ERR -vANY:FILE:INFO --remote=punix:/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,manager_options --private-key=db:SSL,private_key --certificate=db:SSL,certificate --bootstrap-ca-cert=db:SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
root      1124     1 TS   29 23:45 ?        00:00:00 ovs-vswitchd: monitoring pid 1125 (healthy)                                                                                                                                                                                                 
root      1125  1124 TS   29 23:45 ?        00:00:00 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vANY:CONSOLE:EMER -vANY:SYSLOG:ERR -vANY:FILE:INFO --mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach --monitor
root      2346  2183 TS   19 23:57 pts/1    00:00:00 grep --color=auto ovs
root@pakdude-02:~#
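
As a preview of the configuration posts to come, here is a rough sketch of how a KVM guest’s tap interface could be hand-attached to an OVS bridge if your libvirt version lacks native Open vSwitch integration. The names br0, vnet0, and guest.img are hypothetical, the bridge is assumed to already exist, and the kvm invocation is only an illustration:

# Create a tap device for the guest and hand it to OVS (assumed names)
ip tuntap add dev vnet0 mode tap
ip link set vnet0 up
ovs-vsctl add-port br0 vnet0
# Start the guest with the tap as its network backend, for example:
# kvm -m 1024 -hda guest.img -net nic -net tap,ifname=vnet0,script=no,downscript=no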

And that’s about it. Hopefully I’ll get some functionality and configurations up here soon.


Challenges that Plexxi Faces

This week, the Lean Startup conference was held in San Francisco. The Lean Startup philosophy borrows its roots from the lean manufacturing method, with Kanban, or just-in-time processing, at the center of its design principles. It basically advocates that startups minimize outside funding, not strive for the perfect product (think Minimum Viable Product), stay flexible (think Pivot), and cater the product completely to the customer’s needs, all with the goal of being a highly efficient company.

However, as noted investor Marc Andreessen, who spoke at the conference, warns,

Not all startups can be Lean Startups

Indeed, some startups cannot afford to employ a Pivot. Infrastructure or hardware companies come to mind, especially when they’ve already taken $50 million of investment. This is what Plexxi has done, without a complete solution to show for it yet.

I was listening to the recent Packet Pushers show #126 sponsored by Plexxi. While their approach is a creative one, I’m not sure whether it is viable. In a nutshell, Plexxi brings optical technology, in the form of WDM, to the Data Center and flattens traditional hierarchical network designs. When I first started learning about network designs, the classical approach was the 3-tier Core-Distribution-Access model. In the mid-2000s this got reduced to a Collapsed Core. What Plexxi proposes is a flat topology, eliminating the need for Core switches in the Data Center.

Plexxi adopted the SDN approach of a programmable controller (a virtual appliance) that pushes policies to its switches. The policies are intended to optimize data path flow for affinitized traffic. Applications that are more sensitive to certain resources are classified into Affinity Networks. Some examples of the constraints or sensitivities that Plexxi’s Director of Product Management, Marten Terpstra, described include:

  • Hop-count
  • Bandwidth

Plexxi switches use merchant silicon (Broadcom ASICs) to form an Ethernet ring on top of a WDM lightwave. By changing lambdas, Layer-1 connections between switches can be changed according to the application requirements.

Plexxi uses their own closed APIs for communication between their switch interfaces and their controller to convey affinity information. However, they open up their proprietary northbound API for user-to-controller communication so users can write scripts, for example by using REST APIs. Interestingly, they are a member of the Open Networking Foundation. The Controller places TCAM entries in switches based on application requirements for affinitized traffic.

Terpstra discussed two use cases:

  1. Affinitized iSCSI traffic for maximum bandwidth with the fewest hops
  2. Cloud provider – Use a Plexxi ring as a premium service to affinitize traffic.

In neither case are the results mentioned.

Okay, so far Plexxi’s solution is a 1 RU box that can prioritize traffic based on hop-count and bandwidth. I fail to see much of a business case there. Any network engineer worth his or her salt will tell you that there is more to traffic classification and prioritization than just hop-count and bandwidth. Financial trading institutions would be more concerned about latency guarantees. Hop-count alone is a flimsy criterion for classifying important traffic, regardless of whether a cute term like Affinity Network is given to that classification. High Availability is a critical issue that a ring topology exacerbates. As Doug Gourlay of Arista mentions, unnecessary downtime is introduced any time you add new nodes, because the ring is broken. Moreover, the network is reduced to a split-brain model in the event of just two nodes going down. Depending on the Controller placement, this could have adverse outcomes. The thing about outages is that we can never control where they occur. Gourlay rightly puts it:

I thought Token Ring died for good reasons… why is someone trying to bring it back?

Getting back to the Lean Startup idea, Terpstra said, “Our Layer-3 affinities are coming.” Plexxi is targeting Christmas 2012 for the 1.0 version of its Layer-3 capabilities. Until then, Plexxi has only a Layer-2 switch with no quotable value to show for $50 million in investment. Not a good time to Pivot.

Reports of the death of the Core switch in the Data Center have been greatly exaggerated.

Cisco UCS Manager – Orchestration Done Right

Earlier this month Cisco announced General Availability of UCS Central – a manager of managers. Given Cisco’s string of failures when venturing outside its comfort zone (think Flip and Cius), few expected anything different when Cisco entered the server market in 2009. But instead it has been a resounding success. UCS has become the fastest-growing business in Cisco history, with over fifteen thousand customers and over $1 billion in annual revenue. It is already the #2 blade server market share leader in North America. Why has it done so well?

I believe one of the main reasons is that its management software, UCS Manager, delivers one thing that SDN also promises – High-Level Orchestration. Before UCS Manager, administrators, such as Chris Atkinson, had to write their own scripts to configure and maintain BIOS, RAID, and firmware settings. With UCS Manager, this information is kept in a Service Profile. When a blade dies, the blade does not need to be removed and ports do not need to be updated; just move the Service Profile to a new blade via software. Need to repurpose a physical server for a different department? Just create a new Service Profile and reassign the blade server to it, without recabling or moving metal, all within a matter of seconds. You would expect such agility with moving virtual machines, not physical machines!

In one of my past lives I used to work very closely with CiscoWorks. UCS Manager does a far better job at managing servers than CiscoWorks did at managing routers and switches. CiscoWorks was rigid and far too dependent on CLI (Telnet/SSH) and SNMP MIB modules for accessing and managing devices. UCS Manager, with its single-wire management and XML API, is more flexible and integrates with third party systems management tools, which allows for more agile deployments. My understanding is that the same APIs are opened up in UCS Central. Let’s see whether it can be as successful as UCS Manager.

Death to the CLI

One of the selling points of Cisco’s Nexus 1000V virtual switch is that it provides network administrators with a familiar way of configuration, namely the CLI. The Nexus 1000V is built on NX-OS and is accessible via SSH. This was intended to give network engineers the familiar look and feel to perform their duties that they didn’t have with the hypervisor’s native vSwitch.

I understand the need for separation of duties, and that is what any dedicated management interface of a switch provides. And I appreciate that the Nexus 1000V offers many rich features that most soft switches don’t, such as L2/L3 ACLs, link aggregation, port monitoring, and some of the more advanced STP knobs like BPDU Guard. What I don’t appreciate is clinging to an archaic mode of configuration.

When I took my CCIE lab, Cisco provided a single reference CD-ROM, known as the UniverCD or DocCD. Many tasks required knowledge of esoteric commands. One of the first steps any competent test-taker took was to use the alias command to define shortcuts. For example, show ip route might become sir. Network engineers often take great pride in the aliases they define and the Expect/Perl/Python scripts they’ve written to automate tasks. They rave about the amount of time saved. Of course, all of this would break when the vendor introduced new CLI commands that conflicted with existing aliases.

In one of my past roles I was one of five engineers who frequently made firewall rule changes to ASAs. All of us were CCIEs, but none of us used the CLI to make the changes. Instead we preferred to use ASDM, the GUI element manager. Sure, it was buggy and handled concurrent changes poorly, but at least the changes made were accurate. Adding a single rule isn’t as simple as adding a single line; in most cases you have to edit object groups and make sure there are no overlapping or conflicting rules. Trusting a human to do this accurately every time is like trusting someone to drive five hours to work every day and never get into an accident.

There is a smarter way to do configuration management. Make the network programmable. Offer APIs to developers that are stateful and intelligent. Obviously, the rebuttal from Nexus 1000V loyalists is that engineers are familiar with NX-OS and would therefore be more comfortable with the CLI. But that’s a step in the wrong direction. When I look back at how much time gets wasted by network engineers in creating simple automation tasks such as macros, I realize this is one of the reasons networking has lagged behind compute technologies. Network engineers should not have to write their own scripts to make their own lives easier. Applications should be doing this for them. Let the network engineers focus on their job, which is optimizing how packets need to get sent from source to destination – as quickly, reliably, and securely as possible.

SDN – What’s in a Name? Part 3

This is the third part of my series of posts on trends of vendors to latch on to the SDN bandwagon. For more information, refer to parts 1 and 2. In this post I discuss how a few of the services vendors have responded to the buzz around Software Defined Networking.

WAN Optimization

Riverbed claimed it was riding the SDN wave at VMworld 2012 when it announced its latest release of Cascade. However, they went by the second definition I used in Part 1, which says that SDN decouples physical and virtual networks or overlays (not the Control Plane and Data Plane, as the other definition emphasizes). The difference is subtle. Riverbed partnered with VMware to develop the IPFIX record format that can provide VXLAN tenant traffic information as well as VXLAN tunnel endpoint information. Thus, they claim, Cascade is SDN-ready because it is VXLAN-aware.

Silver Peak also pumped its chest at VMworld 2012 with its Agility announcement, a plug-in for VMware vCenter. Agility allows administrators to enable acceleration between workloads within vCenter, using the vSphere interface that server administrators are familiar with. This requires Silver Peak’s Virtual Appliance to already exist within vCenter. Almost three months after the announcement, details are extremely thin. Silver Peak has been drooling over Nicira ever since the SDN champion was acquired by VMware. Indeed, all you have to do is Google Nicira and Silver Peak and observe all the enthusiasm that Silver Peak has shown for Nicira. But the feelings are not mutual. Silver Peak claims it is working with Nicira and leveraging Nicira’s Open vSwitch, but Nicira/VMware have made no such announcements. In fact, there are no further details about this relationship on Silver Peak’s own website.

Exinda is so far behind on the SDN learning curve that the only mention of SDN on its website is a hyperlink to the SDN Wikipedia page in a blog post written by the VP of Marketing that reported his observations from Interop 2012. Clearly Exinda has a long way to go before its SDN strategy can be taken seriously.

Load Balancers

As of VMworld 2012, F5’s BIG-IP products can support native VXLAN functionality and will have VXLAN virtual tunnel endpoint capabilities in the first half of 2013. What that exactly means is vague at this time. The press statement I linked to is the only mention of BIG-IP’s current SDN capabilities. My guess is that they’ve opened up some APIs to VMware to allow programmability.

Embrane uses an under-the-cloud approach of offering cloud providers a platform that delivers elastic Layer 4-7 network services to their customers. The services include Load Balancer, Firewall, and VPN. Embrane’s heleos architecture is a radical solution that comprises the Elastic Services Manager (a provisioning tool) and the Distributed Virtual Appliance, the latter being a logical network services container instantiated across a pool of general-purpose x86 servers. The issue likely to raise eyebrows is that each service that is part of their platform is a wrapper around an open source distribution. I haven’t heard of too many providers willing to vouch for ipchains as a Firewall.

Firewalls

Palo Alto Networks earned its stripes by making its firewall appliances with merchant silicon. To stay ahead in the SDN era, it announced a technology partnership with Citrix in October 2012, but has not yet released a product offering.

Big Switch Networks, a leading SDN player, announced, on November 13, 2012, its ecosystem of Technology Alliance Partners that included Palo Alto Networks. However, Palo Alto Networks has not mentioned this on their website, which is odd given that partnerships are what have made Palo Alto the hugely successful company it is today. One would expect them to be on top of it.

So there are no SDN-friendly firewalls currently on the market other than Cisco’s Nexus 1000V portfolio, which includes VSG and ASA 1000V Cloud firewall.

Summary

It appears, from these observations, as though partnerships are key to ousting incumbents in the SDN world. Much of SDN support at this time is just hype, but sellable products will come out soon. The industry also needs the open source movement to challenge the VMware-centric ecosystem to enable a higher level of interoperability and allow for more flexible orchestration and programmability. OpenStack is that approach. More to follow in future posts.
