Category Archives: Startups

Santa Cruz New Tech Meetup

Since moving to Santa Cruz, I’ve attended two events held by the Santa Cruz New Tech Meetup, which is the 8th largest meetup in the United States. The events are held on the first Wednesday of each month and feature pitches from some of the local tech entrepreneurs in the city. While Santa Cruz isn’t technically Silicon Valley (it is on the other side of the hill), it is considered a part of the San Francisco Bay Area and is host to some talented entrepreneurs. However, there aren’t (m)any startups in Santa Cruz looking into the SDN or Cloud space. In this post, I outline the companies that presented to the audience of over 200 people at the November 2014 Santa Cruz New Tech Meetup.

Eggcyte has a small handheld product called The Egg, which is basically a webserver that stores media that can then be shared selectively. It is intended to provide a level of privacy that social media outlets can’t offer, because the cloud is essentially the Egg. With 128 GB storage and 10-12 hours of battery life, the founders intend to provide a more tangible ownership experience of media. It has a long way to go though, and needs to better address security (screen scraping, encryption, etc.) in order to gain traction.

Moxtra has its roots in WebEx. One of the co-founders was the founder and CEO of WebEx before it was acquired by Cisco. Moxtra is a cloud collaboration platform that combines text, voice, and multimedia chat capabilities, visual and verbal content annotations, mobile screen sharing, and task management.

Tuul is arguably the hottest startup in Santa Cruz right now and is focused on improving the customer experience. In their words, “Enhanced by our patent-pending tuulBots, Tuul’s customer support automation solution provides a platform for businesses to interact with their customers in a more direct, simple, and efficient way. tuulView dashboards enable business to handle multiple requests simultaneously, with little integration required.”

Cityblooms has taken the plunge into the Internet of Things, or as they call it, the Internet of Farms. As they say, “Cityblooms creates modular micro-farms that grow fresh and healthy food on rooftops, parking lots, patios, parks, and everywhere in between.” They have a prototype installed at Plantronics (another Santa Cruz company). This was a very impressive solution that I hope succeeds.

Finally, PredPol (short for Predictive Policing for law enforcement) uses analytics based on historical data to help reduce crime. It reminds you of The Minority Report, except it is less intrusive (thankfully). According to them, “Law enforcement agencies deploying PredPol are experiencing marked drops in crime due to increased police presence in areas deemed to be at greatest risk.”


Viptela SEN – DMVPN Done Right

Recently I had the treat of listening to two Layer 3 routing protocol maestros when the CTO of the startup Viptela, Khalid Raza, appeared on Ivan Pepelnjak’s Software Gone Wild podcast. Interestingly, the first time I had ever heard of Khalid or Ivan was through the Cisco Press books that they each authored. Ivan had the famous ‘MPLS and VPN Architectures’ and Khalid, one of the first CCIEs, wrote ‘CCIE Professional Development: Large Scale IP Network Solutions’ (of which I owned an autographed copy).

In a nutshell, Viptela’s Secure Extensible Network (SEN) creates hybrid connectivity (VPNs) across the WAN. Their target market is any large retailer or financial company that has many branches. Khalid and the founder Amir Khan (of Juniper MX product line fame) come from a super strong Layer 3 background and, consequently, they don’t purport to have a revolutionary solution. Instead, they have harnessed that background to improve on what DMVPN has been attempting to solve for the past 10 years. In Khalid’s words, they have “evolved MPLS concepts for overlay networks”.

Viptela SEN comprises a controller, VPN termination endpoints, and a proprietary protocol that is heavily inspired by BGP. In fact, one of the advisors of Viptela is Tony Li, author of 29 RFCs (mostly BGP-related), and one of the main architects of BGP. Viptela SEN can discover local site characteristics (such as the IGP) and report them to the controller, which then determines the branch’s connectivity policy. So it essentially reduces the number of control planes, which reduces the number of configurations for the WAN. This looks incredibly similar to what DMVPN set out to do a decade ago. Viptela calls these endpoints dataplane points, but they still run routing protocols, so to me they’re just routers.

DMVPN, itself, started as a Cisco proprietary solution, spearheaded by Cisco TAC, in particular a gentleman by the name of Mike Sullenberger, who served as an escalation engineer. He has since coauthored an IETF draft on DMVPN. In fact, one of the earliest tech docs on cisco.com touts how ‘for a 1000-site deployment, DMVPN reduces the configuration effort at the hub from 3900 lines to 13 lines’.

Getting back to Viptela SEN, the endpoints (aka routers) authenticate with the controller (through exchange of certificates). Different circuits from different providers (MPLS or broadband) can be balanced through L3 ECMP. Their datapath endpoints are commodity boxes with Cavium processors that give predictable (AES-256) encryption performance and tunnel to other endpoints (via peer-to-peer keys) as prescribed by the orchestrator/controller. In the event of a site-to-controller failure, if a site still has dataplane connectivity to another site that it needs to communicate with, then traffic can still be forwarded (provided the keys are still valid) and all is well, even though the entries are stale.
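To make the ECMP point a little more concrete, here is a minimal sketch of hash-based load balancing across two circuits. It is not Viptela's implementation: the circuit names, the 5-tuple layout, and the choice of SHA-256 as the hash are all my own assumptions for illustration. The only idea it shows is that packets of the same flow always land on the same circuit, while different flows spread across the available paths.

```python
import hashlib

def pick_circuit(flow, circuits):
    """Map a flow 5-tuple onto one of several equal-cost circuits.

    Hashing the whole tuple keeps every packet of a flow on the same
    circuit, which avoids reordering, while different flows spread out.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(circuits)
    return circuits[index]

# Hypothetical circuits from two providers (names are made up).
circuits = ["mpls-provider-a", "broadband-provider-b"]

# Hypothetical flow: (src IP, dst IP, protocol, src port, dst port).
flow = ("10.1.1.10", "10.2.2.20", 6, 49152, 443)

print(pick_circuit(flow, circuits))
```

Real routers do this in hardware with much cheaper hash functions; the cryptographic hash here is only a convenient, deterministic stand-in.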

One of the differentiators between Viptela and others in this space is that they do not build overlay subnet-based routing adjacencies. This lets each line of business in a large company have a network topology that is service driven rather than the other way round. Translated into technical terms, each line of business effectively has a VRF with different default routes, but a single peering connection to the controller. In DMVPN terms, the controller is like the headend router, or hub. The biggest difference that I could tell between Viptela SEN and DMVPN is the preference given to L3 BGP over L2 NHRP. One of the biggest advantages of BGP has always been outbound attribute manipulation, in the sense that a hub router could influence, via BGP MED, how a site exits an AS. It is highly customizable. For example, the majority of the sites could exit via a corporate DMZ while some branches (like Devtest in an AWS VPC) could exit through a regional exit point. In DMVPN, NHRP (which is an L2 ARP-like discovery protocol) has more authority and doesn’t allow outbound attribute manipulation, which BGP, an L3 routing protocol, has been doing successfully throughout the Internet for decades. NHRP just isn’t smart enough to provide that level of control-plane complexity.
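The "one VRF per line of business, each with its own default route" idea can be sketched in a few lines. This is purely illustrative and assumes made-up VRF names, prefixes, and exit-point labels; it says nothing about how Viptela actually stores or distributes routes. It just shows the same destination resolving to different exits depending on which VRF the traffic belongs to.

```python
import ipaddress

# Hypothetical per-VRF routing tables: prefix -> next hop label.
VRF_ROUTES = {
    "corporate": {"0.0.0.0/0": "corporate-dmz", "10.0.0.0/8": "overlay"},
    "devtest": {"0.0.0.0/0": "regional-exit", "10.0.0.0/8": "overlay"},
}

def next_hop(vrf, destination):
    """Longest-prefix match for `destination` inside a single VRF."""
    addr = ipaddress.ip_address(destination)
    best = None  # (network, hop) of the most specific match so far
    for prefix, hop in VRF_ROUTES[vrf].items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

# Internet-bound traffic leaves through a different exit per VRF,
# while internal 10/8 traffic stays on the overlay in both.
print(next_hop("corporate", "8.8.8.8"))
print(next_hop("devtest", "8.8.8.8"))
print(next_hop("devtest", "10.1.2.3"))
```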

Viptela SEN allows each site to have different control policies. That flexibility can be at a control plane path level (e.g. ensure that certain VPNs trombone through a virtual path or service point like a firewall or IDS before exiting, as done in NFV with service chaining) or at a data plane level (e.g. PBR). Since it promises easy bring-up and configuration, there is a concern about SOHO endpoint boxes being stolen; to alleviate it, they have a GPS installed in these lower end boxes. The controller only allows these boxes to authenticate with it if they are at the prescribed GPS coordinates. If the box is moved, it is flagged as a potentially unauthorized move and second-factor authentication is required before it is considered permissible. The controller can permit this but silently monitor the activities of the new endpoint box without its knowledge, akin to a honeypot. That’s innovation!
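The GPS check described above could look something like the following sketch. Everything specific in it is my assumption rather than anything Viptela disclosed: the 100 m tolerance, the three outcome labels, and the use of the haversine formula to measure how far a box has moved from its registered coordinates.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def distance_m(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def admit(registered, reported, second_factor_ok=False, radius_m=100.0):
    """Decide how the controller treats an endpoint that wants to authenticate.

    radius_m is a hypothetical tolerance; the outcome labels are made up.
    """
    if distance_m(registered, reported) <= radius_m:
        return "allow"
    # The box has moved: require a second factor, and keep watching it.
    return "allow-monitored" if second_factor_ok else "deny"

santa_cruz = (36.97, -122.03)
san_francisco = (37.77, -122.42)

print(admit(santa_cruz, santa_cruz))
print(admit(santa_cruz, san_francisco))
print(admit(santa_cruz, san_francisco, second_factor_ok=True))
```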

Deconstructing Big Switch Networks at NFD8

I recently caught up with the presentation made by Big Switch Networks at Networking Field Day 8.

Founder Kyle Forster kicked things off with an introduction to Big Switch. He used the term ‘Hyperscale Data Centers’ to describe the data center of today and tomorrow that Big Switch targets. Big Switch has two products based on the following three principles:

  1. Bare metal switches with Broadcom ASICs for hardware
  2. Controller-based design for software
  3. Core-pod architecture replacing the traditional Core-Aggregation-Edge tiers

The two products are:

  1. Big Tap Monitoring Fabric – Taps for offline network monitoring
  2. Big Cloud Fabric – Clos switching fabric

Next up, CTO Rob Sherwood went into more detail. He defined the core-pod architecture as essentially a spine-leaf architecture where there are 16 pods (racks that have servers), at the top of each of which are two leaves. Each server is dual-connected to the leaves via 10G links. The leaves themselves connect up to spines via 40G links. The leaf switches are 48x10G with 6x40G for uplinks; the spine switches are 32x40G. So the maximum number of spine switches is 6, one per leaf uplink. (In a leaf-spine fabric every leaf must connect to every spine.) That also means a maximum of 32 leaves can connect to a spine. These numbers will definitely increase in future generations of switches when Broadcom can produce them at scale. This solution is targeted at Fortune 1000 companies, not so much at smaller enterprises. Pods are very modular and can be replaced without disrupting the older network designs.
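The port arithmetic in that paragraph is easy to check with a few lines of Python. The port counts come straight from the presentation as I noted them; the oversubscription ratio is my own derived figure, assuming every downlink and uplink on a leaf is in use, and was not something quoted in the talk.

```python
# Port counts as described in the presentation.
LEAF_DOWNLINKS, LEAF_DOWNLINK_GBPS = 48, 10
LEAF_UPLINKS, LEAF_UPLINK_GBPS = 6, 40
SPINE_PORTS = 32
LEAVES_PER_RACK = 2

def fabric_limits():
    """Derive the fabric's size limits from the switch port counts."""
    downlink_capacity = LEAF_DOWNLINKS * LEAF_DOWNLINK_GBPS  # Gbps toward servers
    uplink_capacity = LEAF_UPLINKS * LEAF_UPLINK_GBPS        # Gbps toward spines
    return {
        # Every leaf connects to every spine, so uplinks cap the spine count.
        "max_spines": LEAF_UPLINKS,
        # Every spine port connects to a different leaf.
        "max_leaves": SPINE_PORTS,
        "max_racks": SPINE_PORTS // LEAVES_PER_RACK,
        # Assumption: all leaf ports are populated.
        "oversubscription": downlink_capacity / uplink_capacity,
    }

for name, value in fabric_limits().items():
    print(f"{name}: {value}")
```

The 32 leaves at two per rack line up with the 16 racks Sherwood mentioned.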

What I thought was pretty cool was the use of Open Network Install Environment (ONIE) for provisioning. The switches get shipped to customers from Dell or Accton with a very lightweight OS; then, when the box is turned on, it netboots from the Controller (an ONIE server). Both Switch Light (which is the Big Switch OS) and the relevant configurations get downloaded from the Controller to the switch. LLDP is used to auto-discover the links in the topology, and the management software will tell if there are missing or double-connected links.

In the first demo, the Automated Fabric Visibility Tool was used to first allocate and assign roles in the topology. At that point, any errors in cabling would appear in the GUI, which was pretty user-friendly. The Big Cloud Fabric architecture has a dedicated OOB control/management network that connects to the Controller. Amongst the management primitives are logical L2 segments (à la VLANs) that have logical ports and end-points, tenants that are logical groupings of L2/L3 networks, and logical routers that are the tenant routers for inter-segment or intra-segment routing. Each logical router corresponds to a VRF. VLAN tags can be mixed and matched and added into bridged domains. The use case would be analogous to a multi-tenant environment in each ESX instance, when you declare egress VLAN tags on a vswitch in VMware deployments. You have the choice of specifying the tag as global to the fabric or local to the vswitch. Interestingly, Big Switch had an Overlay product two years ago and ended up tossing it away (because they feel overlays are L2 solutions only, not L3 solutions) to come up with the current solution, which they believe uses the hardware the way it was designed to be used.

The next demo was to create tenants, assign servers and VMs to logical segments by VLAN, physical ports, or port-groups to meet a use case of a common two-tier application.

The fabric in Big Cloud Fabric is analogous to a sheet metal chassis-based switch that has fabric backplanes, line cards, and supervisors/management modules: the spine switches are the backplanes, the leaf switches are the line cards, and the Controllers are the supervisors. The analogies actually don’t end with the components. Sherwood explained that traditional chassis switch vendors use proprietary protocols between their backplanes and their line cards that are actually Ethernet and are, therefore, no different from the OOB management network between spine switches and leaf switches. The control planes and the data planes in Big Cloud Fabric are completely decoupled so that in the event of the management switch completely going down, you only lose the ability to change and manage the network. So for example, if a new server comes up, routes for that host don’t get propagated. Of course, if both supervisors in a Nexus 7K go down, the whole switch stops working. If both Controllers go down simultaneously, the time needed to bring up a third Controller is about 5 minutes.

Big Cloud Fabric is based on OpenFlow with extensions. The white box switches that Big Switch sells have Broadcom ASICs that have several types of forwarding tables (programmable memory that can be populated). Some of the early implementations of OpenFlow only exposed the ACL table (which had only about 2000 entries), which didn’t scale well. The way Big Switch implements OpenFlow in Switch Light OS is to expose an L2 table and an L3 table, each with over 100,000 entries. They couldn’t go into more details as they were under NDA with Broadcom. Switch Light OS is Big Switch’s Indigo OpenFlow Agent running on Open Network Linux on x86 or ASIC-based hardware. Whereas traditional networks have clear L2/L3 boundaries in terms of devices, in Big Cloud Fabric L3 packets are routed on the first hop switch. If a tenant needs to talk to another tenant, packets go through a system router, which resides only on a spine switch.

Next up was Service Chaining and Network Functions Virtualization support. Service Chaining is implemented via next-hop forwarding. For example, at a policy level, if one VM or app needed to talk to another app, it could be passed through a service such as a load balancer or firewall (while leveraging the ECMP bits of the hardware) before reaching the destination. The demo showed how to create a policy and then, with a firewall service example, how to apply that policy (known as service insertion) to an ECMP group. However, it is early days for this NFV use case and, for more elaborate needs such as health monitoring, the recommendation is to use OpenStack. Interestingly, Big Switch integrates with OpenStack, but not VMware at this time (it is on the roadmap though).

Operational Simplicity was next on the agenda. Here Big Switch kicked off with the choice of control plane APIs offered to the user: CLI, GUI, or REST, which, generally speaking, would appeal to network engineers, vCenter administrators, and DevOps personnel respectively. High Availability is designed so that reactions to outages are localized as much as possible. For example, the loss of a spine only reduces capacity, while the loss of a leaf is covered by the other leaf in the same rack (thanks to a dedicated link between the two), which has connections to the same servers (so the servers fail over to the other leaf via LACP). The Hitless Upgrade process is truly hitless from an application perspective (a few milliseconds of data packets are lost) though capacity is reduced. A feature called Test Path shows the logical (at a policy level) as well as physical path a host takes to reach another host.

The final session was on the Network Monitoring features of Big Switch, namely Big Tap Monitoring Fabric. Sunit Chauhan, head of Product Marketing, said that the monitoring infrastructure is developed using the same bare metal switches that are managed from the same centralized controller. The goal is to monitor and tap every rack and ideally every vswitch. In a campus network that means the ability to filter traffic from all locations to the tools. The Big Tap Monitoring Controller is separate from the Big Cloud Fabric Controller and runs Switch Light OS as well. The example he gave was of a large mobile operator in Japan that needed thousands of taps. The only scalable (in terms of cost and performance) solution to monitor such a network was to use bare metal Ethernet switches that report to a centralized SDN Controller.

The Big Tap Monitoring demo was based on a common design with a production network (which could be Big Cloud Fabric or traditional L2/L3 networks) with filter ports connected to a Big Tap Controller, which was then connected via delivery ports to the Tool Farm, where all the visibility tools existed. Of course, Big Switch eats its own dog food, like every noble startup, by deploying Big Tap Monitoring Fabric in its own office. They were able to capture the actual video stream of the NFD event that went out to the Internet from their office. Remote Data Center Monitoring is also supported now (though not demonstrable at NFD8), which reminded me of RSPAN except that this used L2-GRE tunnels.

A few afterthoughts: Big Switch used the marketing term ‘hyperscale data center’ like it was an industry standard and they gave examples of networks that were not hyperscale without explaining how they weren’t. In fact there was a slide that was dedicated to terminology used in a demo, but ‘hyperscale’ was not there. It reminded me of my days in a startup that used that same term in its marketing headlines without ever defining it.

From a personal perspective, in 2010 I worked as a Network Engineer in a large financial exchange where the Big Tap Monitoring Fabric would have been invaluable. Any time a trade was delayed by a couple of seconds, the result was a penalty of potentially millions of dollars. The exchange would be spared that penalty if it could be proved that the delay was due to the application or the remote network and not the exchange’s own network. At that time we used network monitoring switches to determine where the delay occurred. But the location of those taps was critical. Moreover, it was just not scalable to have taps at every location off of every port. Since it was a reactive (troubleshooting) effort, it was really a Whac-a-Mole exercise. Ultimately, we went with a vendor that built the infrastructure to collect latency data from exchanges, and then offered the results to firms to allow them to monitor data and order execution latency on those markets. But it was expensive (those investments were between $10 and $15 million) and ultimately that vendor went out of business. A solution like Big Tap Monitoring Fabric would have been a godsend. If Big Switch can figure out how to keep their costs down, they may have a huge opportunity in hand.

Tech Field Day events are intended to be very technical and this was no different. Slides with Gartner Magic Quadrants are usually met with rolling eyeballs, but I think Big Switch can be forgiven for having one reference to an industry analyst. Apparently, according to Dell’Oro, in 2013 more ports were shipped on bare metal switches (from vendors such as Dell, Accton, and Quanta) than on switches from Arista, Juniper, and Extreme combined!

While Big Cloud Fabric competes against the Cisco Nexus 7K product line, Big Tap Monitoring goes head to head against Gigamon. It was very refreshing to see a startup take on two behemoths, sheerly with clever engineering and nimble principles.

Deconstructing Nuage Networks at NFD8

I enjoy Tech Field Day events for the independence and sheer nerdiness that they bring out. Networking Field Day events are held twice a year. I had the privilege of presenting the demo for Infineta Systems at NFD3 and made it through unscathed. There is absolutely no room for ‘marketecture’. When you have sharp people like Ivan Pepelnjak of ipSpace fame and Greg Ferro of Packet Pushers fame questioning you across the protocol stack, you have to be on your toes.

I recently watched the videos for NFD8. This blog post is about the presentation made by Nuage Networks. As an Alcatel-Lucent venture, Nuage focuses on building an open SDN ecosystem based on best of breed. They had also presented last year at NFD6.

To recap what they do, Nuage’s key solution is Virtualized Services Platform (VSP), which is based on the following three virtualized components:

  • The Virtualized Services Directory (VSD) is a policy server for high level primitives from Cloud Services. It gets service policies from VMware, OpenStack, and CloudStack and also has a builtin business logic and analytics engine based on Hadoop.
  • The Virtualized Services Controller (VSC) is the control plane. It is based on ALU Service Router OS, which was originally developed 12-13 years ago and is deployed in 300,000 routers, now stripped down to be relevant as an SDN Controller. The scope of a Controller is a domain, but it can be extended to multiple domains or data centers via an MP-BGP federation, thereby supporting IP Mobility. A single availability domain has a single data center zone. High availability domains have two data center zones. A VSC is a 4-core VM with 4 GB memory. VSCs act as clients of BGP route reflectors in order to extend network services.
  • The Virtual Routing and Switching module (VRS) is the Data Path agent that does L2-L4 switching, routing, and policies. It integrates with VMware via ESXi, Xen via XAPI, and KVM via libvirt. The libvirt API exposes all the resources needed to manage VMs. (As an aside, you can see how it comes into play in this primer on OVS 1.4.0 installation I wrote a while back.) The VRS gets the full profile of the VM from the hypervisor and reports that to the VSC. The VSC then downloads the policies from the VSD and implements them. These could be L2 FIBs, L3 RIBs/ACLs, and/or L4 distributed firewall rules. For VMware, VRS is implemented as a VM with some hooks because ESXi has a limitation of 1M pps.

At NFD8, Nuage discussed a recent customer win that demonstrates its ability to segment clouds. The customer was a Canadian Cloud Service Provider (CSP), OVH, that has deployed 300,000 servers in its Canadian DCs. OVH’s customers can, as a beta service offering, launch their own clouds. In other words, it is akin to Cloud-as-a-Service with the Nuage SDN solution underneath. It’s like a wholesaler of cloud services whereby multiple CSPs or businesses could run their own OpenStack cloud without building it themselves. Every customer of this OVH offering would be running independent Nuage services. Pretty cool.

Next came some demos that address the following 4 questions about SDN:

  1. Is proprietary HW needed? The short answer is NO. The demo showed how to achieve Hardware VTEP integration. In the early days of SDN, overlay gateways proved to be a challenge because they were needed to go from the NV domain to the IP domain. As a result VLANs needed to be manually configured between server-based SW gateways and the DC routers – a most cumbersome process. The Nuage solution solves that problem by speaking routing language: it uses standard RFC 4797 (GRE encapsulation) on its dedicated TOR gateway to tunnel VXLAN to routers. As covered in NFD6, Nuage has three approaches to VTEP Gateways:
    1. Software-based – for small DCs with up to 10 Gbps
    2. White box-based – for larger DCs based on standard L2 OVSDB schema. In NFD8, two partner gateways were introduced – Arista and the HP 5930. Both feature L2 at this point only, but will get to L3 at some point.
    3. High performance-based (7850 VSG) – 1 Tbps L3 gateway using merchant silicon, and attaining L3 connectivity via MP-BGP
  2. How well can SDN scale?
    The Scaling and Performance demo explained how scaling in network virtualization is far more difficult than scaling in server virtualization. For example, the number of ACLs needed grows quadratically as the number of web servers or database servers increases linearly. The Nuage solution breaks down ACLs into abstractions or policies. I liken this to an Access Control Group, whereby ACLs fall under an Access Control Group. Another way of understanding this is Access Control Entries being part of an Access Control List (for example, an ACL for all web servers or an ACL for all database servers) so that the ACL is more manageable. Any time a new VM is added, it is a new ACE. So, policies are pushed, rather than individual Access Control Entries, which scales much better. Individual VMs are identified by tagging routes, which is accomplished by, you guessed it right, BGP communities (these Nuage folks sure love BGP!).
  3. Can it natively support any workload? The demo showed multiple workloads including containers in their natural environments without being VMs, i.e. bare metal. Nuage ran their scalability demo on AWS with 40 servers. But instead of VMs, they used Docker containers. Recently, there has been a lot of buzz around Linux containers, especially Docker. The advantage containers hold over VMs is that they have much lower overhead (by sharing certain portions of the host kernel and operating system instance), allow for only a single OS to be managed (albeit Linux on Linux), have better hardware utilization, and have quicker launch times than VMs. Scott Lowe has a good series of writeups on containers and Docker on his blog. Also, Greg Ferro has a pretty detailed first pass on Docker. Nuage CTO Dimitri Stiliadis explained how containers are changing the game as short-lived application workloads are becoming increasingly prevalent. The advantage that Docker brings, as he explained, is the ability to move the processing to the data rather than the other way round. Whereas typically you’d see no more than 40-50 VMs on a physical server, the Nuage demo had 500 Docker container instances per server. So there were 20,000 container instances total. And they showed how to bring them up along with 7 ACLs per container instance (140K ACLs total) in just 8 minutes. That’s more than 40 container instances per second! For reference, in the demo, they used an AWS c3.4xlarge instance (which has 30GB memory) for the VSD, a c3.2xlarge for the VSC, and 40 c3.xlarge instances for the ‘hypervisors’ where the VRS agents ran. The Nuage solution was able to successfully respond to the rapid and dynamic connectivity requirements of containers. Moreover, since the VRS agent is at the process level (instead of the host level with VMs), it can implement policies with very fine-grained control. Really impressive demo.
  4. How easily can applications be designed?
    The Application Designer demo here showed how to bridge the gap between app developers and infrastructure teams by means of high level policies to make application deployment really easy. In Packet Pushers Show 203, Martin Casado and Tim Hinrichs discussed their work in OpenStack Congress, which attempts to formalize policy-based networking so that a Policy Controller can abstract high level, human-readable primitives (which could be HIPAA, PCI, or SOX as an example), and express them in a language to an SDN Controller. Nuage confirmed that they contribute to Congress. The Nuage demo defined application tiers and showed how to deploy a WordPress container application along with a backend database in seconds. Another demo integrated OpenStack Neutron with extensions. You can create templates to have multiple instantiations of the applications. Another truly remarkable demo.
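Going back to the scaling question (item 2 in the list), the quadratic growth of ACLs is easy to see with a toy count. This is only a back-of-the-envelope sketch under my own simplifying assumptions: one rule per web-server/database-server pair in the per-host model, versus one group-to-group policy plus one membership entry per server in the policy model. It is not how Nuage counts or stores its rules.

```python
def pairwise_rules(web_servers, db_servers):
    """Per-host model: one ACL entry for every web/db server pair."""
    return web_servers * db_servers

def group_based_entries(web_servers, db_servers, policies=1):
    """Policy model: a fixed number of group-to-group policies,
    plus one group-membership entry for each server."""
    return policies + web_servers + db_servers

# Doubling both tiers quadruples the pairwise rules,
# but only roughly doubles the group-based entries.
for n in (10, 20, 40):
    print(n, pairwise_rules(n, n), group_based_entries(n, n))
```

Adding a VM in the policy model costs one new membership entry, which matches the point above that each new VM is just a new ACE rather than a fresh batch of rules.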

To summarize, the Nuage solution seems pretty solid and embraces open standards, not for the sake of lip service, but to solve actual customer problems.

An afternoon with the inventor of Ethernet – Bob Metcalfe

Earlier this month I had an opportunity to attend a talk by the most well-known co-inventor of Ethernet – Bob Metcalfe. In May, the networking world celebrated the 40th anniversary of the invention of Ethernet at the Computer History Museum, where Metcalfe was honored and invited to speak to employees of the company that he had founded in 1979 – 3Com. Of course, HP acquired 3Com in 2010, so he had really come to HP to talk on the evolution of Ethernet as well as what has kept him busy the past 40 years.

Metcalfe began by stating that the design behind the Ethernet protocol he co-invented in 1973 has changed so significantly that he is often given far more credit than he deserves. Amusingly, however, he said he will not give that credit back. He believes that there is very little in common between the Ethernet standards we have today from the IEEE (Gigabit Ethernet, Ten Gigabit Ethernet, 40 Gigabit Ethernet, 100 Gigabit Ethernet) and the original 2.94 Mbps standard that he came up with, whose intent was to print 500 dots per inch at a speed of one page per minute. The day of 1 Terabit per second Ethernet is not far off, with the dependency being on the IEEE assessing the availability of components in a timeframe so that devices can be made economically.

Metcalfe spoke of the battles Ethernet had with Token Ring in the early days. Token Ring was heavily backed by IBM, but had rigid standards and was inflexible to the growing needs of the market. Ethernet, on the other hand, was able to spread widely because it constantly adapted, the prime example being the opening up of media support from thick-net coax to twisted pair cable. Ethernet also sought support for higher speeds, soon going from 10 Mbps to Fast Ethernet (100 Mbps), by which time Token Ring, with its support for just 4 Mbps and 16 Mbps, had proven obsolete. Another reason Ethernet thrived, he claimed, was that it was not designed to solve every problem. For example, in the ISO hierarchy, Ethernet does not address Security (mainly because he felt it was not appropriate to solve that problem at the hardware level). Of course, now it is standard design to be cross-checking the source address field. He then went off on a tangent about how he believes the Internet has an ideological problem in that anonymity is given a high priority, which is a mistake. Metcalfe feels the ability to have anonymity should exist, but not as the default.

He talked at length of the pervasiveness of Ethernet into various horizontals. For example, while Ethernet was designed to be a LAN in a building, it has also entered the WAN by killing SONET, an accomplishment he has taken significant pride in. SONET and T1 were both introduced by AT&T, the other ‘big bad corporation of the time’. At 1.544 Mbps, a T1 circuit was half the speed of the original 2.94 Mbps standard, and it was only a matter of time before future Ethernet standards would prevail despite the emergence of X.25, Frame Relay, and ATM. Today, across the WAN, Ethernet is represented as Carrier Ethernet in a $34 billion market of equipment. And of course Ethernet has also manifested itself wirelessly as WiFi.

Today, Bob Metcalfe is a Professor of Innovation at the University of Texas at Austin. Previously he was a Venture Capitalist at Polaris Ventures for 10 years. On the topic of innovation, he also spoke of a few models, including one he coined: the Doriot Ecology, named after Georges Doriot, one of the first modern VCs, from Harvard Business School. The premises of the Doriot Ecology are, briefly:

  • Startups out of research universities are the most effective at innovating. However, they also depend on funding agencies like NSF and DARPA, research professors, graduating students, scaling entrepreneurs, strategic partners (such as the large network vendors like HP), and early adopters.
  • Startups need partners to scale. Large companies need to practice open innovation, and be receptive to ideas that come from the outside.

He related a few other models with the way businesses were run in his days at 3Com and Xerox PARC:

  • Intrapreneurship – Here, innovation comes from inside the company. 3Com never had a research division, but tried to push its product groups out to find a prospect. In such cases the money-making groups often try to kill off the research group because it doesn’t generate any revenue and there is pressure at every budget cycle.
  • Spin-in, where efforts to come up with innovation are put up outside the company with the understanding that if it succeeds, it will come back to the parent company. We’ve seen that with Insieme.
  • Spin-outs, where the company has to decide whether it will be hostile to the spin-out or supportive. Metcalfe talked about how Xerox noticed that Adobe, Apple, Sun, 3Com were all exploiting technologies (such as the mouse and the GUI) that were developed at PARC. At that point Xerox started investing in their spin-outs rather than being hostile to them.

Metcalfe said corporate research has deteriorated a lot since his days and should not be reconstituted. While Xerox PARC is now known as PARC, it is nothing like it was forty years ago. For all its 25,000 employees, the now defunct Bell Labs ‘only’ had the transistor, Unix, the Princess telephone, and DWDM to show for it. (I’m not the first one to note that Bob Metcalfe tends to make controversial statements!) He argued that the only companies that can afford to undertake research (not development, but science) are monopolies. And as was seen at Xerox PARC, monopolies are the least motivated to scale up the technologies that they develop. Funds should not be put into corporate research labs or government research labs. In his opinion research should be left to research universities, such as UT Austin, Berkeley, and Stanford. Professors should be encouraged to start more companies.

He claimed to have a short attention span, which leads him to re-spin his career every 10 years, from 3Com to Venture Capitalist to Professor. Isn’t it fun going up a learning curve? He finished by saying he recently learned Python in a Massive Open Online Course (MOOC) that he took with his son and 60,000 other students (he got a 90, his son got a 70).

Challenges that Plexxi Faces

This week, the Lean Startup conference was held in San Francisco. The Lean Startup philosophy has its roots in the lean management method of manufacturing, with Kanban or Just-in-time processing at the center of the design principle. It basically advocates that startups minimize outside funding, not strive for the perfect product (think Minimum Viable Product), be flexible (think Pivot), and cater the product completely to the customer’s needs, all with the goal of being a highly efficient company.

However, as noted investor Marc Andreessen, who spoke at the conference, warns,

Not all startups can be Lean Startups

Indeed, some startups cannot afford to employ a Pivot. Infrastructure or hardware companies come to mind, especially when they’ve already taken $50 million of investment. This is what Plexxi has done, without a complete solution to show for it yet.

I was listening to the recent Packet Pushers show #126 sponsored by Plexxi. While their approach is a creative one, I’m not sure whether it is viable. In a nutshell, Plexxi brings optical technology, in the form of WDM, to the Data Center and flattens traditional hierarchical network designs. When I first started learning about network designs, the classical approach was the 3-tier Core-Distribution-Access model. In the mid-2000s this got reduced to a Collapsed Core. What Plexxi proposes is a flat topology, eliminating the need for Core switches in the Data Center.

Plexxi adopted the SDN approach of a programmable controller (a virtual appliance) that pushes policies to its switches. The policies are intended to optimize data path flow for affinitized traffic. Applications that are more sensitive to certain resources are classified into Affinity Networks. Some examples of the constraints or sensitivities that Plexxi’s Director of Product Management, Marten Terpstra, described include:

  • Hop-count
  • Bandwidth
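
To make the idea of an affinity concrete, here is a minimal Python sketch of choosing a path for one class of traffic under the two constraints listed above. Everything in it is an assumption for illustration: the `Affinity` class, the function names, and the model of the network as a simple ring with exactly two paths between any pair of switches. It is not Plexxi's controller logic or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affinity:
    """Hypothetical constraint set for one class of application traffic."""
    max_hops: int
    min_bandwidth_gbps: float


def ring_paths(n, src, dst):
    """Return the clockwise and counter-clockwise node paths on an n-node ring."""
    if src == dst:
        raise ValueError("src and dst must differ")
    clockwise = [src]
    while clockwise[-1] != dst:
        clockwise.append((clockwise[-1] + 1) % n)
    counter = [src]
    while counter[-1] != dst:
        counter.append((counter[-1] - 1) % n)
    return [clockwise, counter]


def path_bandwidth(path, link_bw):
    """Bottleneck bandwidth: the slowest link along the path."""
    return min(link_bw[frozenset(pair)] for pair in zip(path, path[1:]))


def pick_path(n, src, dst, link_bw, affinity):
    """Pick the fewest-hop path that satisfies the affinity, or None."""
    candidates = []
    for path in ring_paths(n, src, dst):
        hops = len(path) - 1
        bandwidth = path_bandwidth(path, link_bw)
        if hops <= affinity.max_hops and bandwidth >= affinity.min_bandwidth_gbps:
            # Sort key: fewest hops first, then highest bandwidth.
            candidates.append((hops, -bandwidth, path))
    return min(candidates)[2] if candidates else None


# Six-switch ring (assumes n >= 3) with 10 Gbps links, except a 1 Gbps link 0-1.
links = {frozenset((i, (i + 1) % 6)): 10.0 for i in range(6)}
links[frozenset((0, 1))] = 1.0

storage = Affinity(max_hops=4, min_bandwidth_gbps=5.0)
chosen = pick_path(6, 0, 2, links, storage)
```

In this toy ring the short way round from switch 0 to switch 2 crosses the slow link, so the bandwidth constraint pushes the traffic the long way round. Tightening `max_hops` to 2 leaves no path that satisfies both constraints.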

Plexxi switches use merchant silicon (Broadcom ASICs) to form an Ethernet ring on top of a WDM lightwave. By changing lambdas, Layer-1 connections between switches can be changed according to the application requirements.

Plexxi uses its own closed APIs for communication between its switch interfaces and its controller, in order to convey its message of affinities. However, it opens up its proprietary northbound API for user-to-controller communication so users can write scripts, for example, by using REST APIs. Interestingly, Plexxi is a member of the Open Networking Foundation. The Controller places TCAM entries in switches based on application requirements for affinitized traffic.

Terpstra discussed two use cases:

  1. Affinitized iSCSI traffic for the most bandwidth with the fewest hops
  2. Cloud provider – Use a Plexxi ring as a premium service to affinitize traffic.

In neither case are the results mentioned.

Okay, so far Plexxi’s solution is a 1 RU box that can prioritize traffic based on hop-count and bandwidth. I fail to see much of a business case there. Any network engineer worth his or her salt will tell you that there is more to traffic classification and prioritization than just hop-count and bandwidth. Financial trading institutions would be more concerned about latency guarantees. Hop Count alone is a flimsy criterion to classify important traffic, regardless of whether a cute term like Affinity Network is given to that classification. High Availability is a critical issue that a ring topology exacerbates. As Doug Gourlay of Arista mentions, unnecessary downtime is introduced any time you add new nodes because the ring is broken. Moreover, the network is reduced to a split brain model in the event of just two nodes going down. Depending on the Controller placement, this could have adverse outcomes. The thing about outages is that we can never control where they occur. As Gourlay rightly puts it:

I thought Token Ring died for good reasons… why is someone trying to bring it back?
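
The split-brain concern is easy to check on a toy model. The Python sketch below assumes a plain ring in which each switch connects only to its two neighbours, which is how the topology is described in this post rather than a statement about Plexxi's actual optical wiring. The function name `ring_components` is made up for the example.

```python
def ring_components(n, failed):
    """Group the surviving nodes of an n-node ring into connected components.

    Node i is linked only to (i - 1) % n and (i + 1) % n; failed nodes
    and their links are treated as gone.
    """
    failed = set(failed)
    seen = set()
    components = []
    for start in range(n):
        if start in failed or start in seen:
            continue
        # Depth-first walk around the ring from this surviving node.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in failed or node in component:
                continue
            component.add(node)
            stack.extend(((node + 1) % n, (node - 1) % n))
        seen |= component
        components.append(sorted(component))
    return components


one_down = ring_components(8, {3})      # ring degrades to a line, still connected
two_down = ring_components(8, {1, 5})   # two non-adjacent failures
```

With one node down the survivors still form a single component. With two non-adjacent nodes down the ring splits into two islands, and a controller sitting in one of them cannot reach the switches in the other.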

Getting back to the Lean Startup idea, Terpstra said “Our Layer-3 affinities are coming”. Plexxi is targeting Christmas 2012 for the 1.0 version of its Layer-3 capabilities. Until then Plexxi only has a Layer-2 switch with no quotable value to show for $50 million in investment. Not a good time to Pivot.

Reports of the death of the Core switch in the Data Center have been greatly exaggerated.