Earlier in my career, I worked as a Network Engineer in the high-frequency trading industry at a capital market exchange. It was the time when electronic trading was gaining heavy momentum as open outcry was receding. This was thanks in large part to vendors such as Arista, who leveraged merchant silicon from Broadcom to lead the charge in low-latency networking.
Scores of trading firms would set up their equipment in one of the exchange’s many data centers inside the building to practice latency arbitrage. Speed was the name of the game, and livelihoods hinged on the network’s ability to pass packets as quickly as possible.
In the early days, any time there was a significant delay (as little as 1-2 seconds), the exchange would get hit with hefty fines. However, if we could prove that the application, rather than the network, had caused a trade to execute slowly, then we were off the hook. So my team invested in several network taps and sniffers from NETSCOUT and Gigamon to perform forensic analysis on these low-latency, high-throughput financial systems.
But there were never enough taps. Taps allowed us to pinpoint the location and cause of delays and retransmissions only if we were lucky enough to have placed them at the exact spot in the network where the delay was incurred. It was like playing a game of whack-a-mole. Providing evidential data was a nightmare in those days. There was so little visibility.
Did I mention we owned the entire network?
Fast forward to today's public clouds, which are complete black boxes. They provide very little visibility, and the network has had no way to prove it is not at fault, because no tool could extract meaningful data until Aviatrix CoPilot came along. CoPilot could already display NetFlow records to provide such empirical data. Take this screenshot as an example.
If I were to see a flow with a few SYNs coming in, for example, I could use that information to ask the Application team whether everything is okay on their end. Or if I see a SYN followed immediately by a RST, that might point in the direction of a firewall blocking something. Or maybe if PSH packets are going through fine and data is being passed for a while, it might be another indication of the network doing its job and the application developer needing to be pulled in. It’s a very powerful feature.
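To make that reasoning concrete, here is a minimal Python sketch of the triage heuristics described above. It is only an illustration of how I read the flags, not how CoPilot works internally: the function name `triage_flow`, the flag-string format (for example "SYN", "SYN-ACK", "PSH-ACK"), and the returned hints are all assumptions made up for this example.

```python
from typing import List, Sequence, Set


def triage_flow(flags: Sequence[str]) -> str:
    """Return a rough troubleshooting hint for one flow.

    `flags` is the ordered list of TCP flag combinations seen in the flow,
    written as hyphen-separated strings such as "SYN" or "PSH-ACK".
    This input format is hypothetical, chosen for readability.
    """
    # Normalize each packet's flags into a set, e.g. "psh-ack" -> {"PSH", "ACK"}.
    packets: List[Set[str]] = [set(f.upper().split("-")) for f in flags]

    if not packets:
        return "no packets observed"

    # A bare SYN answered immediately by a RST: something refused the
    # connection, which might point toward a firewall.
    for first, second in zip(packets, packets[1:]):
        if first == {"SYN"} and "RST" in second:
            return "possible firewall block"

    # Nothing but bare SYNs: the far end never answered, so ask the
    # application team whether everything is okay on their side.
    if all(pkt == {"SYN"} for pkt in packets):
        return "check with application team"

    # PSH packets mean data is being passed, so the network is likely
    # doing its job and the application developer should be pulled in.
    if any("PSH" in pkt for pkt in packets):
        return "network passing data; involve application developer"

    return "inconclusive"
```

Calling `triage_flow(["SYN", "RST-ACK"])` returns the firewall hint, while a full handshake followed by "PSH-ACK" packets returns the hint to involve the application developer. These are starting points for a conversation, not proof of fault.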
But with the new AppIQ feature released this week in CoPilot, visibility is taken to the next level. AppIQ allows you to generate a comprehensive report of latency, traffic, and performance monitoring data between any two cloud instances connected via your Aviatrix transit network, such as shown here with an SSH test.
Now you can see latencies on a hop-by-hop basis. The AWS us-east-1 (N. Virginia) and us-east-2 (Ohio) regions are about 12 ms apart on average. And each of those green links represents an encrypted tunnel.
End-to-end encryption in the cloud, with visibility: that’s what every network engineer dreams of having.