Introducing ACE Cloud Operations

Aviatrix recently developed a new course in the Aviatrix Certified Engineer (ACE) program. Aviatrix Certified Engineer – Multi-Cloud Network Operations (or ACE Cloud Ops for short) is aimed at cloud operations practitioners who need to run, operate, and manage business-critical Day-2 workloads in the cloud.

The ACE program recently announced its 10,000th certified engineer. That’s a phenomenal achievement considering our stretch goal for 2020 was only 500. It’s amazing how COVID-19 has ended up expanding our reach to hundreds of students per week.

ACE Cloud Ops takes a unique view on operating cloud infrastructure, which is necessarily different from operating on-prem infrastructure.

Operations in the On-Prem World

In the on-prem world, enterprises own the underlay. They have full control over traffic patterns and have a familiar toolkit regardless of which vendor they use on-prem.

Of course, some tools, such as SNMP, have died away, but ICMP-based tools such as Ping and Traceroute are still going strong 40 years after RFC 792. IP doesn’t go away when you move to the cloud, and neither should the network engineering toolkit.

Key skills for Infrastructure Operations engineers include:

  • Hardware (knowledge of cables, transceivers, switches, routers, racks, real estate, physical security, power, cooling)
  • Layer 2 (Spanning Tree is the worst use of an Operations Engineer’s time)
  • OSPF, BGP
  • Repeatability achieved by scripting tools such as Expect (which is really screen-scraping), Shell, Perl, Python (still invaluable). This is not true automation.

Capacity planning in the on-prem world often involves ordering the right number of spares to plan for outages, so that there is some form of high availability, although it does result in higher RPOs and RTOs.

We all know the financial benefits (when done well) of moving apps to the cloud. But while the cloud offers great agility for Developers (you can spin up a database within minutes), networking has been slow to catch up. Moreover, as we see a rapid shift towards multi-cloud, Operations teams are left on their own without guidance.

Operations in the Cloud World

Operations engineers have a harder time doing their job because of the lack of toolsets afforded to them by Cloud Service Providers (CSPs). Each CSP has proprietary tools that are intended to keep their customers locked into their cloud. Moreover, networking is not a source of revenue for CSPs. They don’t make networking easy and their networking tools are, simply put, not enterprise-ready. 

For example, consider what it takes just to view a route table in Azure. An intuitive approach would be to list the routes from the VNet, or at least to find a direct link to them there. However, you would be mistaken to think that way.

Instead, buried in a list of connected devices in that VNet, you have to select the appropriate NIC, which may have an obscure ID.

Next, you have to select an even more obscure option called ‘Effective Routes’.

Only then can you see the routes.
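
For those who would rather script this than click through the portal, the same per-NIC lookup can be sketched in Python around the Azure CLI command `az network nic show-effective-route-table`. This is a minimal sketch, not an Aviatrix tool: the helper names (`effective_routes_command`, `format_routes`, `show_effective_routes`) are made up for illustration, and the JSON field names are assumed to match what the CLI returns for effective routes (`addressPrefix`, `nextHopType`, `nextHopIpAddress`, `state`). Note that you still have to know the NIC name, which is exactly the problem described above.

```python
import json
import subprocess


def effective_routes_command(resource_group: str, nic_name: str) -> list[str]:
    """Build the Azure CLI call that returns a NIC's effective routes as JSON."""
    return [
        "az", "network", "nic", "show-effective-route-table",
        "--resource-group", resource_group,
        "--name", nic_name,
        "--output", "json",
    ]


def format_routes(cli_json: str) -> list[str]:
    """Turn the CLI's JSON into one 'prefix  state  next hop' line per prefix."""
    lines = []
    for route in json.loads(cli_json).get("value", []):
        # Assumed shape: nextHopIpAddress and addressPrefix are lists.
        hop_ips = route.get("nextHopIpAddress") or []
        next_hop = route.get("nextHopType", "?")
        if hop_ips:
            next_hop += " " + ",".join(hop_ips)
        for prefix in route.get("addressPrefix", []):
            lines.append(f"{prefix:<18} {route.get('state', '?'):<8} {next_hop}")
    return lines


def show_effective_routes(resource_group: str, nic_name: str) -> None:
    """Run the CLI (requires a prior `az login`) and print the routes."""
    result = subprocess.run(
        effective_routes_command(resource_group, nic_name),
        capture_output=True, text=True, check=True,
    )
    print("\n".join(format_routes(result.stdout)))
```

Calling `show_effective_routes("my-rg", "my-nic")` with your own resource group and NIC name (both placeholders here) prints one line per prefix. The parsing is kept separate from the CLI call so it can be checked against saved output without touching Azure.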

It is a very clunky approach to what is a routine task in the on-prem world. Of course, the problem is compounded when the enterprise goes multi-cloud and has to deal with the oddities of each cloud. Each CSP abandons the familiar networking toolkit and offers its platform as a black box to Operations teams.

When moving to the cloud, an Operations Engineer must have these new skills at a minimum:

  • Agile mindset
  • Infrastructure as Code (read Terraform)
  • CI/CD
  • VCS

Capacity planning takes place with cloud-native principles, such as elasticity and auto-scaling. It requires a new way of thinking, not just for Developers, but also for Operations teams. 
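
To make the contrast with ordering spares concrete, here is a minimal sketch of a proportional scaling rule, the same shape as the one documented for the Kubernetes Horizontal Pod Autoscaler. It is an illustration only, not something taken from the ACE course: the function name, the parameters, and the default bounds are all assumptions.

```python
import math


def desired_instances(current: int, observed_load: float, target_load: float,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Scale the instance count in proportion to observed vs. target load.

    observed_load and target_load are per-instance averages in the same
    unit (for example, percent CPU). The result is clamped to the
    [min_instances, max_instances] range.
    """
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    raw = math.ceil(current * observed_load / target_load)
    return max(min_instances, min(max_instances, raw))


# Four instances averaging 90% CPU against a 60% target: scale out.
print(desired_instances(4, 90, 60))  # 6
# The same fleet averaging 30% CPU: scale in.
print(desired_instances(4, 30, 60))  # 2
```

Capacity becomes a policy (a target and a pair of bounds) rather than a purchase order, which is the change in thinking this section is pointing at.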

ACE Cloud Ops

The ACE Cloud Ops course better equips Cloud Operations teams to run a multi-cloud network in their daily jobs. It builds on the immensely popular ACE program with some of the most common use cases we see among our customers when operating in any cloud:

  • How to Ensure Business Continuity with an Enterprise-class Transit Solution
  • How to Strengthen Compliance and Audit Initiatives by providing Monitoring and Troubleshooting for Cloud Security Appliances
  • How to Efficiently Connect Remote Sites to Cloud
  • How to Improve your Cloud Egress Security posture
  • Best Practices for Platform Operations Management
  • DevOps for Network Engineers

There are also hands-on labs focused on break-fix scenarios that are based on this topology:

The source code of the Terraform that built this topology is here.

ACE Associate is a prerequisite for ACE Cloud Ops.

Submit interest for taking ACE Cloud Ops here.