VMware Project Aegis

I spent the summer of 2017 as an R&D intern in VMware's Networking and Security group. Working in a team of three, I developed Project Aegis, a machine-learning-powered approach to firewall policy. We filed two patents for our work and presented our paper at RADIO, the company's exclusive internal research conference.

Motivation

Grouping together nodes that perform similar functions within a datacenter is a hard problem. It is also an important part of security policy enforcement, and bad clustering leads to a whole range of security problems. Today, even for large-scale deployments, network administrators create these groups manually:

  1. Using existing knowledge of the applications and network topology
  2. By the tier (web, app or DB)
  3. By the application or service ports open on them

None of these methods is foolproof, and all of them are painstaking and slow. Our goal was to design a machine learning platform that would ingest network flow data and autonomously classify workloads into groups based on their traffic patterns. Network administrators can then assign labels to these clusters and use them as building blocks for microsegmented security policy.

Implementation

A modified version of Latent Dirichlet Allocation (LDA) topic modelling formed the basis of our approach. LDA is unsupervised, so it needs no labelled training data, and it can run in real time on distributed compute, which made it a good fit for our problem.

The platform that runs our algorithm was designed to be as general as possible, so that future ML projects could benefit from it as well. An IPFIX collector would receive flow data in real time from the NSX Manager. We used Spark, Kafka, Hadoop, and Airflow for the analysis pipeline.
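
To make the topic-modelling idea concrete, here is a minimal sketch in plain Python. It is not the modified algorithm we built, and it does not use the Spark pipeline. It runs standard LDA with collapsed Gibbs sampling on a toy dataset, under the assumption that each workload is treated as a "document" and each destination port it talks to as a "word". The flow records, workload names, hyperparameters, and function names are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

# Hypothetical flow records: (source workload, destination port).
# Names and ports are invented for illustration only.
FLOWS = [
    ("web-1", 443), ("web-1", 80), ("web-1", 443), ("web-1", 8080),
    ("web-2", 443), ("web-2", 80), ("web-2", 80), ("web-2", 8080),
    ("db-1", 3306), ("db-1", 3306), ("db-1", 5432), ("db-1", 3306),
    ("db-2", 5432), ("db-2", 3306), ("db-2", 5432), ("db-2", 5432),
]


def build_documents(flows):
    """Group flows by workload: each workload becomes a 'document' of port 'words'."""
    docs = {}
    for workload, port in flows:
        docs.setdefault(workload, []).append(port)
    return docs


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Standard LDA via collapsed Gibbs sampling; returns per-workload topic mixtures."""
    rng = random.Random(seed)
    vocab_size = len({w for words in docs.values() for w in words})
    doc_topic = {d: [0] * num_topics for d in docs}      # tokens per topic, per workload
    topic_word = [Counter() for _ in range(num_topics)]  # port counts per topic
    topic_total = [0] * num_topics                       # total tokens per topic
    assignments = {}

    # Start from a random topic assignment for every token.
    for d, words in docs.items():
        assignments[d] = []
        for w in words:
            k = rng.randrange(num_topics)
            assignments[d].append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(iterations):
        for d, words in docs.items():
            for i, w in enumerate(words):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smooth and normalise the counts into a topic mixture per workload.
    mixtures = {}
    for d, counts in doc_topic.items():
        total = sum(counts) + num_topics * alpha
        mixtures[d] = [(c + alpha) / total for c in counts]
    return mixtures


def dominant_topic(mixture):
    """Index of the highest-weight topic, used here as the workload's group."""
    return max(range(len(mixture)), key=mixture.__getitem__)


docs = build_documents(FLOWS)
mixtures = lda_gibbs(docs)
for workload, mixture in sorted(mixtures.items()):
    print(workload, dominant_topic(mixture), [round(p, 2) for p in mixture])
```

Workloads whose mixtures are dominated by the same topic would be candidates for the same security group, which an administrator could then label. On real flow data the "words" would likely need to be richer than a bare port number (for example protocol and direction), and the sampler would be replaced by a distributed implementation.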