Scaling Network Security Technologies by Decoupling Traffic Ingestion from Analysis Using Open-Source Software
Thanks to my good friend Marc Uchniat for helping develop this idea.
As networks grow larger and faster, monitoring technologies have a difficult time scaling. Most solutions involve splitting traffic from network tap infrastructure into smaller and smaller feeds and running analysis tools at line rate on each feed. If the traffic volume exceeds the processing capabilities of the machine, frames are dropped, flows cannot be properly analyzed, and threats may go undetected.
This proposal lets analysis occur asynchronously, so it can be scaled across any number of nodes running various software, with guaranteed delivery and processing of all observed flows. I’ve been writing the necessary components for this pipeline in Rust over the past few weeks, and as things come together they will be open-sourced under the GPL.
The first step is an extremely lightweight TCP flow reassembler built on bindings to AF_PACKET. Because it only reassembles flows, it can ingest traffic far more efficiently than systems that must also analyze that traffic. Once flows are reassembled, they are sent to an Apache Kafka pipeline on a topic that is subscribed to by various clients doing traffic analysis. As part of the pipeline, Elasticsearch or Stenographer might be used as a rolling buffer to retain a few hours of full packet data. For example, if an IDS generates an alert, the rolling buffer could be queried to produce a pcap containing the flow that triggered the alert, as well as related traffic, at an arbitrary granularity. That pcap could then be sent to long-term storage for an analyst to look at, without having to store more network traffic than necessary.
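To make the reassembly step more concrete, here is a minimal sketch of how a flow reassembler might put TCP segments back in order. It is not the code from rs-flow_reassembler: `FlowKey`, `Reassembler`, `push`, and `finish` are hypothetical names invented for illustration. It assumes IPv4, a SYN that carries no data, and no partially overlapping segments, and it leaves out the AF_PACKET capture and Kafka publishing steps entirely.

```rust
use std::collections::{BTreeMap, HashMap};
use std::net::Ipv4Addr;

/// Hypothetical flow key: one direction of a TCP connection.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FlowKey {
    pub src: (Ipv4Addr, u16),
    pub dst: (Ipv4Addr, u16),
}

/// Per-flow state: next expected sequence number, segments that arrived
/// early, and the in-order bytes assembled so far.
#[derive(Default)]
struct FlowState {
    next_seq: Option<u32>,
    pending: BTreeMap<u32, Vec<u8>>,
    assembled: Vec<u8>,
}

#[derive(Default)]
pub struct Reassembler {
    flows: HashMap<FlowKey, FlowState>,
}

impl Reassembler {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record one TCP segment for `key`.
    pub fn push(&mut self, key: FlowKey, seq: u32, syn: bool, payload: &[u8]) {
        let flow = self.flows.entry(key).or_default();
        if syn {
            // A SYN consumes one sequence number; data starts at seq + 1.
            flow.next_seq = Some(seq.wrapping_add(1));
            return;
        }
        if payload.is_empty() {
            return; // Pure ACKs carry nothing to reassemble.
        }
        // If the SYN was never seen, treat the first data segment as the start.
        let expected = *flow.next_seq.get_or_insert(seq);
        // Drop retransmissions of data that was already assembled
        // (wrapping-safe "seq is before expected" check).
        if (seq.wrapping_sub(expected) as i32) < 0 {
            return;
        }
        flow.pending.entry(seq).or_insert_with(|| payload.to_vec());
        // Move every segment that is now contiguous into the output.
        while let Some(next) = flow.next_seq {
            match flow.pending.remove(&next) {
                Some(data) => {
                    flow.next_seq = Some(next.wrapping_add(data.len() as u32));
                    flow.assembled.extend_from_slice(&data);
                }
                None => break,
            }
        }
    }

    /// Call on FIN or RST: forget the flow and return its in-order bytes.
    pub fn finish(&mut self, key: &FlowKey) -> Option<Vec<u8>> {
        self.flows.remove(key).map(|f| f.assembled)
    }
}
```

In a pipeline like the one described here, the bytes returned by `finish` would be what gets published to the Kafka topic. A real reassembler would also need to handle both directions of a connection, overlapping segments, and timeouts for flows that never close.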
Essentially, the Kafka infrastructure becomes a giant buffer for network traffic that provides guaranteed delivery. Clients and Kafka nodes can be added as needed to accommodate the workload, instead of building ever-larger individual systems tasked with analyzing traffic at line rate.
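As a rough illustration of that decoupling, the sketch below uses an in-memory channel as a stand-in for the Kafka topic, with a pool of worker threads playing the role of analysis clients. Everything in it is an assumption made for the example: `Flow`, `is_suspicious`, and `run_pipeline` are hypothetical names, and the "analysis" is a trivial byte-pattern match. A standard-library channel is neither durable nor distributed, so this only models a single group of consumers sharing the work, not Kafka's delivery guarantees.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a reassembled flow read from the topic.
pub struct Flow {
    pub id: u64,
    pub payload: Vec<u8>,
}

/// Hypothetical analysis step: flag payloads containing the bytes "evil".
fn is_suspicious(flow: &Flow) -> bool {
    flow.payload.windows(4).any(|w| w == b"evil")
}

/// Feed `flows` through a shared queue to `workers` analysis threads and
/// return the sorted IDs of the flows that raised an alert.
pub fn run_pipeline(flows: Vec<Flow>, workers: usize) -> Vec<u64> {
    // The flow channel plays the role of the buffer between ingestion
    // and analysis.
    let (flow_tx, flow_rx) = mpsc::channel::<Flow>();
    let flow_rx = Arc::new(Mutex::new(flow_rx));
    let (alert_tx, alert_rx) = mpsc::channel::<u64>();

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&flow_rx);
            let alerts = alert_tx.clone();
            thread::spawn(move || loop {
                // The lock is released at the end of this statement, so
                // analysis below runs in parallel across workers.
                let next = rx.lock().unwrap().recv();
                match next {
                    Ok(flow) => {
                        if is_suspicious(&flow) {
                            alerts.send(flow.id).unwrap();
                        }
                    }
                    // The ingest side hung up and the queue is drained.
                    Err(_) => break,
                }
            })
        })
        .collect();
    drop(alert_tx); // Only the workers hold alert senders now.

    // Ingestion: enqueue flows without waiting for analysis to keep up.
    for flow in flows {
        flow_tx.send(flow).unwrap();
    }
    drop(flow_tx); // Signal end of input.

    for handle in handles {
        handle.join().unwrap();
    }
    let mut alerts: Vec<u64> = alert_rx.iter().collect();
    alerts.sort_unstable();
    alerts
}
```

The point of the design is that the ingest loop never blocks on analysis: changing `workers` changes throughput on the analysis side without touching the capture side, which is the property the Kafka-based pipeline aims for across machines instead of threads.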
Related repositories (these are all VERY early-stage):
https://gitlab.com/DominoTree/rs-af_packet (Rust bindings for the Linux kernel AF_PACKET API)
https://gitlab.com/DominoTree/rs-flow_reassembler (Very basic TCP flow reassembler)
https://gitlab.com/DominoTree/rs-kafka_flow (Application to tie the previous two libraries together and send data to Kafka)
https://gitlab.com/DominoTree/rs-kafka_flow_replayer (Replays TCP flows from Kafka on a virtual network interface at an arbitrarily-specified rate)