FloSIS is a multi-10Gbps network flow capture system that supports real-time flow indexing for fast flow retrieval and flow-content deduplication for enhanced storage efficiency.
Network packet capture performs essential functions in modern network management such as attack analysis, network troubleshooting, and performance debugging. As network edge bandwidth now exceeds 10 Gbps, the demand for scalable packet capture and retrieval is rapidly increasing. However, existing software-based packet capture systems neither provide high performance nor support flow-level indexing for fast query response. As a result, they either fail to store important packets or retrieve relevant flows too slowly.
A research team led by Professor KyoungSoo Park and Professor Yung Yi of the School of Electrical Engineering at Korea Advanced Institute of Science and Technology (KAIST) has recently presented FloSIS, a highly scalable software-based network traffic capture system that supports efficient flow-level indexing for fast query response.
FloSIS is characterized by three key advantages. First, it achieves high-performance packet capture and disk writing by fully parallelizing the use of computing resources such as network cards, CPU cores, memory, and hard disks. It adopts the PacketShader I/O Engine (PSIO) for scalable packet capture and performs parallel disk writes for high-throughput flow dumping. To sustain high zero-drop performance, it strives to minimize fluctuations in packet processing latency.
Second, FloSIS generates two-stage flow-level indexes in real time to reduce the query response time. The indexing utilizes Bloom filters and sorted arrays to quickly reduce the search space of a query. The index is also designed to consume only a small amount of memory while allowing flexible queries with wildcards, ranges of connection tuples, and flow arrival times.
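The idea of pairing a Bloom filter with a sorted array can be illustrated with a minimal Python sketch. This is not FloSIS's actual index layout: the class names, the choice of source IP as the only key, the hash function, and the filter size are all assumptions made for illustration. The Bloom filter cheaply rules out batches that cannot contain the key, and a binary search over the sorted array then locates the matching entries.

```python
import bisect
import hashlib


class BloomFilter:
    """Small Bloom filter; the sizes here are illustrative assumptions."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key: bytes):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(key, digest_size=8, salt=bytes([i])).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(key))


class FlowIndex:
    """Hypothetical index for one batch of flows, keyed by source IP."""

    def __init__(self, entries):
        # entries: iterable of (src_ip, disk_offset) pairs
        self.sorted_entries = sorted(entries)
        self.bloom = BloomFilter()
        for ip, _ in self.sorted_entries:
            self.bloom.add(ip.encode())

    def lookup(self, ip: str):
        # Stage 1: the Bloom filter rejects most keys that are absent.
        if not self.bloom.might_contain(ip.encode()):
            return []
        # Stage 2: binary search in the sorted array for all matches.
        lo = bisect.bisect_left(self.sorted_entries, (ip, -1))
        hi = bisect.bisect_right(self.sorted_entries, (ip, float("inf")))
        return [offset for _, offset in self.sorted_entries[lo:hi]]


index = FlowIndex([("10.0.0.1", 0), ("10.0.0.2", 4096), ("10.0.0.1", 8192)])
print(index.lookup("10.0.0.1"))
print(index.lookup("192.168.1.1"))
```

A Bloom filter can return false positives but never false negatives, so the second stage is still needed to confirm a match. Range queries would additionally require keys that sort numerically, which this sketch does not attempt.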
Third, FloSIS supports flow-level content deduplication in real time for storage savings. Even with deduplication, the system still records the packet-level arrival time and headers to provide the exact timing and size information. For an HTTP connection, FloSIS parses the HTTP response header and body to maximize the hit rate of deduplication for HTTP objects.
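A minimal Python sketch of content deduplication follows. It is an illustration under stated assumptions, not the FloSIS implementation: the `DedupStore` class, the use of SHA-256 as the content fingerprint, and the in-memory lists standing in for disk storage are all hypothetical. It shows the property described above: duplicate content is stored once, while per-packet timing and size metadata is kept for every flow.

```python
import hashlib


class DedupStore:
    """Hypothetical store: unique content kept once, metadata kept per flow."""

    def __init__(self):
        self.table = {}     # content fingerprint -> content id
        self.contents = []  # unique payloads (stand-in for disk storage)
        self.flows = []     # per-flow records: (packet metadata, content id)

    def store_flow(self, packet_meta, payload: bytes) -> bool:
        """Store one flow; return True if its content was a duplicate."""
        fingerprint = hashlib.sha256(payload).digest()
        content_id = self.table.get(fingerprint)
        is_duplicate = content_id is not None
        if not is_duplicate:
            content_id = len(self.contents)
            self.contents.append(payload)
            self.table[fingerprint] = content_id
        # Packet arrival times and sizes are recorded even for duplicates.
        self.flows.append((list(packet_meta), content_id))
        return is_duplicate

    def stored_bytes(self) -> int:
        return sum(len(content) for content in self.contents)


store = DedupStore()
body = b"<html>same object fetched twice</html>"
store.store_flow([(0.001, 1460), (0.002, 512)], body)
store.store_flow([(5.310, 1460), (5.311, 512)], body)
print(len(store.flows), "flows,", store.stored_bytes(), "content bytes stored")
```

Fingerprinting the HTTP body separately from the response header, as the text describes, would raise the hit rate because headers such as dates differ between otherwise identical responses.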
These design choices bring substantial performance benefits. On a server machine with dual octa-core CPUs, four 10Gbps network interfaces, and 24 SATA disks, FloSIS achieves up to 30 Gbps for packet capture and disk writing without a single packet drop. Its indexes take up only 0.25% of the stored content while avoiding slow linear disk search and redundant disk access. On a machine with 24 hard disks of 3 TB each, this translates into 180 GB of indexes for 72 TB of total disk space, which could be managed entirely in memory or stored on solid-state disks for fast random access. Finally, on a 67 GB real traffic trace, FloSIS saves 34.5% of the storage space through deduplication, using only 256 MB of extra memory for the deduplication table. With real-time flow deduplication enabled, it achieves about 15 Gbps zero-drop throughput.
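The storage figures can be checked with a few lines of Python. The sketch assumes decimal units (1 TB = 1000 GB); the variable names are illustrative, and the deduplication savings in gigabytes is derived here from the stated percentage rather than quoted from the source.

```python
# Index size, using figures from the text (decimal units: 1 TB = 1000 GB).
NUM_DISKS = 24
DISK_TB = 3
total_gb = NUM_DISKS * DISK_TB * 1000   # 72 TB expressed in GB
index_gb = total_gb * 25 // 10_000      # 0.25% of the stored content

# Deduplication savings on the 67 GB trace.
TRACE_GB = 67
saved_gb = TRACE_GB * 345 / 1000        # 34.5% of the trace

print(f"total capacity: {total_gb // 1000} TB")
print(f"index size:     {index_gb} GB")
print(f"dedup savings:  {saved_gb:.1f} GB of {TRACE_GB} GB")
```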
This work was presented at the 2015 USENIX Annual Technical Conference (ATC) on July 10, 2015, in Santa Clara, California.
Materials provided by Korea Advanced Institute of Science and Technology. Note: Content may be edited for style and length.