
One of the most common problems in capturing and analyzing packets is making sure that the capture system can keep up with the flood of traffic. It’s easy to end up with a system where the network interface, processor, or disk limits how many packets can be processed per second, leading to packet loss. We’ve talked about the general process of speeding that up (see “Improving Packet Capture Performance”). In that blog I mentioned discarding high volume flows as a way to avoid those problems, but never explained how to find them.

The goal is to find one or more traffic types that 1) have lots of packets and/or lots of combined bytes in those packets, 2) are limited to a small number of ports and IP addresses, and 3) are trusted, meaning very unlikely to have anything malicious inside. Once we’ve found these, we put together a BPF (a “Berkeley Packet Filter”), an expression that specifically describes this traffic so that we can tell the capture library to discard it. This may end up discarding between approximately 10% and 80% of the raw packets, leaving our capture and analysis tools able to keep up with the remaining 90% to 20% of the incoming stream.

If I have two MySQL database servers, a primary (7.8.9.10) and a secondary (14.15.16.17), and have a lot of replication traffic going from the primary to the secondary, I can use this BPF to describe this traffic:

    '(host 7.8.9.10 and host 14.15.16.17 and tcp port 3306)'

I can use that BPF in almost any application that captures packets to show me only this replication traffic, like:

    tcpdump -i eth0 -qtnp '(host 7.8.9.10 and host 14.15.16.17 and tcp port 3306)'

If I want to see everything except that replication traffic, I invert it with “not” (a “discard” filter):

    tcpdump -i eth0 -qtnp 'not (host 7.8.9.10 and host 14.15.16.17 and tcp port 3306)'

This is exactly what we’re aiming to do: identify high volume traffic that we no longer want to capture and analyze, and discard it, helping our capture and analysis tools keep up.

Locating these flows isn’t too difficult if you have the right tool. I’m going to recommend our pcap_stats.
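Since the “keep” and “discard” filters for the replication example differ only by a leading “not”, both can be generated from the same host list and port. Here is a minimal Python sketch of that idea. The helper names (keep_filter, discard_filter, combine_discards) are my own for illustration and are not part of tcpdump, libpcap, or pcap_stats; the code only builds filter strings, and you would still pass the result to your capture tool yourself.

```python
def keep_filter(hosts, port, proto="tcp"):
    """Build a BPF string matching traffic that involves every listed host
    on the given port, e.g. the two-server replication flow."""
    terms = [f"host {h}" for h in hosts] + [f"{proto} port {port}"]
    return "(" + " and ".join(terms) + ")"


def discard_filter(keep):
    """Invert a keep filter so the capture tool sees everything except it."""
    return f"not {keep}"


def combine_discards(keeps):
    """Discard several trusted flows at once: not (A or B or ...)."""
    return "not (" + " or ".join(keeps) + ")"


if __name__ == "__main__":
    # Addresses and port taken from the MySQL replication example.
    keep = keep_filter(["7.8.9.10", "14.15.16.17"], 3306)
    print(keep)                  # filter that shows only the replication traffic
    print(discard_filter(keep))  # filter that shows everything else
```

The printed strings can be quoted and handed to tcpdump in place of the hand-written filters. When more than one trusted high volume flow has been identified, combine_discards joins their keep filters with “or” inside a single “not”, so one expression drops them all.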
