Hi, this is George Jones. I was the conference chair of the 10th annual FloCon, held in Charleston, South Carolina, January 13-16, 2014. Check out the FloCon proceedings to learn about the work presented, and consider participating in future FloCons.
For those not familiar with it, network flow (a.k.a. "flow") is a collection of "call records" for the Internet (i.e., who talked to whom, when, and for how long). A flow is defined by the "5-tuple" of source and destination IP addresses, source and destination ports, and protocol. So, for example, a connection from a web client to a web server might look like the following:
        sIP|           dIP|sPort|dPort|pro| packets| bytes| flags
192.168.1.2|184.108.40.206|60790|   80|  6|      92|  6107|  SRPA
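To make the 5-tuple concrete, here is a minimal Python sketch that splits a pipe-delimited record like the one above into named fields and pulls out the five identifying values. The `FlowRecord` type and the `parse_flow_line` and `five_tuple` helpers are hypothetical names chosen for illustration; they are not part of SiLK, argus, or any other flow tool.

```python
from typing import NamedTuple, Tuple


class FlowRecord(NamedTuple):
    """One flow record with the eight columns shown in the example."""
    sip: str
    dip: str
    sport: int
    dport: int
    proto: int
    packets: int
    nbytes: int
    flags: str


def parse_flow_line(line: str) -> FlowRecord:
    """Parse one pipe-delimited text record (hypothetical helper)."""
    # Drop an optional trailing delimiter, then trim padding around each field.
    fields = [f.strip() for f in line.strip().rstrip("|").split("|")]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    sip, dip, sport, dport, proto, packets, nbytes, flags = fields
    return FlowRecord(sip, dip, int(sport), int(dport), int(proto),
                      int(packets), int(nbytes), flags)


def five_tuple(rec: FlowRecord) -> Tuple[str, str, int, int, int]:
    """Return the 5-tuple that identifies the flow."""
    return (rec.sip, rec.dip, rec.sport, rec.dport, rec.proto)


record = parse_flow_line(
    "192.168.1.2| 184.108.40.206|60790| 80| 6| 92| 6107| SRPA"
)
print(five_tuple(record))
```

Everything outside the 5-tuple (packet count, byte count, TCP flags) describes what happened on the flow rather than identifying it. Protocol 6 is TCP, and destination port 80 is what marks this as web traffic.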
The SiLK and argus tool suites are examples of tools that can be used for doing analysis with flow. The IETF created a standardized version of flow called IPFIX, but there are many other versions of flow.
Compared to other available data sources, flow has several advantages for network analysis:
- small size per record (SiLK records are 56 bytes and compress to about 21 bytes.)
- longer retention times enabled by the smaller size
- ubiquitous ability to generate flow records (Nearly all enterprise-grade network gear can produce flow records. If flow generation is enabled and data moves, there is a record.)
- privacy and anonymity (Network flow does not contain information such as names, social security numbers, or URLs of specific web pages. As such, it has fewer privacy issues.)
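As a rough illustration of how record size translates into retention time, the sketch below does the arithmetic using the 56-byte and 21-byte figures quoted above. The flow rate (10,000 flows per second) and the storage budget (10 TB) are made-up assumptions, not measurements from any real network, and the function names are hypothetical.

```python
RAW_RECORD_BYTES = 56         # SiLK record size quoted above
COMPRESSED_RECORD_BYTES = 21  # approximate compressed size quoted above
SECONDS_PER_DAY = 86_400


def bytes_per_day(flows_per_second: int, record_bytes: int) -> int:
    """Storage consumed per day at a steady flow rate."""
    return flows_per_second * SECONDS_PER_DAY * record_bytes


def retention_days(storage_bytes: int, flows_per_second: int,
                   record_bytes: int) -> float:
    """How many days of flow fit in a fixed storage budget."""
    return storage_bytes / bytes_per_day(flows_per_second, record_bytes)


# Hypothetical inputs, chosen only to make the arithmetic concrete.
flows_per_second = 10_000
storage_bytes = 10 * 10**12  # 10 TB

for label, size in (("raw", RAW_RECORD_BYTES),
                    ("compressed", COMPRESSED_RECORD_BYTES)):
    days = retention_days(storage_bytes, flows_per_second, size)
    print(f"{label}: {days:.0f} days")
```

Under these assumptions the compressed records last 56/21 times as long (roughly 2.7 times) as the raw ones on the same disk.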
Over the years, FloCon has evolved. The focus has always been on the operational use and analysis of network flow data with a heavy emphasis on security. General topics have included flow data collection, storage, analysis, and visualization. Conference themes have included community building, flow as a study, beaconing and distributed threats, the practical use of flow, flow in the context of other data, learning about your network, progression of analytics from ideas to prototypes to tools, and analysis at scale and perspectives.
It soon became clear that while a lot can be gleaned from the sparse data available from flow ("Exactly when did 1.5 terabytes of data leave our network headed for Antarctica?"), even more value can be gained by integrating flow with other data, such as DNS ("What is the name associated with 220.127.116.11? When was it queried? When did it change?"), WHOIS ("Who registered a domain name, autonomous system, etc.?"), geolocation data, full packet capture (pcap), routing information, logs, IDS alerts, and others.
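The DNS questions above amount to a time-aware lookup: given an address and the time a flow was seen, which name was associated with that address then? The following Python sketch shows one way such an enrichment could work. The `dns_history` table, its example names and dates, and the `name_at` helper are all hypothetical; a real deployment would draw on actual DNS logs or a passive DNS source.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical DNS observations: address -> list of (first_seen, name),
# sorted by first_seen. The names and dates are invented for illustration.
DnsHistory = Dict[str, List[Tuple[datetime, str]]]

dns_history: DnsHistory = {
    "220.127.116.11": [
        (datetime(2014, 1, 1), "old.example.org"),
        (datetime(2014, 1, 10), "new.example.org"),
    ],
}


def name_at(history: DnsHistory, ip: str, when: datetime) -> Optional[str]:
    """Return the most recent name observed for ip at or before `when`."""
    observations = history.get(ip, [])
    first_seen_times = [seen for seen, _ in observations]
    # Index of the last observation whose first_seen is <= when.
    idx = bisect_right(first_seen_times, when) - 1
    return observations[idx][1] if idx >= 0 else None


# Enrich a flow's destination address with the name in use at flow time.
flow_time = datetime(2014, 1, 13, 9, 30)
print(name_at(dns_history, "220.127.116.11", flow_time))
```

The flow record supplies the "what" and "when"; the lookup table supplies the name. Keeping the observation time alongside each name is what makes it possible to answer "When did it change?".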
Flow is good at answering "what" and "when" questions. Answering "who," "where," and more importantly "why" questions often requires integration with other data sources. For this reason, recent presentations have included talks on topics such as Security Onion, a tool that integrates flow and numerous other data sources.
It's not clear what the next 10 years will bring, but it is clear that flow, in its many forms, will remain an important part of analyzing network traffic. Check out the FloCon proceedings to get a clear perspective on where we've been, and consider attending (and submitting your own work) to see (and drive) where we're going.
The annual FloCon conference is hosted by the CERT Division, a part of the Software Engineering Institute at Carnegie Mellon University. Please join the FloCon LinkedIn Group to discuss FloCon or other flow-related issues.