@online{Saidi_arXiv2010.13120,
TITLE = {Exploring Network-Wide Flow Data with Flowyager},
AUTHOR = {Saidi, Said Jawad and Maghsoudlou, Aniss and Foucard, Damien and Smaragdakis, Georgios and Poese, Ingmar and Feldmann, Anja},
LANGUAGE = {eng},
URL = {https://arxiv.org/abs/2010.13120},
EPRINT = {2010.13120},
EPRINTTYPE = {arXiv},
YEAR = {2020},
ABSTRACT = {Many network operations, ranging from attack investigation and mitigation to
traffic management, require answering network-wide flow queries in seconds.
Although flow records are collected at each router, using available traffic
capture utilities, querying the resulting datasets from hundreds of routers
across sites and over time, remains a significant challenge due to the sheer
traffic volume and distributed nature of flow records.
In this paper, we investigate how to improve the response time for a priori
unknown network-wide queries. We present Flowyager, a system that is built on
top of existing traffic capture utilities. Flowyager generates and analyzes
tree data structures, that we call Flowtrees, which are succinct summaries of
the raw flow data available by capture utilities. Flowtrees are self-adjusted
data structures that drastically reduce space and transfer requirements, by 75\%
to 95\%, compared to raw flow records. Flowyager manages the storage and
transfers of Flowtrees, supports Flowtree operators, and provides a structured
query language for answering flow queries across sites and time periods. By
deploying a Flowyager prototype at both a large Internet Exchange Point and a
Tier-1 Internet Service Provider, we showcase its capabilities for networks
with hundreds of router interfaces. Our results show that the query response
time can be reduced by an order of magnitude when compared with alternative
data analytics platforms. Thus, Flowyager enables interactive network-wide
queries and offers unprecedented drill-down capabilities to, e.g., identify
DDoS culprits, pinpoint the involved sites, and determine the length of the
attack.},
}