A Haystack Full of Needles: Scalable Detection of IoT Devices in the Wild

Coordinator: Said Jawad Saidi

The number of IoT devices deployed within homes is increasing rapidly. It is estimated that the IoT population will reach 20 billion by 2025. Such devices include virtual assistants, cameras, TVs, and smart home control devices. While users deploy some IoT devices explicitly, they are often unaware of the security threats and privacy consequences of using them. Large-scale, coordinated global attacks have already disrupted large service providers: in the record-breaking Mirai DDoS attack, which crippled parts of the Internet, vulnerable IP cameras were exploited as the primary weapons. Major Internet Service Providers (ISPs) are therefore developing strategies for dealing with large-scale attacks launched from these devices. An important first step for an ISP in addressing the risks is to identify and locate the devices in its network.

While some limited solutions exist, ISPs have to overcome multiple challenges to perform efficient and accurate device discovery in a large network with millions of subscribers and more than 100 TB of daily traffic. Fortunately, many ISPs already collect sampled flow statistics for other operational purposes. A key question is whether device discovery can be done by ISPs that have access only to sampled flow data. In this project, we develop and evaluate a scalable methodology to accurately detect and monitor IoT devices at subscriber lines using limited, sparsely sampled data in the wild. We study the destinations contacted by IoT devices and the Internet infrastructure supporting them, and from these we generate traffic signatures that identify the IoT devices hosted by a subscriber of a service provider. Our findings, summarised in [1], indicate that millions of IoT devices are detectable and identifiable within hours, at both a major ISP and an Internet Exchange Point (IXP), using passive, sparsely sampled network flow headers. Our methodology detects devices from more than 77% of the studied IoT manufacturers, including popular devices such as smart speakers. While the methodology is effective for providing network analytics, it also highlights significant privacy consequences.
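
To make the idea of destination-based signatures concrete, the following is a minimal sketch in Python and not the implementation evaluated in [1]. It assumes that a signature is simply a set of (destination IP, destination port) pairs per device type, and that a subscriber line is flagged once its sampled flows cover a chosen fraction of that set. The names `Flow`, `detect_devices`, and `min_fraction`, the threshold value, and the documentation-range addresses are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple

Endpoint = Tuple[str, int]  # (destination IP, destination port)


class Flow(NamedTuple):
    """One sampled flow header (hypothetical, heavily simplified record)."""
    subscriber: str  # anonymised subscriber line identifier
    dst_ip: str
    dst_port: int


def detect_devices(
    flows: Iterable[Flow],
    signatures: Dict[str, Set[Endpoint]],
    min_fraction: float = 0.5,  # assumed threshold, not taken from [1]
) -> Dict[str, Set[str]]:
    """Map each subscriber to the device types whose signature it matches."""
    # Collect the distinct destinations observed per subscriber line.
    seen: Dict[str, Set[Endpoint]] = defaultdict(set)
    for flow in flows:
        seen[flow.subscriber].add((flow.dst_ip, flow.dst_port))

    detections: Dict[str, Set[str]] = {}
    for subscriber, endpoints in seen.items():
        # A device type matches when enough of its signature was observed.
        matched = {
            device
            for device, sig in signatures.items()
            if sig and len(sig & endpoints) / len(sig) >= min_fraction
        }
        if matched:
            detections[subscriber] = matched
    return detections


# Illustrative data only: addresses come from the documentation ranges.
signatures = {
    "smart_speaker": {("192.0.2.10", 443), ("192.0.2.11", 443)},
    "camera": {("198.51.100.5", 8883)},
}
flows = [
    Flow("line-A", "192.0.2.10", 443),
    Flow("line-A", "203.0.113.7", 80),
    Flow("line-B", "198.51.100.5", 8883),
    Flow("line-C", "203.0.113.7", 80),
]
result = detect_devices(flows, signatures)
```

Requiring only a fraction of the signature, rather than all of it, reflects the fact that sparse sampling will miss many flows; the appropriate threshold depends on the sampling rate and the observation window.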


[1] S. J. Saidi, A. M. Mandalari, R. Kolcun, H. Haddadi, D. J. Dubois, D. Choffnes, G. Smaragdakis and A. Feldmann. A haystack full of needles: Scalable detection of IoT devices in the wild. In IMC’20, 20th ACM Internet Measurement Conference, Virtual Event, USA, 2020, pp. 87–100. ACM.