Internet of Things

A Haystack Full of Needles: Scalable Detection of IoT Devices in the Wild

The number of IoT devices deployed within homes is increasing rapidly. It is estimated that the IoT population will increase to 20 billion by 2025. Such devices include virtual assistants, cameras, TVs, and smart home control devices. While users deploy some IoT devices explicitly, they are often unaware of the security threats and privacy consequences of using such devices. There have been notable large-scale, coordinated global attacks disrupting large service providers. In the Mirai DDoS attack, a record-breaking attack that crippled parts of the Internet, vulnerable IP cameras were exploited as the primary weapons. Thus, major Internet Service Providers (ISPs) are developing strategies for dealing with large-scale attacks from these devices. An important first step for an ISP to address the risks posed by these devices is to identify and locate them in the network. While some limited solutions exist, ISPs have to overcome multiple challenges to perform efficient and accurate device discovery in a large network with millions of subscribers and 100TB+ of daily traffic. Fortunately, many ISPs already collect sampled flow statistics for other operational purposes. A key question is whether device discovery can be done by ISPs that have access to only sampled flow data. In this project, we develop and evaluate a scalable methodology to accurately detect and monitor IoT devices at subscriber lines with limited, sparsely sampled data in the wild. In our methodology, we study the destinations and the Internet infrastructure supporting the IoT devices and generate traffic signatures that can be used to identify IoT devices hosted by a subscriber of a service provider. Our findings, summarised in [1], indicate that millions of IoT devices are detectable and identifiable within hours, at both a major ISP and an IXP, using passive, sparsely sampled network flow headers.
Our methodology detects devices from more than 77% of the studied IoT manufacturers, including popular devices such as smart speakers. While our methodology is effective for providing network analytics, it also highlights significant privacy consequences.
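
To make the idea of destination-based traffic signatures more concrete, the following is a minimal Python sketch and not the method published in [1]. It assumes that a signature is simply a set of (server IP, port) endpoints per device type and that a subscriber line is flagged once it has contacted a chosen fraction of those endpoints in the sampled flows. The names `SIGNATURES`, `detect_devices`, and `min_fraction`, the threshold value, and all addresses (taken from documentation ranges) are hypothetical.

```python
from collections import defaultdict

# Hypothetical signatures: device label -> set of (server IP, port) endpoints.
# The addresses are placeholders from documentation ranges, not real backends.
SIGNATURES = {
    "smart_speaker": {("192.0.2.10", 443), ("192.0.2.11", 443), ("198.51.100.5", 8883)},
    "ip_camera": {("203.0.113.7", 443), ("203.0.113.8", 1883)},
}


def detect_devices(flows, signatures, min_fraction=0.5):
    """Return {subscriber: set of device labels} from sampled flow headers.

    flows: iterable of (subscriber_id, dst_ip, dst_port) tuples.
    A device is reported when the subscriber contacted at least
    min_fraction of the endpoints in that device's signature
    (an assumed rule, chosen only for illustration).
    """
    # Collect the distinct destinations observed per subscriber line.
    seen = defaultdict(set)
    for subscriber, dst_ip, dst_port in flows:
        seen[subscriber].add((dst_ip, dst_port))

    detected = {}
    for subscriber, endpoints in seen.items():
        labels = {
            label
            for label, signature in signatures.items()
            if len(endpoints & signature) / len(signature) >= min_fraction
        }
        if labels:
            detected[subscriber] = labels
    return detected


sample_flows = [
    ("line-A", "192.0.2.10", 443),
    ("line-A", "198.51.100.5", 8883),
    ("line-A", "203.0.113.7", 443),
    ("line-B", "203.0.113.99", 443),
]
result = detect_devices(sample_flows, SIGNATURES)
```

Because sampling hides most packets, a fractional threshold rather than an exact match is used here so that a device can still be recognised when only some of its destinations appear in the sampled data.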

[1] S. J. Saidi, A. M. Mandalari, R. Kolcun, H. Haddadi, D. J. Dubois, D. Choffnes, G. Smaragdakis and A. Feldmann. A haystack full of needles: Scalable detection of IoT devices in the wild. In IMC’20, 20th ACM Internet Measurement Conference, Virtual Event, USA, 2020, pp. 87–100. ACM.

Deep Dive into the IoT Backend Ecosystem

Internet of Things (IoT) devices are becoming increasingly ubiquitous, e.g., at home, in enterprise environments, and in production lines. To support the advanced functionalities of IoT devices, IoT vendors as well as service and cloud companies operate IoT backends, which are the focus of this research project. In this project, we follow up on our previous work on detecting IoT devices in the wild and propose a methodology to identify and locate IoT backends by (a) compiling a list of domains used exclusively by major IoT backend providers and (b) then identifying their server IP addresses. Our methodology relies on a fusion of information from public documentation, passive DNS, and active measurements.
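
Steps (a) and (b) can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the project's actual pipeline: the domain list is modelled as per-provider DNS suffixes, the passive DNS data as plain (name, IP) pairs, and active measurements are left out. The provider names, suffixes, record values, and the helpers `matches_suffix` and `backend_ips` are all hypothetical.

```python
# Hypothetical per-provider domain suffixes, standing in for a list that
# would be compiled from public documentation.
BACKEND_SUFFIXES = {
    "provider-a": ("iot.provider-a.example",),
    "provider-b": ("mqtt.provider-b.example", "devices.provider-b.example"),
}


def matches_suffix(name, suffix):
    """True if name equals suffix or is a subdomain of it (label-aligned)."""
    name = name.rstrip(".").lower()
    return name == suffix or name.endswith("." + suffix)


def backend_ips(dns_records, suffixes):
    """Map each provider to the server IPs seen for its backend domains.

    dns_records: iterable of (queried_name, resolved_ip) pairs, an assumed
    simplification of passive DNS data.
    """
    result = {provider: set() for provider in suffixes}
    for name, ip in dns_records:
        for provider, provider_suffixes in suffixes.items():
            if any(matches_suffix(name, s) for s in provider_suffixes):
                result[provider].add(ip)
    return result


records = [
    ("eu-1.iot.provider-a.example.", "192.0.2.20"),
    ("MQTT.provider-b.example", "198.51.100.30"),
    ("www.provider-a.example", "203.0.113.40"),     # not a backend domain
    ("notiot.provider-a.example", "203.0.113.41"),  # suffix not label-aligned
]
ips = backend_ips(records, BACKEND_SUFFIXES)
```

Matching on whole DNS labels rather than raw string suffixes avoids attributing unrelated names such as `notiot.provider-a.example` to a backend.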

Investigators: Said Jawad Saidi, Oliver Gasser, Anja Feldmann in cooperation with Srdjan Matic (IMDEA Software Institute) and Georgios Smaragdakis (TU Delft)