Sneha Singhania receives PhD

On Tuesday, 24 February 2026, Sneha Singhania defended her thesis, titled “High-coverage Information Extraction from Web and Narrative Texts”. Since February 2020, she had been a PhD student in Computer Science at the Saarland Informatics Campus, Saarbrücken, and the Max Planck Institute for Informatics, under the supervision of Dr. Simon Razniewski and Prof. Gerhard Weikum, head of the department “Databases and Information Systems”. The doctoral degree is awarded by Saarland University.

Abstract of the thesis:

Information extraction (IE) transforms unstructured text into structured representations, such as subject-predicate-object triples. While prior IE methods have largely prioritized precision, many knowledge-intensive applications require high recall. This dissertation advances methods for high-recall IE across three settings. First, for web-scale documents, we introduce the task of predicting document-level information coverage for relation extraction and propose a lightweight classifier that prioritizes high-recall documents under budget constraints. We further present a framework for extracting temporally grounded OpenIE propositions from evolving documents and integrating them into retrieval-augmented generation. Second, for parametric LLM knowledge, we study the multi-valued slot-filling task and formulate extraction as a rank-then-select task using predicate-specific prompting. Third, for long-form narrative texts, we introduce a two-stage framework that combines recall-oriented generation with precision-oriented scrutinization to extract long object lists. Together, by rethinking the interplay between retrieval and extraction, this thesis advances the state of high-recall IE.