Information extraction

Advanced lecture, 5-6 ECTS credits, winter semester 2019–20

Basic Information

This advanced lecture focuses on how to construct knowledge bases (KBs) using information extraction techniques. Topics include automated information extraction using patterns, supervised extractors, and open information extraction; infobox crawling; entity disambiguation and normalization; learning over knowledge bases; and the use of knowledge bases in question answering. We will also touch upon crowdsourced KB construction, evaluation measures, and some state-of-the-art knowledge bases. In the labs, participants will build their own knowledge base step by step for selected toy domains, using wikis from the Wikia fan community site as the source.

Tentative schedule

#  Tentative date  Lecture                                  Lab
1  15.10.          Introduction + Knowledge representation  Modelling, Wikia dataset familiarization
2  22.10.          Structured extraction                    Infobox + category extraction
3  12.11.          NER and NED                              NER + NED
4  19.11.          Text extraction                          Pattern learning, OpenIE
5  26.11.          Taxonomy induction                       Hearst patterns, WordNet linking
6  3.12.           Rule mining                              Exhaustive short rule evaluation
7  10.12.          Evaluation and use cases                 Crowdsourcing
8  14.1.           KB presentations                         -