SOFIE: A Self-Organizing Framework for Information Extraction

SOFIE is a system for automated ontology extension. It parses natural language documents, extracts ontological facts from them and links these facts into an ontology. SOFIE reasons logically over both the existing knowledge and the newly extracted knowledge in order to disambiguate words to their most probable meaning, to reason about the meaning of text patterns and to take world knowledge axioms into account. This allows SOFIE to check the plausibility of hypotheses and to avoid inconsistencies with the ontology. The SOFIE framework thus combines pattern matching, word sense disambiguation and ontological reasoning in a single model.
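To make the idea of plausibility checking more concrete, here is a minimal Java sketch. It is not taken from the SOFIE source code and is far simpler than the reasoning SOFIE performs. It assumes a toy ontology of (subject, relation, object) facts and a single kind of world knowledge axiom: a relation can be declared functional, meaning a subject has at most one object for it. The class name PlausibilitySketch, its methods and the example facts are all hypothetical and chosen for illustration only.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not SOFIE's actual data structures.
class PlausibilitySketch {
    // Known objects for each (subject, relation) pair.
    private final Map<String, Set<String>> objectsByKey = new HashMap<>();
    // Relations that allow at most one object per subject.
    private final Set<String> functionalRelations = new HashSet<>();

    private static String key(String subject, String relation) {
        return subject + "\t" + relation;
    }

    void addFact(String subject, String relation, String object) {
        objectsByKey.computeIfAbsent(key(subject, relation), k -> new HashSet<>()).add(object);
    }

    void declareFunctional(String relation) {
        functionalRelations.add(relation);
    }

    // A hypothesis is implausible if the relation is functional and the
    // ontology already holds a different object for the same subject.
    boolean isPlausible(String subject, String relation, String object) {
        Set<String> known = objectsByKey.get(key(subject, relation));
        if (known == null || !functionalRelations.contains(relation)) {
            return true;
        }
        return known.contains(object);
    }

    // Returns the first candidate entity for an ambiguous word that keeps
    // the ontology consistent, or null if every candidate is contradicted.
    String disambiguate(List<String> candidates, String relation, String object) {
        for (String candidate : candidates) {
            if (isPlausible(candidate, relation, object)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PlausibilitySketch ontology = new PlausibilitySketch();
        ontology.declareFunctional("bornIn");
        ontology.addFact("Alfred_Einstein", "bornIn", "Munich");

        // Text pattern "Einstein was born in Ulm": which Einstein is meant?
        String entity = ontology.disambiguate(
                List.of("Alfred_Einstein", "Albert_Einstein"), "bornIn", "Ulm");
        System.out.println(entity);
    }
}
```

In this sketch the ambiguous word is resolved to the candidate whose resulting fact does not contradict the functional axiom. A hard accept/reject rule like this is only an illustration of the principle; it ignores how likely each meaning is and how reliable a text pattern is.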

Recently, SOFIE has been extended to generalize textual patterns. This follow-up project is called PROSPERA. SOFIE and PROSPERA are part of the YAGO-NAGA project at the Max Planck Institute for Informatics in Saarbrücken, Germany.

Please find more information below.


Download the Java source code of SOFIE (CC-BY license).