News (April 2012): New version based on YAGO 2, in KIF as well as TPTP formats!
The YAGO-SUMO integration incorporates millions of entities from YAGO, which is based on Wikipedia and WordNet, into the Suggested Upper Merged Ontology (SUMO), a highly axiomatized formal upper ontology. With the combined force of the two ontologies, an enormous, unprecedented corpus of formalized world knowledge is available for automated processing and reasoning, providing information about millions of entities such as people, cities, organizations, and companies.
Compared to the original YAGO, more advanced reasoning is possible due to the axiomatic knowledge delivered by SUMO. A reasoner can conclude e.g. that a child of a human must also be a human and cannot be born before its parents, or that two people sharing the same parents must be siblings.
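As a rough illustration of the kind of conclusion described above, the following toy sketch derives sibling pairs from parent facts. It is not how SUMO or a theorem prover works internally: a real reasoner applies SUMO's first-order axioms, whereas this sketch hard-codes a single rule. The names, the `parent_of` data, and the helper functions are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy facts as (child, parent) pairs; not taken from YAGO-SUMO.
parent_of = {
    ("Alice", "Carol"), ("Alice", "Dave"),
    ("Bob", "Carol"), ("Bob", "Dave"),
    ("Eve", "Carol"), ("Eve", "Frank"),
}

def parents(person, facts):
    """Return the set of known parents of a person."""
    return {p for (c, p) in facts if c == person}

def infer_siblings(facts):
    """Derive sibling pairs: two distinct people with identical, non-empty parent sets."""
    children = sorted({c for (c, _) in facts})
    result = set()
    for a, b in combinations(children, 2):
        pa, pb = parents(a, facts), parents(b, facts)
        if pa and pa == pb:
            result.add((a, b))
    return result

siblings = infer_siblings(parent_of)
print(siblings)
```

Here Alice and Bob share both parents and are reported as siblings, while Eve shares only one parent with them and is not.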
Copyright (c) 2008-2012 Gerard de Melo, Fabian Suchanek, Adam Pease
Available under the terms of the Creative Commons Attribution 3.0 License (CC BY 3.0).
The merge is based on the following sources:
The archives contain a number of fact files in the facts subdirectory, either in the KIF format used by SUMO or in the TPTP format used by many theorem provers. Among these, the type* files contain the class hierarchy and type information.
In order to be useful, these axioms need to be used in conjunction with SUMO's upper-level axioms.
UTF-8 is used as the text encoding.
Direct mappings between YAGO and SUMO have also been released as part of the OWL version of SUMO; however, these cover only a tiny fraction of what is available in the YAGO-SUMO download. Save the SUMO.owl file to your local file system rather than opening it in a browser, since the file is large.
For academic use, please cite the following publication:
Gerard de Melo, Fabian Suchanek and Adam Pease (2008). Integrating YAGO into the Suggested Upper Merged Ontology. In Proceedings of the 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2008). IEEE Computer Society, Los Alamitos, CA, USA. (BibTeX)
Detailed information about this project is available in our technical report (PDF).
How do KIF and TPTP compare to OWL?
KIF and TPTP are based on first-order and higher-order logic, and thus allow for a much richer representation of reality than OWL. For example, even quite simple axioms, e.g. that sisters of any of your parents are your aunts, cannot be expressed in OWL. SUO-KIF is a specific variant of KIF.
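For illustration, the aunt example can be written as a first-order formula along the following lines. The predicate names are chosen here for readability and are not necessarily the relation names used in SUMO; parent(x, y) is read as "y is a parent of x".

```latex
\forall x\, \forall y\, \forall z\;
  \bigl( \mathit{parent}(x, y) \land \mathit{sister}(z, y)
         \rightarrow \mathit{aunt}(z, x) \bigr)
```

The formula relates three individuals through two different relations and concludes a third relation between two of them, which is the pattern the answer above refers to.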
How can I work with KIF data?
The open source project SIGMA KEE includes a KIF parser written in Java. You do not need to set up the entire software package on your system. It suffices to import the source code into your project.
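For quick inspection outside Java, KIF's parenthesized syntax can also be read with a few lines of code. The following is a minimal sketch, not the SIGMA KEE parser: it assumes `;` starts a comment, that quoted strings do not span several lines, and it returns every statement as a nested list of string tokens. The sample facts and the function names are hypothetical.

```python
import re

# Tokens: parentheses, double-quoted strings (with backslash escapes), or bare symbols.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')

def strip_comment(line):
    """Drop a ';' comment, ignoring semicolons inside double-quoted strings."""
    in_string = False
    escaped = False
    for i, ch in enumerate(line):
        if escaped:
            escaped = False
        elif ch == "\\" and in_string:
            escaped = True
        elif ch == '"':
            in_string = not in_string
        elif ch == ";" and not in_string:
            return line[:i]
    return line

def parse_kif(text):
    """Parse KIF text into nested Python lists of string tokens."""
    tokens = []
    for line in text.splitlines():
        tokens.extend(TOKEN.findall(strip_comment(line)))
    stack, top = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(top)
            top = []
        elif tok == ")":
            if not stack:
                raise ValueError("unbalanced ')'")
            done, top = top, stack.pop()
            top.append(done)
        else:
            top.append(tok)
    if stack:
        raise ValueError("unbalanced '('")
    return top

# Hypothetical facts, for illustration only.
sample = '''
; a comment line
(instance Example_City City)
(documentation Example_City EnglishLanguage "A made-up city; not real.")
'''
facts = parse_kif(sample)
for fact in facts:
    print(fact)
```

To read one of the fact files instead of the inline sample, open it with `open(path, encoding="utf-8")`, since the files are UTF-8 encoded, and pass the contents to `parse_kif`. For very large files, a streaming variant that processes one statement at a time would be preferable to loading everything into memory.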
How can I work with TPTP data?
The TPTP home page links to several parsers written in Java, C++, and other languages. Additionally, most theorem provers natively support the TPTP format. Examples include Vampire, SPASS-XDB, and SPASS.
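If you only need to skim TPTP files rather than reason over them, a small script can pull out the name, role, and formula of each annotated formula. The sketch below assumes one formula per line in the generic `fof(name, role, formula).` shape, unquoted names, and `%` comments; it does not interpret the formula itself, and the sample content is hypothetical rather than taken from the YAGO-SUMO files. For anything beyond this, one of the parsers linked from the TPTP home page is the safer choice.

```python
import re

# Matches a single-line annotated formula: language(name, role, formula).
ANNOTATED = re.compile(
    r"^(fof|cnf|tff|thf)\(\s*([^,]+?)\s*,\s*([a-z_]+)\s*,\s*(.*)\)\s*\.\s*$"
)

def read_tptp_lines(text):
    """Return a list of dicts describing each single-line annotated formula."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        match = ANNOTATED.match(line)
        if match:
            language, name, role, formula = match.groups()
            entries.append({
                "language": language,
                "name": name,
                "role": role,
                "formula": formula.strip(),
            })
    return entries

# Hypothetical example input, for illustration only.
sample = """
% a comment line
fof(example_1, axiom, instance(example_city, city)).
fof(example_2, axiom, ![X]: (city(X) => region(X))).
"""
formulas = read_tptp_lines(sample)
for entry in formulas:
    print(entry["name"], entry["role"], entry["formula"])
```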
How does SUMO compare to OpenCyc?
OpenCyc does not contain the axiomatic rules of the large Cyc knowledge base, which is a commercial product that is not available for download. So, while OpenCyc is more like a taxonomy, SUMO and the YAGO-SUMO merge additionally offer a wealth of axiomatic knowledge, e.g. that two people sharing the same parents must be siblings.
How does YAGO compare to DBpedia?
YAGO and DBpedia appeared at about the same time and have somewhat similar goals. However, while YAGO focuses on providing ontological classes for every entity, DBpedia only provides links to its ontology for those entities that have an infobox on their Wikipedia page (so most entities do not have an ontological type in the DBpedia Ontology). The good news is that the two resources are compatible and can be used simultaneously.
Please get in touch with Gerard de Melo if you have further questions.