YAGO-SUMO: A Large-Scale Formal Ontology

Developers: Gerard de Melo, Fabian Suchanek, Adam Pease


The YAGO-SUMO integration incorporates millions of entities from YAGO, which is based on Wikipedia and WordNet, into the Suggested Upper Merged Ontology (SUMO), a highly axiomatized formal upper ontology. With the combined force of the two ontologies, an enormous, unprecedented corpus of formalized world knowledge is available for automated processing and reasoning, providing information about millions of entities such as people, cities, organizations, and companies.

Compared to the original YAGO, more advanced reasoning is possible due to the axiomatic knowledge delivered by SUMO. A reasoner can conclude e.g. that a child of a human must also be a human and cannot be born before its parents, or that two people sharing the same parents must be siblings.
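The sibling example can be illustrated with a toy sketch. This is not how a theorem prover works internally and it does not use SUMO's actual axioms; the names and facts below are hypothetical, and the rule is hard-coded as "two distinct people with identical sets of known parents are siblings".

```python
from itertools import combinations

# Hypothetical toy facts: (parent, child) pairs, not taken from YAGO-SUMO.
PARENT_OF = {
    ("Alice", "Carol"),
    ("Bob", "Carol"),
    ("Alice", "Dave"),
    ("Bob", "Dave"),
    ("Alice", "Eve"),
}


def parents_by_child(parent_of):
    """Group the (parent, child) pairs into a child -> set-of-parents map."""
    result = {}
    for parent, child in parent_of:
        result.setdefault(child, set()).add(parent)
    return result


def infer_siblings(parent_of):
    """Return unordered pairs of distinct people whose parent sets are identical."""
    parents = parents_by_child(parent_of)
    siblings = set()
    # sorted() makes each pair come out in a stable (alphabetical) order.
    for a, b in combinations(sorted(parents), 2):
        if parents[a] == parents[b]:
            siblings.add((a, b))
    return siblings


SIBLINGS = infer_siblings(PARENT_OF)
print(SIBLINGS)
```

A real reasoner derives such conclusions from general axioms rather than from a rule written by hand for one relation, which is what the SUMO axioms make possible.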

License and Data Sources

Copyright (c) 2008-2018 Gerard de Melo, Fabian Suchanek, Adam Pease
Available under the terms of the Creative Commons Attribution 3.0 License (CC BY 3.0).

YAGO-SUMO draws on the following sources:


Download YAGO-SUMO 2008-10 in KIF format (248 MB)
Download YAGO-SUMO 2012-04 in KIF format (433 MB)
Download YAGO-SUMO 2012-04 in TPTP format (563 MB)

The archives contain a number of fact files in the facts subdirectory, either in the SUO-KIF format used by SUMO or in the TPTP format used by many theorem provers. Among these, the subClassOf* and type* files contain the class hierarchy and type information. To be useful, these axioms need to be combined with SUMO's upper-level axioms. All files use UTF-8 text encoding.
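As a rough sketch of how flat facts in a KIF fact file might be read, the following Python code tokenizes one expression per line. It rests on several assumptions not confirmed here: that each fact sits on a single line, that it is a flat parenthesized expression with no nested terms, and that comment lines start with a semicolon. The sample lines and the helper names (parse_fact, read_facts, read_fact_file) are hypothetical.

```python
import re

# A token is a double-quoted string, a single parenthesis, or a bare symbol.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|[()]|[^\s()"]+')


def parse_fact(line):
    """Parse a flat expression such as '(instance A B)' into a tuple.

    Returns None for blank lines, comment lines, and anything that is not
    a single flat parenthesized expression (nested terms are out of scope).
    """
    text = line.strip()
    if not text or text.startswith(";"):
        return None
    tokens = TOKEN.findall(text)
    if len(tokens) < 3 or tokens[0] != "(" or tokens[-1] != ")":
        return None
    inner = tokens[1:-1]
    if "(" in inner or ")" in inner:
        return None
    return tuple(inner)


def read_facts(lines):
    """Yield one tuple per line that parses as a flat fact."""
    for line in lines:
        fact = parse_fact(line)
        if fact is not None:
            yield fact


def read_fact_file(path):
    """Read a fact file as UTF-8; the path is supplied by the caller."""
    with open(path, encoding="utf-8") as handle:
        return list(read_facts(handle))


# Hypothetical sample lines, not copied from the archive.
SAMPLE = [
    "; a comment line",
    "(subclass ExampleCity City)",
    "(instance ExampleTown ExampleCity)",
    "",
]
FACTS = list(read_facts(SAMPLE))
print(FACTS)
```

For anything beyond flat facts, in particular SUMO's quantified axioms, a full KIF parser is the better choice.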

Direct mappings between YAGO and SUMO have also been released as part of the OWL version of SUMO; however, these are only a tiny fraction of what is available in the YAGO-SUMO download. Save the SUMO.owl file to your local file system rather than opening it in a browser, because the file is large.


For academic use, please cite the following publication:

Gerard de Melo, Fabian Suchanek and Adam Pease (2008). Integrating YAGO into the Suggested Upper Merged Ontology. Proceedings of the 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2008). IEEE Computer Society, Los Alamitos, CA, USA. (BibTeX)

Detailed information about this project is available in our technical report (PDF).

The conference talk slides are also available.

Frequently Asked Questions

How do KIF and TPTP compare to OWL?

KIF and TPTP are based on first-order and higher-order logic, and thus allow for a much richer representation of reality than OWL. For example, even quite simple axioms, e.g. that sisters of any of your parents are your aunts, cannot be expressed in OWL. SUO-KIF is a specific variant of KIF.
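Written as a first-order formula, the aunt example could look as follows. The predicate names are hypothetical and chosen only for illustration: parent(x, y) is read as "y is a parent of x", sister(y, z) as "z is a sister of y", and aunt(x, z) as "z is an aunt of x".

```latex
\forall x \, \forall y \, \forall z \;
  \bigl( \mathrm{parent}(x, y) \land \mathrm{sister}(y, z)
         \rightarrow \mathrm{aunt}(x, z) \bigr)
```

The rule links three individuals through two different relations, which is the kind of variable sharing that KIF and TPTP express directly.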

How can I work with KIF data?

The open source project SIGMA KEE includes a KIF parser written in Java. You do not need to set up the entire software package on your system; it suffices to import the source code into your project. There is also a parser written in JavaScript.

How can I work with TPTP data?

The TPTP home page links to several parsers written in Java, C++, and other languages. Additionally, most theorem provers natively support the TPTP format. Examples include Vampire, SPASS-XDB, and SPASS.
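For a quick look at a TPTP file without a full parser, the heads of annotated formulas can be scanned with a regular expression. The sketch below assumes that each formula starts on its own line in the form kind(name, role, ...), that names are unquoted, and that comment lines start with a percent sign. The sample formulas and the helper names (formula_heads, count_roles) are hypothetical and do not come from the YAGO-SUMO archive.

```python
import re
from collections import Counter

# Matches the start of an annotated formula: kind(name, role, ...
HEAD = re.compile(r"^\s*(fof|cnf|tff|thf)\(\s*([^,\s]+)\s*,\s*([a-z_]+)\s*,")


def formula_heads(lines):
    """Yield (kind, name, role) for each line that starts an annotated formula."""
    for line in lines:
        if line.lstrip().startswith("%"):
            continue  # skip comment lines
        match = HEAD.match(line)
        if match:
            yield match.groups()


def count_roles(lines):
    """Count how many formulas carry each role (axiom, conjecture, ...)."""
    return Counter(role for _, _, role in formula_heads(lines))


# Hypothetical sample, not taken from the archive.
SAMPLE = [
    "% a comment line",
    "fof(example_1, axiom, p(a)).",
    "fof(example_2, axiom, ![X]: (p(X) => q(X))).",
    "fof(example_3, conjecture, q(a)).",
]
ROLE_COUNTS = count_roles(SAMPLE)
print(ROLE_COUNTS)
```

This only inspects formula heads. To work with the formulas themselves, use one of the parsers linked from the TPTP home page or pass the file to a theorem prover.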

How does SUMO compare to OpenCyc?

OpenCyc does not contain the axiomatic rules of the large Cyc knowledge base, which is a commercial product that is not available for download. So, while OpenCyc is more like a taxonomy, SUMO and the YAGO-SUMO merge additionally offer a wealth of axiomatic knowledge, e.g. that two people sharing the same parents must be siblings.

How does YAGO compare to DBpedia?

YAGO and DBpedia appeared at about the same time and have somewhat similar goals. However, while YAGO has always focused on providing ontological classes for every entity, DBpedia for many years only provided links to its ontology for those entities that have an infobox on their Wikipedia page (i.e., most entities did not have an ontological type in the DBpedia Ontology). The good news is that the two resources are compatible and can be used simultaneously.

Other questions

Please get in touch with Gerard de Melo if you have further questions.



© 2020 Gerard de Melo