AIDA: Accurate Online Disambiguation of Named Entities in Text and Tables

News

To stay up to date with AIDA news and releases, send a mail to: aida-news-subscribe@lists.mpi-inf.mpg.de

Overview

AIDA is a framework and online tool for entity detection and disambiguation. Given a natural-language text or a Web table, it maps mentions of ambiguous names onto canonical entities (e.g., individual people or places) registered in the YAGO2 knowledge base.

You can try AIDA on any text you like in the online demo. If you are interested in using AIDA programmatically, the source code is available at github.com/yago-naga/aida.

To experimentally verify the quality of AIDA, we annotated nearly 1,400 newswire articles with the entities mentioned in each article. This collection is available for download (see Downloads).

Further Information

If you need any further information, please contact us via mail: NAME_OF_PROJECT@mpi-inf.mpg.de

Downloads and Datasets

The AIDA source code is available on github.com/yago-naga/aida. For AIDA to work, you will need to download our YAGO-based entity repository from our download page and import it into a PostgreSQL server.


Find all datasets related to AIDA in our downloads area.

AIDA JSON Web Service

We provide an HTTP JSON web service for AIDA so that you can try it out without the hassle of setting it up. It's as easy as:


curl --data text="Dylan was born in Duluth." https://gate.d5.mpi-inf.mpg.de/aida/service/disambiguate


More information is available in our web service description.
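For illustration, the same request can be issued from the Python standard library. This is only a sketch: it assumes nothing beyond what the curl call above shows, namely that the service accepts a form-encoded `text` parameter and returns JSON.

```python
import json
import urllib.parse
import urllib.request

SERVICE_URL = "https://gate.d5.mpi-inf.mpg.de/aida/service/disambiguate"

def build_request(text):
    """Form-encode the text parameter, mirroring the curl --data call."""
    data = urllib.parse.urlencode({"text": text}).encode("utf-8")
    return urllib.request.Request(SERVICE_URL, data=data)

def disambiguate(text):
    """POST the text to the AIDA web service and parse the JSON reply."""
    with urllib.request.urlopen(build_request(text)) as response:
        return json.loads(response.read().decode("utf-8"))

# Usage (requires network access; the reply's JSON fields are documented
# in the web service description):
#   result = disambiguate("Dylan was born in Duluth.")
```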

Please do not use it for comparisons in scientific papers or for running-time experiments, as the service changes continuously. If you want to compare against AIDA in your research, please download it and set it up on your own machines.

Publications

  • AIDA-Social: Entity Linking on the Social Stream
    Yusra Ibrahim, Mohamed Amir Yosef, Gerhard Weikum
    In: Proceedings of the 7th International Workshop on Exploiting Semantic Annotations in Information Retrieval, p. 17-19. ESAIR 2014, Shanghai, China, 2014
  • AIDArabic: A Named-Entity Disambiguation Framework for Arabic Text
    Mohamed Amir Yosef, Marc Spaniol, Gerhard Weikum
    In: Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing, p. 187-195. ANLP 2014, Doha, Qatar, 2014
  • Discovering Emerging Entities with Ambiguous Names
    Johannes Hoffart, Yasemin Altun, Gerhard Weikum
    In: Proceedings of the 23rd International World Wide Web Conference, p. 385–395. WWW 2014, Seoul, South Korea, 2014
  • AIDA-light: High-Throughput Named-Entity Disambiguation
    Dat Ba Nguyen, Johannes Hoffart, Martin Theobald, Gerhard Weikum
    In: Linked Data on the Web, WWW 2014, Seoul, South Korea, 2014
  • KORE: Keyphrase Overlap Relatedness for Entity Disambiguation
    Johannes Hoffart, Stephan Seufert, Dat Ba Nguyen, Martin Theobald, and Gerhard Weikum
    In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, p. 545-554, CIKM 2012, Maui, USA, 2012
  • AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables
    Mohamed Amir Yosef, Johannes Hoffart, Ilaria Bordino, Marc Spaniol, Gerhard Weikum
    In: Proceedings of the 37th International Conference on Very Large Databases, VLDB 2011, p. 1450–1453, Seattle, WA, 2011
  • Robust Disambiguation of Named Entities in Text
    Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, Gerhard Weikum
    In: Conference on Empirical Methods in Natural Language Processing, p. 782–792, Edinburgh, Scotland, 2011
    For scientific work, please cite this paper.

Demo

AIDA can be tested online, with different methods and configurations for entity disambiguation: AIDA Web Demo

Please use either Firefox or Chrome to view the demo.

Results

In the EMNLP 2011 paper, the results are given on the subset of 228 CoNLL testb documents which could be processed by all the competitor methods (documents 1270testb, 1308testb, and 1349testb are missing). We did this for the sake of comparability. The results of our AIDA methods on all 231 CoNLL testb documents are given below. The short names are the same as in the paper.
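As a reminder of how the two reported measures differ, here is a minimal sketch of the standard definitions: macro precision@1 averages the per-document precision (each document counts equally), while micro precision@1 pools all mentions across documents (each mention counts equally). The example data is made up for illustration.

```python
def macro_precision_at_1(docs):
    """Average of per-document precision: every document has equal weight."""
    per_doc = [correct / total for correct, total in docs]
    return sum(per_doc) / len(per_doc)

def micro_precision_at_1(docs):
    """Pooled precision over all mentions: large documents dominate."""
    correct = sum(c for c, _ in docs)
    total = sum(t for _, t in docs)
    return correct / total

# Each pair is (correctly disambiguated mentions, total mentions) per document.
docs = [(9, 10), (1, 10), (50, 50)]
```

With this toy data, the macro score is (0.9 + 0.1 + 1.0) / 3 while the micro score is 60 / 70, which is why the two rows in the tables below can differ noticeably.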

Measure             | sim-k | r-prior sim-k | r-prior sim-k coh | r-prior sim-k r-coh | prior
Macro Precision@1.0 | 76.65 | 80.81         | 80.86             | 82.02               | 71.36
Micro Precision@1.0 | 76.65 | 80.06         | 82.24             | 82.29               | 66.55
AIDA results in % on all 231 CoNLL-YAGO testb documents (using the original EMNLP 2011 code)


Measure             | sim-k | r-prior sim-k | r-prior sim-k coh | r-prior sim-k r-coh | prior
Macro Precision@1.0 | 76.00 | 81.03         | 80.67             | 81.66               | 75.16
Micro Precision@1.0 | 76.61 | 80.56         | 82.05             | 82.54               | 70.46
AIDA results in % on all 231 CoNLL-YAGO testb documents (using the code re-engineered in early 2013)