Knowledge bases (KBs) about notable entities and their properties are an important asset in applications such as search, question answering and dialogue. All popular KBs capture virtually only positive statements, and abstain from taking any stance on statements not stored in the KB. In this work, we make the case for explicitly stating salient statements that do not hold. Negative statements are useful to overcome limitations of question answering, and can often contribute to informative summaries of entities. Due to the abundance of such invalid statements, any effort to compile them needs to address ranking by saliency.
- 1.4M useful negative statements about the 130K most popular people, organizations, and literary works. - [HERE] *
Methodology used: peer-based inference (similarity-based) method.
- 12.5M useful negative statements about 545K entities of various types (mostly people, literary works, and sports events). - [HERE] **
Methodology used: order-oriented inference method.
- 40K ordered peer sets. - [HERE]
ENTITIES_IN_GROUP_i = set_1|set_2|etc = e1;e2;e3|e4|e5;e6|etc
Description: order-oriented peer groups created as inputs for dataset **.
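
To make the line format above concrete, here is a minimal parsing sketch. It assumes that `|` separates the consecutive sets of a group and `;` separates the entities within one set, as the format line suggests; the function name `parse_peer_group` is hypothetical and not part of the released files.

```python
from typing import List


def parse_peer_group(value: str) -> List[List[str]]:
    """Split 'e1;e2;e3|e4|e5;e6' into an ordered list of entity sets.

    Assumption: '|' separates consecutive sets and ';' separates the
    entities inside one set. Empty fragments are skipped.
    """
    groups = []
    for raw_set in value.strip().split("|"):
        entities = [e.strip() for e in raw_set.split(";") if e.strip()]
        if entities:
            groups.append(entities)
    return groups


# Placeholder entity names taken from the format description itself.
example = parse_peer_group("e1;e2;e3|e4|e5;e6")
```

Keeping the outer list ordered preserves the position of each set within its group, which is the information an order-oriented method would rely on.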
- 6.2K useful negative statements about the 2.4K most popular people. - [HERE]
Methodology used: pattern-based query log extraction method.
- 1K MTurk-annotated negative statements. - [HERE]
[Format: ROW_ID[tab]S_ID[tab]S_LABEL[tab]S_TYPE[tab]CORRECTNESS_LABEL], retrieved from *.
Correctness assessment: correct, incorrect, not sure.
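
The annotated file can be read with a few lines of Python. This is a sketch under assumptions: the file is taken to be tab-separated with exactly the five columns listed above and no header row, and the sample rows below are invented placeholders, not actual data from the release.

```python
import csv
import io
from collections import Counter
from typing import List, NamedTuple


class AnnotatedRow(NamedTuple):
    row_id: str
    s_id: str
    s_label: str
    s_type: str
    correctness_label: str  # expected: "correct", "incorrect" or "not sure"


def read_annotations(handle) -> List[AnnotatedRow]:
    """Read tab-separated rows, skipping lines without exactly five fields."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [AnnotatedRow(*fields) for fields in reader if len(fields) == 5]


# Invented placeholder rows, only to show the expected shape.
sample = (
    "1\tS1\tplaceholder label\tplaceholder type\tcorrect\n"
    "2\tS2\tplaceholder label\tplaceholder type\tnot sure\n"
)
rows = read_annotations(io.StringIO(sample))
label_counts = Counter(row.correctness_label for row in rows)
```

For the real file, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.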
We make our peer-based inference method public, for users to try it on their own tabular datasets.
Visit our GitHub repository!
We provide three sample datasets together with their useful negations: Turing Award winners, U.S. presidents, and hotels in India.
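
For readers who want an intuition before opening the repository, the general idea of peer-based inference on a table can be sketched as follows. This is a simplified illustration, not the released implementation: it assumes peers are chosen by Jaccard similarity over shared statements and that candidates are scored by the fraction of peers holding them. The function names, the parameter `k`, and the toy table are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Statement = Tuple[str, str]  # (property, value)


def jaccard(a: Set[Statement], b: Set[Statement]) -> float:
    """Share of statements two entities have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_negations(
    target: str, table: Dict[str, Set[Statement]], k: int = 2
) -> List[Tuple[Statement, float]]:
    """Rank statements that hold for the k most similar peers but not for target."""
    own = table[target]
    others = [e for e in table if e != target]
    peers = sorted(others, key=lambda e: jaccard(own, table[e]), reverse=True)[:k]
    if not peers:
        return []
    counts = Counter(s for p in peers for s in table[p] if s not in own)
    # Score = fraction of peers for which the statement holds.
    return [(s, c / len(peers)) for s, c in counts.most_common()]


# Toy table with placeholder entities and properties.
toy = {
    "A": {("type", "x"), ("award", "p")},
    "B": {("type", "x"), ("award", "p"), ("member", "m")},
    "C": {("type", "x"), ("member", "m"), ("award", "q")},
    "D": {("type", "y")},
}
ranked = candidate_negations("A", toy, k=2)
```

The output is a list of candidates only: a statement missing from the table is not necessarily false, so any real use needs a completeness check or manual verification before treating a candidate as a negative statement.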