D5
Databases and Information Systems

Useful negative statements

Knowledge bases (KBs) about notable entities and their properties are an important asset in applications such as search, question answering and dialogue. All popular KBs capture virtually only positive statements, and abstain from taking any stance on statements not stored in the KB. In this work, we make the case for explicitly stating salient statements that do not hold. Negative statements are useful to overcome limitations of question answering, and can often contribute to informative summaries of entities. Due to the abundance of such invalid statements, any effort to compile them needs to address ranking by saliency.

Publications

  • Wikinegata: a KB with Interesting Negative Statements.
    Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan
    VLDB 2021 - [PDF] [VIDEO]
  • Negative Knowledge for Open-world Wikidata.
    Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan
    WWW Companion 2021 - [PDF] [LINK]
  • Negative statements considered useful.
    Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan
    JWS 2021 - [PDF] [LINK]
  • Enriching KBs with interesting negative statements.
    Hiba Arnaout, Simon Razniewski, Gerhard Weikum
    AKBC 2020 - [PDF] [LINK]

Demo - Wikinegata

A web interface for browsing and discovering interesting negations about Wikidata entities.


Visit the demo!

For the latest news and fun negations, follow us on Twitter!

Datasets

"I did not know that!"

| Entity | Negation | Contextualized Verbalization |
|---|---|---|
| Abraham Lincoln | ¬(death cause, natural) | Unlike the previous 17 U.S. presidents. |
| Jeff Bezos | ¬(occupation, politician) | Unlike the previous 17 of 21 Time Person of the Year winners. |
| Angela Merkel | ¬(gender, male) | Unlike the previous 6 Chairmen of the CDU. |
| Paul McCartney | ¬(citizenship, U.S.A.) | Unlike the previous 35 of 41 Grammy Lifetime Achievement Award winners. |

12.5M salient negations - [LINK] **
[Format: ORDERED_GROUP_ID[tab]ORDERED_GROUP_LABEL[tab]S_ID[tab]S_LABEL[tab]RANK[tab]P_ID;O_ID[tab]P_LABEL;O_LABEL[tab]SCORE[tab]EXPLANATION]
For more details about the creation of this dataset, refer to Section 5 in: Arnaout, H.; Razniewski, S.; Weikum, G.; and Pan, J. Z. 2021. Negative statements considered useful. Journal of Web Semantics 71:100661.
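
The tab-separated layout above can be read with a few lines of Python. This is a minimal sketch, assuming one record per line with exactly the nine fields listed, no header row, an integer RANK, and a numeric SCORE. The names `OrderedNegation` and `parse_negation_line` are illustrative, and the sample row uses placeholder IDs, a placeholder group label, and a placeholder score rather than values taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class OrderedNegation:
    group_id: str
    group_label: str
    subject_id: str
    subject_label: str
    rank: int
    property_id: str
    object_id: str
    property_label: str
    object_label: str
    score: float
    explanation: str


def parse_negation_line(line: str) -> OrderedNegation:
    """Parse one record in the nine-field tab-separated layout."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    (group_id, group_label, s_id, s_label, rank,
     po_ids, po_labels, score, explanation) = fields
    # P_ID;O_ID and P_LABEL;O_LABEL each pack two values separated by ';'.
    # Splitting on the first ';' only is an assumption about the data.
    p_id, o_id = po_ids.split(";", 1)
    p_label, o_label = po_labels.split(";", 1)
    return OrderedNegation(
        group_id, group_label, s_id, s_label, int(rank),
        p_id, o_id, p_label, o_label, float(score), explanation,
    )


# Hypothetical row: IDs, group label, and score are placeholders.
sample = "\t".join([
    "G0001", "U.S. presidents", "S0001", "Abraham Lincoln", "1",
    "P0001;O0001", "death cause;natural", "0.9",
    "Unlike the previous 17 U.S. presidents.",
])
record = parse_negation_line(sample)
print(f"{record.subject_label}: NOT ({record.property_label}, {record.object_label})")
```

Keeping the packed `P_ID;O_ID` and `P_LABEL;O_LABEL` fields as separate attributes makes it easier to filter by property or object later.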

 

Additional datasets:

  • 1.4M useful negative statements about the 130K most popular people, organizations, and literary works - [HERE] *
    [Format: S_ID[tab]S_LABEL[tab]P_ID[tab]P_LABEL[tab]O_ID[tab]O_LABEL[tab]SCORE]
    Methodology used: peer-based inference (similarity-based) method.
  • 12.5M useful negative statements about 545K entities of various types (mostly people, literary works, and sports events) - [HERE] **
    [Format: ORDERED_GROUP_ID[tab]ORDERED_GROUP_LABEL[tab]S_ID[tab]S_LABEL[tab]RANK[tab]P_ID;O_ID[tab]P_LABEL;O_LABEL[tab]SCORE[tab]EXPLANATION]
    Methodology used: order-oriented inference method.
  • 40K ordered peer sets - [HERE]
    [Format: ORDERED_GROUP_ID[tab]ORDERED_GROUP_LABEL[tab]ENTITIES_IN_GROUP_i[tab]ENTITIES_IN_GROUP_i_LABEL]
    ENTITIES_IN_GROUP_i = set_1|set_2|etc = e1;e2;e3|e4|e5;e6|etc
    Description: order-oriented peer groups created as inputs for dataset **.
  • 6.2K useful negative statements about the 2.4K most popular people - [HERE]
    [Format: S_ID[tab]S_LABEL[tab]NEG_STATEMENT]
    Methodology used: pattern-based query log extraction method.
  • 1K MTurk-annotated negative statements - [HERE]
    [Format: ROW_ID[tab]S_ID[tab]S_LABEL[tab]S_TYPE[tab]CORRECTNESS_LABEL], retrieved from *.
    Correctness assessment: correct, incorrect, not sure.
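
The ENTITIES_IN_GROUP_i field of the ordered peer sets file nests two separators: "|" between sets and ";" between the entities inside a set. The sketch below shows one way to unpack it, assuming the four-field layout listed above and that the order of the sets mirrors the order of the group. The function names and the sample row (group ID, group label, entity IDs, and entity labels) are placeholders for illustration only.

```python
from typing import Dict, List


def parse_ordered_sets(field: str) -> List[List[str]]:
    """Split 'e1;e2;e3|e4|e5;e6' into an ordered list of entity sets."""
    return [part.split(";") for part in field.split("|") if part]


def parse_peer_group_line(line: str) -> Dict[str, object]:
    """Parse one tab-separated row of the ordered peer sets file."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        raise ValueError(f"expected 4 tab-separated fields, got {len(fields)}")
    group_id, group_label, entity_ids, entity_labels = fields
    return {
        "group_id": group_id,
        "group_label": group_label,
        # The label field is assumed to use the same '|' and ';' nesting as the ID field.
        "entity_ids": parse_ordered_sets(entity_ids),
        "entity_labels": parse_ordered_sets(entity_labels),
    }


# Hypothetical row with placeholder identifiers and labels.
row = "G0001\tExample group\te1;e2;e3|e4|e5;e6\tA;B;C|D|E;F"
group = parse_peer_group_line(row)
print(group["entity_ids"])  # [['e1', 'e2', 'e3'], ['e4'], ['e5', 'e6']]
```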

Find useful negations in your datasets

Our peer-based inference method is publicly available, so users can try it on their own tabular datasets.

 

Visit our GitHub repository!

 

We provide three sample datasets with their useful negations, covering Turing Award winners, U.S. presidents, and hotels in India.
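
To give a feel for what peer-based inference on a table can look like, here is a simplified sketch of the general idea, not the code in the repository. It assumes the peers of a target row are already chosen, treats every (column, value) pair that holds for some peer but not for the target as a candidate negation, and scores each candidate by the fraction of peers that have it. The function name, the scoring rule, and the toy hotel rows are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Row = Dict[str, str]


def candidate_negations(
    target: Row, peers: List[Row], key: str = "name"
) -> List[Tuple[str, str, float]]:
    """Return (column, value, score) triples that hold for peers but not for target."""
    if not peers:
        return []
    # Count how many peers carry each (column, value) pair, ignoring the key column.
    counts: Counter = Counter()
    for peer in peers:
        for column, value in peer.items():
            if column != key and value:
                counts[(column, value)] += 1
    results = []
    for (column, value), n in counts.items():
        # Keep only pairs the target row does not have.
        if target.get(column) != value:
            results.append((column, value, n / len(peers)))
    # Highest peer frequency first; ties broken alphabetically for determinism.
    results.sort(key=lambda t: (-t[2], t[0], t[1]))
    return results


# Toy rows with made-up values, not taken from the sample datasets.
rows = [
    {"name": "hotel_a", "parking": "yes", "wifi": "yes"},
    {"name": "hotel_b", "parking": "yes", "wifi": "yes"},
    {"name": "hotel_c", "parking": "yes", "wifi": "no"},
    {"name": "hotel_d", "parking": "no", "wifi": "yes"},
]
target, peers = rows[3], rows[:3]
negations = candidate_negations(target, peers)
for column, value, score in negations:
    print(f"NOT ({column}, {value})  score={score:.2f}")
```

In this toy example, the top candidate for `hotel_d` is that it has no parking, because all three peers do.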