NAGA: Searching and Ranking Knowledge

NAGA is a new semantic search engine. It can operate on knowledge bases that are organized as graphs with labeled nodes and edges, so-called relationship graphs. As of now, NAGA uses a projection of YAGO as its knowledge base. The underlying query language supports keyword search for the casual user as well as graph-based queries with regular expressions for the expert user. One can think of a graph-based query as a labeled graph in which some of the node or edge labels are missing. To retrieve answers to a graph-based query, NAGA searches the knowledge base for subgraphs that match the query structure and its labels and that bind the missing labels in the query. In general, there may be hundreds or even thousands of answers to a given query. The goal is to rank the retrieved answers so that the most important ones come first. For example, for a query that asks for people who were physicists, a good ranking model should give preference to important physicists, such as Albert Einstein and Max Planck. To achieve this, NAGA's ranking framework formalizes several notions, such as confidence (the belief in the accuracy of the answer), informativeness (the importance of the answer for the given query), and compactness (the connectivity strength between the entities in the answer). To formalize and combine these notions, we have applied the principles of statistical language models (which are successfully used in document-centric information retrieval) to the unexplored setting of labeled graphs. For more information, please refer to our publication: NAGA: Searching and Ranking Knowledge.

If you have any questions about the project, please contact Gjergji Kasneci.

People

  • Weikum, Gerhard
  • Elbassuoni, Shady
  • Ifrim, Georgiana
  • Kasneci, Gjergji
  • Sozio, Mauro
  • Suchanek, Fabian

Publications

Demo

Querying

A query has the form

E1 R1 E2
E3 R2 E4
...

where the Ei's are entities (e.g. Albert_Einstein) or variables (e.g. $x) and the Rj's are relations (e.g. bornOnDate) or variables (e.g. $y). The following relations are allowed:

isCalled, type, subClassOf, domain, range, subPropertyOf, familyNameOf, givenNameOf, describes, establishedOnDate, hasWonPrize, writtenInYear, locatedIn, politicianOf, sameYear, isCitizenOf, isMereonymOf, isMemberOf, isSubstanceOf, isPartOf, foundIn, discovered, discoveredOnDate, bornOnDate, bornIn, originatesFrom, diedOnDate, diedIn, isNativeNameOf, isLeaderOf, hasArea, hasPopulation, hasPopulationDensity, hasUTCOffset, hasWebsite, isNumber, hasPredecessor, hasSuccessor, isMarriedTo, isAffiliatedTo, influences, directed, produced, edited, actedIn, publishedOnDate, hasDuration, hasProductionLanguage, hasBudget, hasImdb, producedIn, hasChild, hasMotto, hasOfficialLanguage, hasCapital, hasWaterPart, hasGDPPPP, hasNominalGDP, hasHDI, hasGini, hasCurrency, inTimeZone, hasTLD, hasCallingCode, wrote, madeCoverFor, isOfGenre, published, hasPages, hasISBN, joined, livesIn, hasHeight, hasWeight, isSpokenIn, created, createdOnDate, interestedIn, hasEconomicGrowth, hasInflation, hasPoverty, hasLabor, hasUnemployment, hasExport, exports, hasImport, imports, dealsWith, hasRevenue, hasExpenses, hasMilitary, hasNumberOfPeople, usesGDP, worksAt, graduatedFrom, hasAcademicAdvisor, happenedOnDate, happenedIn, participatedIn, musicalRole, hasProduct, connect, isa.

The Rj's can also be regular expressions over these relations (e.g. type subClassOf*). The two relations connect and isa are pseudo-relations that work only in infix position on a plain query line. You may also click on the results to browse through the ontological information.
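As a rough illustration of the query form described above, the following Python sketch splits a query into triple patterns and collects its variables. The names Pattern, is_variable and parse_query are hypothetical and not part of NAGA. The sketch assumes that query lines are separated by newlines or semicolons and that each plain line has exactly three whitespace-separated tokens, so it does not cover regular expressions over relations, quoted entity names, or the connect(...) form.

```python
from typing import List, NamedTuple


class Pattern(NamedTuple):
    """One query line: subject, relation, object (hypothetical helper type)."""
    subject: str
    relation: str
    obj: str


def is_variable(token: str) -> bool:
    # Variables start with "$", e.g. $x.
    return token.startswith("$")


def parse_query(text: str) -> List[Pattern]:
    """Split a query into triple patterns; lines are separated by newlines or ';'."""
    patterns = []
    for line in text.replace(";", "\n").splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if len(tokens) != 3:
            raise ValueError(f"expected 'E1 R E2', got: {line!r}")
        patterns.append(Pattern(*tokens))
    return patterns


query = parse_query("$x wrote $y; $y type $z")
variables = sorted({t for p in query for t in p if is_variable(t)})
```

Here query holds two patterns, and variables lists the three distinct variables that an answer has to bind.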

Query interface: http://uniat5401.ag5.mpi-sb.mpg.de:8180/webyago/WebInterfaceNaga

Example queries

Simple queries

  • Planck familyNameOf $x
    (try your own name...)

 

  • Albert_Einstein type subClassOf* $x

 

  • Zidane isa $x

 

  • $x wrote $y; $y type $z

 

  • $x isa scientist; $x isa politician

 

  • Max_Planck bornOnDate $x
    $y bornOnDate $z
    $z sameYear $x

 

  • $x isa humanist
    $x discovered $y
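
To make the idea of binding variables concrete, here is a minimal Python sketch of how a multi-line query could be matched against a small set of facts by simple backtracking. It is not NAGA's implementation: the functions unify and match are hypothetical, the facts are a toy set chosen for illustration rather than taken from YAGO, and answers are returned unranked.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy facts for illustration only (not taken from YAGO).
FACTS: List[Triple] = [
    ("Max_Planck", "bornIn", "Kiel"),
    ("Albert_Einstein", "bornIn", "Ulm"),
    ("Kiel", "locatedIn", "Germany"),
    ("Ulm", "locatedIn", "Germany"),
]


def unify(pattern: Triple, fact: Triple,
          binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend binding so that pattern equals fact, or return None."""
    result = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith("$"):
            if result.setdefault(p, f) != f:
                return None  # variable already bound to a different label
        elif p != f:
            return None  # constant label does not match
    return result


def match(patterns: List[Triple],
          binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Yield every binding that satisfies all patterns (simple backtracking)."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for fact in FACTS:
        extended = unify(patterns[0], fact, binding)
        if extended is not None:
            yield from match(patterns[1:], extended)


answers = list(match([("$x", "bornIn", "$y"), ("$y", "locatedIn", "Germany")]))
```

Each element of answers is one binding of $x and $y that satisfies both lines. Because unify treats all three positions alike, a variable in the relation position is handled the same way.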

 

Connect queries

  • Niels_Bohr connect Albert_Einstein

 

  • $x type wikicategory_Italian_physicists
    $x connect Albert_Einstein

    (Entity names can also be written in quotes, e.g. "Albert Einstein".)

 

  • For more than two entities, you can use the following form:
    connect(Albert_Einstein, Niels_Bohr, Tom_Cruise)
    (Again, entity names can be written in quotes, e.g. "Albert Einstein".)
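
This page does not describe how NAGA evaluates connect. One simple way to picture a two-entity connect query is as a search for a short chain of facts linking the entities. The Python sketch below does this with a breadth-first search that treats every fact as an undirected edge. The function connect and the toy facts are assumptions made for illustration, and the sketch ignores ranking and the multi-entity form.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy facts for illustration only.
FACTS: List[Triple] = [
    ("Niels_Bohr", "hasWonPrize", "Nobel_Prize_in_Physics"),
    ("Albert_Einstein", "hasWonPrize", "Nobel_Prize_in_Physics"),
    ("Albert_Einstein", "bornIn", "Ulm"),
]


def connect(source: str, target: str) -> Optional[List[Triple]]:
    """Return the facts on a shortest undirected path, or None if unconnected."""
    # Adjacency list: each fact can be traversed in both directions.
    neighbours: Dict[str, List[Tuple[str, Triple]]] = {}
    for fact in FACTS:
        s, _, o = fact
        neighbours.setdefault(s, []).append((o, fact))
        neighbours.setdefault(o, []).append((s, fact))

    queue = deque([source])
    came_from = {source: None}  # node -> (previous node, fact used)
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt, fact in neighbours.get(node, []):
            if nxt not in came_from:
                came_from[nxt] = (node, fact)
                queue.append(nxt)
    if target not in came_from:
        return None

    # Walk back from the target to the source, collecting the facts used.
    path = []
    node = target
    while came_from[node] is not None:
        node, fact = came_from[node]
        path.append(fact)
    return path[::-1]


path = connect("Niels_Bohr", "Albert_Einstein")
```

In this toy graph, path contains the two hasWonPrize facts that link both physicists through the shared prize node.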

 

Precautions

  • This interface is still experimental! It allows only a limited number of users at a time. If you post a time-expensive query and too many users query NAGA in the meantime, your query may be canceled.
  • Please be aware that queries might require huge cross-product computations. If a query uses e.g. the word "Paris" and the constraint "isa physicist", then the NAGA engine has to check all 71 meanings of Paris it knows and the thousands of physicists it knows, resulting in long query times.
  • Try putting the most constraining lines first!
  • Please use lowercase for common nouns (e.g. to ask for the meanings of "bank", ask "bank means $what")
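
As a sketch of the advice to put the most constraining lines first, the Python snippet below reorders query lines so that those with more fixed labels come earlier. Counting constants is only a crude stand-in for how constraining a line really is, and the helper names constants and most_constraining_first are hypothetical. NAGA itself may order or evaluate lines differently.

```python
from typing import List, Tuple

Pattern = Tuple[str, str, str]


def constants(pattern: Pattern) -> int:
    """Number of positions that are fixed labels rather than $variables."""
    return sum(not token.startswith("$") for token in pattern)


def most_constraining_first(patterns: List[Pattern]) -> List[Pattern]:
    # sorted() is stable, so lines with equally many constants keep their order.
    return sorted(patterns, key=constants, reverse=True)


query = [
    ("$y", "bornOnDate", "$z"),
    ("Max_Planck", "bornOnDate", "$x"),
    ("$z", "sameYear", "$x"),
]
ordered = most_constraining_first(query)
```

Here the line that mentions Max_Planck moves to the front, because it fixes two of its three positions while the other lines fix only one.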


Contact

If you encounter a problem with NAGA, or if you have a suggestion or a question, please send an email to Gjergji Kasneci.