PROSPERA: PRospering knOwledge with Scalability, PrEcision, and RecAll

PROSPERA is a scalable, Hadoop-based knowledge-harvesting engine that combines pattern-based gathering of relational fact candidates with weighted MaxSat-based consistency reasoning to identify the facts most likely to be correct.
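
To illustrate the idea of weighted MaxSat-based consistency reasoning, here is a minimal Python sketch. It is not the PROSPERA implementation: it uses an exhaustive search that is only feasible for a handful of variables. The fact names, the integer weights, and the mutual-exclusion constraint are hypothetical and chosen purely for illustration.

```python
from itertools import product


def solve_weighted_maxsat(variables, soft_clauses, hard_clauses):
    """Brute-force weighted MaxSat over all truth assignments.

    A clause is a list of (variable, polarity) literals and is satisfied
    when at least one literal matches the assignment. Hard clauses must
    all hold; soft clauses contribute their weight when satisfied.
    """
    best_assignment, best_weight = None, float("-inf")
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))

        def satisfied(clause):
            return any(assignment[var] == polarity for var, polarity in clause)

        # Skip assignments that violate any hard (consistency) constraint.
        if not all(satisfied(clause) for clause in hard_clauses):
            continue

        weight = sum(w for w, clause in soft_clauses if satisfied(clause))
        if weight > best_weight:
            best_assignment, best_weight = assignment, weight
    return best_assignment, best_weight


# Hypothetical fact candidates, each with an illustrative integer weight.
variables = ["f1", "f2", "f3"]
soft_clauses = [
    (9, [("f1", True)]),  # f1: candidate fact with strong pattern evidence
    (4, [("f2", True)]),  # f2: conflicting candidate with weak evidence
    (7, [("f3", True)]),  # f3: unrelated candidate
]
# Hypothetical consistency constraint: f1 and f2 cannot both be true.
hard_clauses = [[("f1", False), ("f2", False)]]

best, weight = solve_weighted_maxsat(variables, soft_clauses, hard_clauses)
print(best, weight)
```

In this toy instance the search keeps f1 and f3 and rejects f2, because accepting f2 would force dropping the more heavily weighted f1.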

Publications

Source Code

Download code here: prospera.tar.gz

Experiments

The following data is from the experiments reported in N. Nakashole et al., WSDM 2011.

Experiment 1: Scalability experiment (sports relations)

Precision/Recall per iteration

- Sports precision/recall

Extracted Facts

- Iteration 1
- Iteration 2
- Iteration 3
- Iteration 4
- Iteration 5
- Iteration 6
- PROSPERA Variant: NoReasoner
- PROSPERA Variant: UnWeighted

Labeled Samples

- Iterations 1, 2, 3, 4, 5, 6
- PROSPERA variants: NoReasoner, UnWeighted

Constraints

- Sports constraints

Seeds

- Sports relation seeds

 

Experiment 2: Constraints experiment (academic relations)

Precision/Recall per iteration

- Academia precision/recall

Extracted Facts

- Iteration 1
- Iteration 2
- PROSPERA Variant: NoReasoner

Labeled Samples

- Iterations 1, 2
- PROSPERA variant: NoReasoner

Constraints

- Academic constraints

Seeds

- From YAGO ontology