PRAVDA: label PRopagated fAct extraction on Very large DAta
PRAVDA is a knowledge-harvesting engine that uses label propagation to extract both base facts and temporal facts.
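To give a rough idea of what label propagation means in general, here is a minimal, generic sketch. It is not PRAVDA's implementation: the graph layout (candidate facts linked to textual patterns), the node names, the relation labels, and the function `propagate_labels` are all hypothetical and chosen only for illustration. Known labels on a few seed nodes are spread to unlabeled neighbors by repeated weighted averaging.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, labels, iterations=50):
    """Generic iterative label propagation on an undirected weighted graph.

    edges:  list of (node_a, node_b, weight) tuples
    seeds:  dict mapping a seed node to its known label
    labels: list of all possible labels
    Returns a dict mapping each node to a {label: score} dict.
    """
    # Build a symmetric adjacency map.
    neighbors = defaultdict(dict)
    for a, b, w in edges:
        neighbors[a][b] = w
        neighbors[b][a] = w

    # Seeds start with all mass on their label; other nodes start at zero.
    scores = {
        node: {lab: 1.0 if seeds.get(node) == lab else 0.0 for lab in labels}
        for node in neighbors
    }

    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                updated[node] = scores[node]  # seed labels stay fixed
                continue
            total = sum(nbrs.values())
            # New score = weighted average of the neighbors' current scores.
            updated[node] = {
                lab: sum(w * scores[n][lab] for n, w in nbrs.items()) / total
                for lab in labels
            }
        scores = updated
    return scores


# Hypothetical toy graph: candidate facts connected to the patterns they occur with.
edges = [
    ("pattern:joined", "cand:A-X", 1.0),
    ("pattern:joined", "cand:B-X", 1.0),
    ("pattern:visited", "cand:C-Y", 1.0),
    ("pattern:visited", "cand:D-Z", 1.0),
]
seeds = {"cand:A-X": "memberOf", "cand:C-Y": "none"}
labels = ["memberOf", "none"]

scores = propagate_labels(edges, seeds, labels)
best = {node: max(s, key=s.get) for node, s in scores.items()}
print(best["cand:B-X"], best["cand:D-Z"])
```

In this toy setup the unlabeled candidate `cand:B-X` shares a pattern with a `memberOf` seed and therefore ends up with that label, while `cand:D-Z` inherits `none` from its neighborhood.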
Dataset for ACL 2012:
- The corpus can be downloaded here: corpus.tar.gz.
- The extracted facts and the evaluation results reported in the experiments can be downloaded here: results.tar.gz.
Publications:
- Yafang Wang, Maximilian Dylla, Zhaochun Ren, Marc Spaniol and Gerhard Weikum:
PRAVDA-live: Interactive Knowledge Harvesting
Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM 2012), Maui, Hawaii, US, October 29-November 2, 2012, pp. 2674-2676.
- Yafang Wang, Maximilian Dylla, Marc Spaniol and Gerhard Weikum:
Coupling Label Propagation and Constraints for Temporal Fact Extraction
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012), Jeju, Republic of Korea, July 8-14, 2012, pp. 233-237.
- Yafang Wang, Bin Yang, Lizhen Qu, Marc Spaniol and Gerhard Weikum:
Harvesting Facts from Textual Web Sources by Constrained Label Propagation
Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM 2011), Glasgow, Scotland, UK, October 24-28, 2011, pp. 837-846.