People on Drugs: Credibility of User Statements in Health Communities

Online health communities are a valuable source of information for patients and physicians. However, such user-generated resources are often plagued by inaccuracies and misinformation. In this work, we propose a method for automatically establishing the credibility of user-generated medical statements and the trustworthiness of their authors by exploiting linguistic cues and distant supervision from expert sources. To this end, we introduce a probabilistic graphical model that jointly learns user trustworthiness, statement credibility, and language objectivity. We apply this methodology to the task of extracting rare or unknown side-effects of medical drugs, one of the problems where large-scale non-expert data has the potential to complement expert medical knowledge. We show that our method can reliably extract side-effects and filter out false statements, while identifying trustworthy users who are likely to contribute valuable medical information.


  • Subhabrata Mukherjee, Gerhard Weikum and Cristian Danescu-Niculescu-Mizil.
    People on Drugs: Credibility of User Statements in Health Communities.
    Proc. of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). 2014.


Health forum dataset used in the KDD 2014 paper: