We propose methods for inferring personal knowledge, such as profession, age or family status, from conversations using deep learning. Specifically, we introduce several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc.). Experiments with various conversational texts, including Reddit discussions, movie scripts and a collection of crowdsourced personal dialogues, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
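The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' Hidden Attribute Model: the function names, the two-dimensional vectors, and the attention query are all hypothetical and hand-set for illustration, whereas a real model would learn them from data. The sketch only shows how attention weights over a speaker's terms can be combined with embeddings to rank candidate object values for one predicate (here, profession).

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_objects(term_vectors, attention_query, object_vectors):
    """Rank candidate object values for one predicate.

    Assumption: terms and objects share one embedding space, and the
    attention query is a single vector (hypothetical, hand-set here).
    """
    # Attention: terms that align with the query get larger weights.
    weights = softmax([dot(vec, attention_query) for vec in term_vectors])
    # Speaker representation: attention-weighted sum of term vectors.
    dim = len(attention_query)
    speaker = [
        sum(w * vec[i] for w, vec in zip(weights, term_vectors))
        for i in range(dim)
    ]
    # Score each candidate object against the speaker representation.
    scores = {name: dot(speaker, vec) for name, vec in object_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy, hand-set embeddings (illustrative values only, not learned).
terms = {
    "patients": [1.0, 0.0],
    "emergency": [0.9, 0.1],
    "the": [0.0, 0.0],
}
professions = {
    "chef": [0.0, 1.0],
    "doctor": [1.0, 0.0],
    "nurse": [0.8, 0.2],
}
query = [1.0, 0.0]

ranking = rank_objects(list(terms.values()), query, professions)
print(ranking)
```

With these toy values, the medical terms receive most of the attention weight, so the medical professions are ranked above the unrelated one.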
Anna Tigunova, Andrew Yates, Paramita Mirza and Gerhard Weikum. Listening between the Lines: Learning Personal Attributes from Conversations. In Proceedings of The Web Conference (WWW) 2019, pages 1818-1828, San Francisco, CA, United States.
Code and data
Dataset of characters in popular movies labeled with profession, age and gender attributes.