R-Susceptibility

Privacy of Internet users is at stake because they expose personal information in posts created in online communities, in search queries, and other activities. An adversary that monitors a community may identify the users with the most sensitive properties and utilize this knowledge against them (e.g., by adjusting the pricing of goods or targeting ads of a sensitive nature). Existing privacy models for structured data are inadequate to capture privacy risks from user posts. This paper presents a ranking-based approach to the assessment of privacy risks emerging from textual contents in online communities, focusing on sensitive topics, such as being depressed. We propose ranking as a means of modeling a rational adversary who targets the most afflicted users. To capture the adversary's background knowledge regarding vocabulary and correlations, we use latent topic models. We cast these considerations into the new model of R-Susceptibility, which can inform and alert users about their potential for being targeted, and devise measures for quantitative risk assessment. Experiments with real-world data show the feasibility of our approach.