Semantic Culturomics

This project is developed jointly with the DBWeb team of Télécom ParisTech.

The last decade has seen the rise of large knowledge bases, such as YAGO, DBpedia, Freebase, or NELL. In this project, we show how this structured knowledge can help understand and mine trends in unstructured data. By combining YAGO with the archive of the French newspaper Le Monde, we can conduct analyses that would not be possible with word frequency statistics alone. We find indications about the increasing role that women play in politics, about the impact that the city of birth can have on a person's career, or about the average age of famous people in different professions. By mining commercial products from the Web, we can trace the global trade flow on a map.
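As a rough illustration of how structured knowledge can be joined with text, the following Python sketch counts mentions of men and women per year. It assumes the articles are already annotated with entity identifiers and that the knowledge base is reduced to a small lookup table. The names `KB_GENDER`, `ARTICLES`, and `mentions_by_gender_per_year`, as well as all the toy data, are hypothetical; this is not the project's actual pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for knowledge-base facts (entity -> gender).
KB_GENDER = {
    "Angela_Merkel": "female",
    "Jacques_Chirac": "male",
    "Ségolène_Royal": "female",
}

# Hypothetical annotated articles: (publication year, entities mentioned).
ARTICLES = [
    (1995, ["Jacques_Chirac"]),
    (2007, ["Ségolène_Royal", "Jacques_Chirac"]),
    (2007, ["Angela_Merkel", "Unknown_Entity"]),
]


def mentions_by_gender_per_year(articles, kb_gender):
    """Count entity mentions per year, grouped by the gender stored in the KB."""
    counts = defaultdict(Counter)
    for year, entities in articles:
        for entity in entities:
            gender = kb_gender.get(entity)
            # Entities missing from the knowledge base are skipped.
            if gender is not None:
                counts[year][gender] += 1
    return {year: dict(counter) for year, counter in counts.items()}


stats = mentions_by_gender_per_year(ARTICLES, KB_GENDER)
```

A plain word-frequency count could not produce this grouping, because the gender of a person is not stated in the article text; it comes from the knowledge base.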

Publications

Results

Temporal statistics

Mentions of men and women over time.

Average age of people per occupation.

Mentions of countries per continent.

Spatial statistics

Importance of the capital per country. In articles about countries in blue, the capital is never mentioned, whereas for countries in red, it is always mentioned.
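One way to read this map is as a per-country ratio: the share of articles about a country that also mention its capital. The sketch below computes such a ratio under simplifying assumptions: each article is tagged with exactly one country and a set of mentioned cities, and the capitals come from a small lookup table. The function `capital_mention_ratio` and the toy data are hypothetical and only illustrate the statistic, not how the map was actually produced.

```python
from collections import Counter

# Hypothetical knowledge-base facts (country -> capital).
CAPITAL_OF = {"France": "Paris", "Germany": "Berlin", "Australia": "Canberra"}

# Hypothetical annotated articles: (country the article is about, cities mentioned).
COUNTRY_ARTICLES = [
    ("France", {"Paris"}),
    ("France", {"Lyon"}),
    ("Germany", {"Berlin"}),
    ("Australia", {"Sydney"}),
]


def capital_mention_ratio(articles, capital_of):
    """Return, per country, the fraction of its articles that mention the capital."""
    total = Counter()
    with_capital = Counter()
    for country, cities in articles:
        if country not in capital_of:
            continue  # no capital known for this country
        total[country] += 1
        if capital_of[country] in cities:
            with_capital[country] += 1
    return {country: with_capital[country] / total[country] for country in total}


ratios = capital_mention_ratio(COUNTRY_ARTICLES, CAPITAL_OF)
```

Under this reading, a ratio of 0.0 corresponds to the blue end of the scale (capital never mentioned) and 1.0 to the red end (capital always mentioned).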

Percentage of foreign companies mentioned per country. For countries in blue, only local companies are mentioned, whereas for countries in red, only foreign companies are mentioned.