Laura Dietz

Max-Planck-Institut für Informatik
D5: Databases and Information Systems
Campus E1.4, Room 429
66123 Saarbrücken
Germany

Email: Get my email address via email
Phone: +49 681 9325 529
Fax: +49 681 9325 599
PGP ID: 0xF5F7017F


Research Interests


My research is based on probabilistic modeling for unsupervised or weakly supervised ranking tasks. Within my thesis, I devise models for three use cases on which black-box approaches fail: 

a) Identifying influential publications in citation networks. [citation] [pdf] [talk] [project page]

b) Predicting defective code statements when debugging software. [citation] [pdf] [supplement] [project page]

c) Unsupervised recommendation of songs based on user-user and user-item interactions in music sharing online communities such as Zune Social. [citation] [pdf]


Publications


[Die09] Dietz, Laura: Modeling Shared Tastes in Online Communities. In: NIPS Workshop on Applications for Topic Models: Text and Beyond
[ .pdf ]
Many social networking platforms track user-item interactions and friend relationships between users. The goal of this work is to learn about tastes that are shared by friends. We devise a probabilistic model in which tastes explain both friend relationships and item interactions as latent factors. After estimation, shared tastes can be used to predict common preferences of befriended users, predict further item interactions, and give an overview of the network of users that share the same taste.
[DDZS09] Dietz, Laura ; Dallmeier, Valentin ; Zeller, Andreas ; Scheffer, Tobias: Localizing Bugs in Program Executions with Graphical Models. In: Advances in Neural Information Processing Systems
[ bib | .pdf | supplement ]
We devise a graphical model that supports the process of debugging software by guiding developers to code that is likely to contain defects. The model is trained using execution traces of passing test runs; it reflects the distribution over transitional patterns of code positions. Given a failing test case, the model determines the least likely transitional pattern in the execution trace. The model is designed such that Bayesian inference has a closed-form solution. We evaluate the Bernoulli graph model on data of the software projects AspectJ and Rhino.
[DD08] Dietz, Laura ; Dallmeier, Valentin: Probabilistic Graph Models for Debugging Software. In: Proceedings of NIPS 2008 Workshop on Analyzing Graphs: Theory and Applications
[ bib | .html | .pdf ]
Of all software development activities, debugging (locating the defective source code statements that cause a failure) can be by far the most time-consuming. We employ probabilistic modeling to support programmers in finding defective code. Most defects are identifiable in control flow graphs of software traces. A trace is represented by a sequence of code positions (line numbers in source filenames) that are executed when the software runs. The control flow graph represents the finite state machine of the program, in which states depict code positions and arcs indicate valid follow-up code positions. In this work, we extend this definition towards an n-gram control flow graph, where a state represents a fragment of subsequent code positions, also referred to as an n-gram of code positions. We devise a probabilistic model for such graphs in order to infer code positions in which anomalous program behavior can be observed. This model is evaluated on real-world data obtained from the open source AspectJ project and compared to the well-known multinomial and multi-variate Bernoulli models.
[DB07] Dietz, Laura ; Bickel, Steffen: Modeling Evolution of Ideas in the Web of Science. In: NIPS Workshop on Statistical Network Models
[ bib | .pdf ]
When reading on a new topic, researchers need to get a quick overview of a research area. Especially when reading a publication in an unfamiliar area, the reader is interested in how this publication is embedded in the web of science. The goal is to give the reader an overview of the evolution of ideas leading to the examined publication. We want to analyze this evolution by inferring and visualizing the strength of topical similarity between linked publications in the neighborhood of the initially examined work.
[Die07] Dietz, Laura ; Bickel, Steffen ; Scheffer, Tobias: Unsupervised Prediction of Citation Influences. In: Proceedings of the 24th International Conference on Machine Learning. Corvallis, Oregon, USA, June 2007
[ .pdf | Watch the Talk | Try out the Citation Influence Browser ]
Publication repositories contain an abundance of information about the evolution of scientific research areas. We address the problem of creating a visualization of a research area that describes the flow of topics between papers, quantifies the impact that papers have on each other, and helps to identify key contributions. To this end, we devise a probabilistic topic model that explains the generation of documents; the model incorporates the aspects of topical innovation and topical inheritance via citations. We evaluate the model's ability to predict the strength of influence of citations against manually rated citations.


[Die06b] Dietz, Laura: Probabilistic Topic Models to Support a Scientific Community (DC-Poster). In: Poster at the Doctoral Consortium of the Twenty-First National Conference on Artificial Intelligence (AAAI06), 2006
[ bib ]
Excellent research requires a good understanding of the community in terms of publications, researchers and topics in order to quickly find potential research gaps as well as cooperation partners for specific tasks. Such effective community support is especially required by researchers who are new to the field as well as in cross-disciplinary areas.

The contribution of my thesis will be a set of probabilistic models and algorithms for automatic recognition of thematic aspects in relations of actors and publications. Unsupervised data-driven algorithms are required to allow for application by the end user without expert knowledge.

One example is the recognition of the topical aspect in a cites relation, which aims to distinguish background literature from publications that had a significant impact on the citing document. This makes it possible to give historiographic details on the flow of ideas within the research field. Another example is the clarification of topical roles of co-authors in a joint authoring process, which will give rise to topic-based social network analysis.

[DS06] Dietz, Laura ; Stewart, Avare: Utilize Probabilistic Topic Models to Enrich Knowledge Bases. In: Proceedings of the ESWC 2006 Workshop on Mastering the Gap: From Information Extraction to Semantic Representation
[ bib | .pdf ]
In publication-driven domains such as the scientific community, the availability of topic information in the form of a taxonomy and associated publications is essential. State-of-the-art methods for topic extraction in the Semantic Web community either need high manual effort (e.g. when using categorization) or rely on error-prone techniques such as hierarchical clustering. We present an alternative solution that uses probabilistic topic models, a technique for unsupervised topic extraction based on statistical inference. The topic model can autonomously perform tasks that require massive data processing, such as identifying topics and associations of publications to multiple topics. Only for tasks that require intellectual activity and for which no reliable automated techniques are available is the user asked for assistance. In this work we explicate how the results of the topic model are stored in a knowledge base for later reuse. We describe how the stored information can be interpreted to provide diagnostic support for manual topic refinement. We delineate how the extracted topic information can be exploited in a community service application for the end user.
[Die06a] Dietz, Laura: Exploring Social Topic Networks with the Author-Topic Model. In: Workshop Semantic Network Analysis at European Semantic Web Conference '06. Budva, Montenegro, June 2006, 54-60
[ bib | .pdf ]
Most approaches in Social Network Analysis have ignored the contents of collaboration artifacts. The goal of this work is to combine social network analyses with probabilistic topic models for identifying sub-communities with non-exclusive membership based on the co-authorship relation as well as topical similarity. The results of this work are integrated in the VIKE-Framework to support researchers in exploring scientific communities.
[Die05] Dietz, Laura: Stimulating Massively Multiplayer Cooperation with Co-located Game Concepts. In: Workshop Gaming Applications in Pervasive Computing Environments at Pervasive '05. Munich, Germany, May 2005
[ bib ]
[TD05] Tandler, Peter ; Dietz, Laura: Cooperation in Ubiquitous Computing: An Extended View on Sharing. In: Hemmje, Matthias (Hrsg.) ; Niederee, Claudia (Hrsg.) ; Risse, Thomas (Hrsg.): From Integrated Publication and Information Systems to Virtual Information and Knowledge Environments Bd. 3379, Springer, 2005, S. 241-250
[ bib ]
[PDT04] Pape, Sebastian ; Dietz, Laura ; Tandler, Peter: Single Display Gaming: Examining Collaborative Games for Multi-User Tabletops. In: Workshop on Gaming Applications in Pervasive Computing Environments at Pervasive '04. Vienna, Austria, 2004
[ bib ]
[DDFW03a] Dawabi, Peter ; Dietz, Laura ; Fernandez, Alejandro ; Wessner, Martin: ConcertStudeo: Using PDAs to support face-to-face learning. In: Wasson, B. (Hrsg.) ; Baggetun, R. (Hrsg.) ; Hoppe, U. (Hrsg.) ; Ludvigsen, S. (Hrsg.): International Conference on Computer Support for Collaborative Learning 2003 - Community Events. Bergen, Norway, June 2003, S. 235-237
[ bib ]
[DDW03] Dawabi, Peter ; Dietz, Laura ; Wessner, Martin: Elektronisches Lernen mit Taschencomputern - Neue Technologie oder neue Didaktik? In: Magazin LOGIN (2003), Nr. 125, S. 17-20
[ bib ]
[DDFW03b] Dietz, Laura ; Dawabi, Peter ; Fernandez, Alejandro ; Wessner, Martin: Improving Face-to-Face Learning with ConcertStudeo. In: Learning Technology 5 (2003), Nr. 2
[ bib ]

Teaching


WS 2009/2010: Tutor of Information Retrieval and Data Mining (Prof. Weikum), Saarland University, Saarbruecken.

WS 2007/2008: Seminar on Link Mining (Prof. Scheffer), Humboldt University, Berlin.


