Max-Planck-Institut für Informatik
D5: Databases and Information Systems
Campus E1.4, Room 429
66123 Saarbrücken
Germany
Email: Get my email address via email
Phone: +49 681 9325 529
Fax: +49 681 9325 599
PGP ID: 0xF5F7017F
My research is based on probabilistic modeling for unsupervised or weakly supervised ranking tasks. Within my thesis, I devise models for three use cases on which black-box approaches fail:
a) Identifying influential publications in citation networks. [citation] [pdf] [talk] [project page]
b) Predicting defective code statements when debugging software. [citation] [pdf] [supplement] [project page]
c) Unsupervised recommendation of songs based on user-user and user-item interactions in online music-sharing communities such as Zune Social. [citation] [pdf]
| [Die09] |
Dietz, Laura: Modeling Shared Tastes in Online Communities. In:
NIPS Workshop on Applications for Topic Models: Text
and Beyond [ .pdf ] Many social networking platforms track user-item interactions and friend relationships between users. The goal of this work is to learn about tastes that are shared by friends. We devise a probabilistic model in which tastes explain both friend relationships and item interactions as latent factors. After estimation, shared tastes can be used to predict common preferences of befriended users, predict further item interactions, and give an overview of the network of users who share the same taste. |
| [DDZS09] |
Dietz, Laura ; Dallmeier, Valentin ; Zeller, Andreas ; Scheffer,
Tobias: Localizing Bugs in Program Executions with Graphical Models.
In:
Advances in Neural Information Processing Systems [ bib | .pdf | supplement ] We devise a graphical model that supports the process of debugging software by guiding developers to code that is likely to contain defects. The model is trained using execution traces of passing test runs; it reflects the distribution over transitional patterns of code positions. Given a failing test case, the model determines the least likely transitional pattern in the execution trace. The model is designed such that Bayesian inference has a closed-form solution. We evaluate the Bernoulli graph model on data of the software projects AspectJ and Rhino. |
| [DD08] |
Dietz, Laura ; Dallmeier, Valentin: Probabilistic Graph Models for
Debugging Software. In: Proceedings of NIPS 2008 Workshop on
Analyzing Graphs: Theory and Applications [ bib | .html | .pdf ] Of all software development activities, debugging (locating the defective source code statements that cause a failure) can be by far the most time-consuming. We employ probabilistic modeling to support programmers in finding defective code. Most defects are identifiable in control flow graphs of software traces. A trace is represented by a sequence of code positions (line numbers in source filenames) that are executed when the software runs. The control flow graph represents the finite state machine of the program, in which states depict code positions and arcs indicate valid follow-up code positions. In this work, we extend this definition towards an n-gram control flow graph, where a state represents a fragment of subsequent code positions, also referred to as an n-gram of code positions. We devise a probabilistic model for such graphs in order to infer code positions in which anomalous program behavior can be observed. This model is evaluated on real-world data obtained from the open source AspectJ project and compared to the well-known multinomial and multivariate Bernoulli models. |
| [DB07] |
Dietz, Laura ; Bickel, Steffen: Modeling Evolution of Ideas in the Web
of Science. In: NIPS Workshop on Statistical Network Models [ bib | .pdf ] When reading up on a new topic, researchers need to get a quick overview of a research area. Especially when reading a publication in an unfamiliar area, the reader is interested in how this publication is embedded in the web of science. The goal is to give the reader an overview of the evolution of ideas leading to the examined publication. We want to analyze this evolution by inferring and visualizing the strength of topical similarity between linked publications in the neighborhood of the initially examined work. |
| [Die07] |
Dietz, Laura ; Bickel, Steffen ; Scheffer, Tobias:
Unsupervised Prediction of Citation Influences.
In: Proceedings of the 24th International
Conference on Machine Learning.
Corvallis, Oregon, USA, June 2007 [ .pdf | talk | Citation Influence Browser ] Publication repositories contain an abundance of information about the evolution of scientific research areas. We address the problem of creating a visualization of a research area that describes the flow of topics between papers, quantifies the impact that papers have on each other, and helps to identify key contributions. To this end, we devise a probabilistic topic model that explains the generation of documents; the model incorporates the aspects of topical innovation and topical inheritance via citations. We evaluate the model's ability to predict the strength of influence of citations against manually rated citations. |
| [Die06b] |
Dietz, Laura: Probabilistic Topic Models to Support a Scientific
Community (DC-Poster). In: Poster at the Doctoral Consortium
of the Twenty-First National Conference on Artificial Intelligence
(AAAI06), 2006 [ bib ] Excellent research requires a good understanding of the community in terms of its publications, researchers, and topics in order to quickly find potential research gaps as well as cooperation partners for specific tasks. Such effective community support is especially required by researchers who are new to the field as well as in cross-disciplinary areas. |
| [DS06] |
Dietz, Laura ; Stewart, Avare: Utilize Probabilistic Topic Models to
Enrich Knowledge Bases. In: Proceedings of the ESWC 2006
Workshop on Mastering the Gap: From Information Extraction to Semantic
Representation [ bib | .pdf ] In publication-driven domains such as the scientific community, the availability of topic information in the form of a taxonomy and associated publications is essential. State-of-the-art methods for topic extraction in the Semantic Web community either need high manual effort (e.g. when using categorization) or rely on error-prone techniques such as hierarchical clustering. We present an alternative solution that uses probabilistic topic models, a technique for unsupervised topic extraction based on statistical inference. The topic model can autonomously perform tasks that require massive data processing, such as identifying topics and associations of publications to multiple topics. Only for tasks that require intellectual activity and for which no reliable automated techniques are available is the user asked for assistance. In this work we explicate how the results of the topic model are stored in a knowledge base for later reuse. We describe how the stored information can be interpreted to provide diagnostic support for the manual topic refinement. We delineate how the extracted topic information can be exploited in a community service application for the end user. |
| [Die06a] |
Dietz, Laura: Exploring Social Topic Networks with the Author-Topic
Model. In: Workshop Semantic Network Analysis
at European Semantic Web Conference '06. Budva, Montenegro,
June 2006, 54-60 [ bib | .pdf ] Most approaches in Social Network Analysis have ignored the contents of collaboration artifacts. The goal of this work is to combine social network analyses with probabilistic topic models for identifying sub-communities with non-exclusive membership based on the co-authorship relation as well as topical similarity. The results of this work are integrated in the VIKE-Framework to support researchers in exploring scientific communities. |
| [Die05] |
Dietz, Laura: Stimulating Massively Multiplayer Cooperation with
Co-located Game Concepts. In: Workshop Gaming
Applications in Pervasive Computing Environments at
Pervasive '05. Munich, Germany, May 2005 [ bib ] |
| [TD05] |
Tandler, Peter ; Dietz, Laura: Cooperation in Ubiquitous Computing: An
Extended View on Sharing. In: Hemmje, Matthias (Ed.) ; Niederee,
Claudia (Ed.) ; Risse, Thomas (Ed.): From Integrated
Publication and Information Systems to Virtual Information and
Knowledge Environments Vol. 3379, Springer, 2005, pp. 241-250 [ bib ] |
| [PDT04] |
Pape, Sebastian ; Dietz, Laura ; Tandler, Peter: Single Display Gaming:
Examining Collaborative Games for Multi-User Tabletops. In: Workshop
on Gaming Applications in Pervasive Computing Environments at Pervasive
'04. Vienna, Austria, 2004 [ bib ] |
| [DDFW03a] |
Dawabi, Peter ; Dietz, Laura ; Fernandez, Alejandro ; Wessner, Martin:
ConcertStudeo: Using PDAs to support face-to-face learning. In: Wasson,
B. (Ed.) ; Baggetun, R. (Ed.) ; Hoppe, U. (Ed.) ; Ludvigsen, S.
(Ed.): International Conference on Computer Support for
Collaborative Learning 2003 - Community Events. Bergen,
Norway, June 2003, pp. 235-237 [ bib ] |
| [DDW03] |
Dawabi, Peter ; Dietz, Laura ; Wessner, Martin: Elektronisches Lernen
mit Taschencomputern - Neue Technologie oder neue Didaktik? In: Magazin
LOGIN (2003), No. 125, pp. 17-20 [ bib ] |
| [DDFW03b] |
Dietz, Laura ; Dawabi, Peter ; Fernandez, Alejandro ; Wessner, Martin:
Improving Face-to-Face Learning with ConcertStudeo. In: Learning
Technology 5 (2003), No. 2 [ bib ] |
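The bug-localization idea described in the abstracts of [DDZS09] and [DD08] above (learn transition patterns from passing runs, then look for the least likely transition in a failing run) can be illustrated with a minimal sketch. This is not the Bernoulli graph model from the papers: it assumes a plain Laplace-smoothed bigram estimate, and the function names, the `alpha` prior, and the example code positions are hypothetical.

```python
from collections import Counter, defaultdict


def train_transitions(passing_traces):
    """Count transitions between consecutive code positions in passing runs."""
    counts = defaultdict(Counter)
    for trace in passing_traces:
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
    return counts


def transition_probability(counts, src, dst, vocab_size, alpha=1.0):
    """Laplace-smoothed estimate of P(dst | src); alpha is an assumed prior."""
    total = sum(counts[src].values())
    return (counts[src][dst] + alpha) / (total + alpha * vocab_size)


def least_likely_transition(counts, failing_trace, vocab_size, alpha=1.0):
    """Return the transition of the failing trace with the lowest probability."""
    scored = [
        (transition_probability(counts, s, d, vocab_size, alpha), (s, d))
        for s, d in zip(failing_trace, failing_trace[1:])
    ]
    prob, transition = min(scored)
    return transition, prob


# Hypothetical traces: each entry is a code position "file:line".
passing = [
    ["A.java:1", "A.java:2", "A.java:3", "A.java:5"],
    ["A.java:1", "A.java:2", "A.java:3", "A.java:5"],
    ["A.java:1", "A.java:2", "A.java:4", "A.java:5"],
]
failing = ["A.java:1", "A.java:2", "A.java:5"]

vocab_size = len({pos for trace in passing + [failing] for pos in trace})
model = train_transitions(passing)
suspect, prob = least_likely_transition(model, failing, vocab_size)
print(suspect, prob)
```

In this toy example the jump from `A.java:2` directly to `A.java:5` never occurs in a passing run, so it is reported as the most suspicious transition.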
WS 2009/2010: Tutor for Information Retrieval and Data Mining (Prof. Weikum), Saarland University, Saarbrücken.
WS 2007/2008: Seminar on Link Mining (Prof. Scheffer), Humboldt University, Berlin.
Implementation of Latent Dirichlet Allocation (via Gibbs Sampling)
LdaWrapper.jar [executable jar, libs included]
LdaWrapper-src.jar [source]
LdaWrapper.info.txt [info]
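For readers unfamiliar with the technique, the following is a minimal sketch of collapsed Gibbs sampling for Latent Dirichlet Allocation. It is not the code shipped in LdaWrapper.jar: the function name `lda_gibbs`, its parameters, and the default hyperparameters `alpha` and `beta` are assumptions made for illustration, and documents are assumed to be lists of integer word ids.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iterations=200, seed=0):
    """Collapsed Gibbs sampler for LDA; returns the final count tables."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]          # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                        # n(k)
    assignments = []

    # Random initialization of the topic assignment of every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional P(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


# Toy corpus over a vocabulary of four word ids.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=4)
print(doc_topic)
```

The returned tables hold the per-document topic counts and per-topic word counts of the last sample; topic and document distributions can be estimated from them by adding the priors and normalizing.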