Max Planck Institut Informatik

Research


Research Areas

We work on machine learning and applications in the areas of information retrieval and language processing. Currently, our work focuses on the following research topics.



Funded Research Projects

Mining Jazz Data to Assess Development Processes

Funding: IBM, Jazz Faculty Grant
Duration: 01/2008-12/2008
Principal Investigators: Andreas Zeller, Tobias Scheffer

What is it that makes a good development process? We want to develop a plug-in that learns from collaboration and defect data as tracked by Jazz, relates features of the collaborative development process to the defect density of individual components, and thereby automatically predicts code quality. For instance, the plug-in might advise that package P should be reviewed more, because a new dependency on compiler internals has been added shortly before the release date by a developer who is new to the team.

Server-Side Security

Funding: Strato Rechenzentrum AG
Duration: since 07/2005
Project Members: Michael Brückner, Uwe Dick, Peter Haider

Strato is a European provider of webspace and server hosting services. We analyze the adversarial classification problem of spam identification. Spam filtering is a game between two opponents, spam sender and spam filter, that react to each other's moves. We seek to identify a winning strategy that cannot easily be dodged by spam senders. In cooperation with Strato AG, we have developed a spam filter that now processes roughly 1 percent of all emails sent and received worldwide.

Intrusions are attempted on a daily basis. Usually, attackers seek to exploit insecure web sites in order to send large volumes of spam email via Strato's email servers. We are developing an intelligent monitoring system that tracks HTTP requests and discriminates legitimate use of a web site from attempts to exploit insecure scripts.

Data and Text Mining in Quality and Service

Funding: DaimlerChrysler AG
Duration: 08/2005-07/2008
Project Members: Sascha Schulz

We study the problem of discovering trends and new developments in production and warranty databases as well as in workshop reports. We develop technologies that automatically identify such trends and discover their hidden causes. The goal of this project is the constructive analysis of data mining methods that lead to improved service processes by integrating and analyzing textual information and data from multiple, heterogeneous and distributed databases.

Modelling and Optimization of Dialysis Treatment

Funding: Fresenius-affiliate NephroCare e-Services GmbH
Duration: since 04/2008
Project Members: Jochen Fischer, Manuel Stritt

We investigate model-building and the generation of actionable knowledge from records of dialysis treatments.

Scalable Ranking of Online Ads

Funding: nugg.ad AG
Duration: since 02/2007
Project Members: Christoph Sawade, Arvid Terzibaschian

In this project, we investigate efficient algorithms that predict which ad a user is most likely to click on, based on that user's past clicking behavior and all other information that is available.
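
The prediction task can be illustrated with a deliberately simple baseline. The sketch below is not the project's algorithm: it ranks ads by a smoothed click-through rate computed from one user's past clicks. The function name `rank_ads`, the log format `(user, ad, clicked)`, and the prior values are assumptions made for illustration only.

```python
from collections import defaultdict


def rank_ads(click_log, user, ads, prior_clicks=1.0, prior_views=20.0):
    """Rank `ads` for `user` by a smoothed estimate of click probability.

    click_log: iterable of (user, ad, clicked) tuples (hypothetical format).
    prior_clicks / prior_views: additive smoothing, so ads the user has
    rarely or never seen fall back to a prior rate instead of 0 or 1.
    """
    views = defaultdict(int)
    clicks = defaultdict(int)
    for log_user, ad, clicked in click_log:
        if log_user == user:
            views[ad] += 1
            clicks[ad] += int(clicked)

    def score(ad):
        return (clicks[ad] + prior_clicks) / (views[ad] + prior_views)

    # Highest estimated click probability first.
    return sorted(ads, key=score, reverse=True)


log = [
    ("u1", "cars", True),
    ("u1", "cars", False),
    ("u1", "shoes", False),
    ("u2", "shoes", True),
]
ranking = rank_ads(log, "u1", ["shoes", "cars", "books"])
```

A baseline like this uses only the user's own click history. The "all other information that is available" mentioned above would enter as additional features of a learned model, which this sketch does not attempt.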

Text Mining: Knowledge Discovery in Text Databases and Efficient Document Processing

(German project title: Text Mining: Wissensentdeckung in Textsammlungen und Effizienz von Dokumentenverarbeitungsprozessen)

Funding: German Research Foundation (DFG)
Duration: 06/2003-06/2008
Project Members: Steffen Bickel, Ulf Brefeld

The number of documents available in archives and on the web is growing exponentially. This growth creates a demand for methods that automatically analyze large volumes of documents and discover and utilize the valuable knowledge they contain. A substantial part of our working processes consists of processing (i.e., reading, writing, manipulating) documents. Many tools, such as file systems, databases, and document management systems, support the administration of text documents. Far more effort and expense, however, goes into the actual document manipulation processes, such as writing documents. Any support of these processes requires substantial knowledge; supporting document processing is therefore much more difficult than supporting document administration.
The goal of the "Text Mining" project is to develop and study text mining algorithms that discover knowledge in large document archives and use this knowledge to support future text manipulation processes.
