Max-Planck-Institut für Informatik

Exploratory Data Analysis and Interactive Visualizations – Winter Semester 2009/2010

Mike Sips (VIS)     Arturas Mazeika (Data Mining)


Class Schedule

  • Lecture: Monday 10-12 Room 023 (Building E1.4 (Max-Planck-Institut))
  • Project/Lab Work: Friday 10.30-12 and 13-15 Room 019 (Building E1.4 (Max-Planck-Institut))
  • The class will continue on Monday, November 9, with the exploratory data analysis class.

Aim and General Information

Graphical inventions and displays are an important external aid that helps the human analyst reveal hidden patterns in large and complex data spaces. In this class, students will get acquainted with the latest research in exploratory data analysis and interactive visualization. Students will implement their own visualization technique. The goal of this class is to develop a clever and easy-to-use visualization that makes a difficult problem easy to solve, rather than to implement a complex visualization tool.
Students will have to team up in groups of up to three students to implement a visualization tool. The final grade of the course will be determined based on (a) the group's ability to demonstrate the merits of its implemented visualization technique in solving the problem at hand (80%), and (b) each group member's knowledge of basic concepts in exploratory data analysis and interactive visualization (20%).
Groups are highly encouraged to submit their own project proposal, but each group can instead choose one of the project proposals listed below. Only students who are interested in addressing the same problem should team up in a group. Our project proposals describe research problems currently considered "hot" in the visualization and data mining communities. Each project proposal has a short problem statement and a suggested plan of work. Students will need to read and digest the given papers to become familiar with the state of the art. Each group will need to propose an idea for a visualization technique that is either (a) an extension of the state of the art, or (b) a new and clever idea.
The class will meet twice a week, for a lecture (2h) and for laboratory work (4h). The lecture will discuss basic concepts of visualization, exploratory data analysis, and visual analytics, as well as the latest research topics. The laboratory work includes discussion of project milestones, design decisions, and feedback, and is meant to keep the groups on track to implement a clever solution.

Topics of the class

  • Information Visualization
  • Perception and Cognition
  • Interaction Techniques
  • Exploratory Analysis and Playing with the Data
  • Visual Analytics


Targeted Students

There are no prerequisites for the class, and the class is open to students at all levels, including bachelor students. However, a basic working knowledge of a graphics API (e.g. OpenGL, Java2D/3D) will be useful. The programming project can be developed using any suitable language. While these APIs, applications and languages will not be taught in the class, many introductory tutorials at the level required for the class are available on the web. The class language is English. The class has 9 credit points.


Final Grade

  • Demo + Questions My Vis Tool: 80%
  • Oral Lecture Examination: 20%


Project Proposal Submission

Groups are highly encouraged to submit their own project proposal. It is also possible to choose a project proposal listed below. All groups will be required to present their proposal for the programming project in the first laboratory assignment. There will be room for discussion and feedback. Note that all groups, whether they work on their own proposal or on a selected one, are required to submit a research proposal no later than the announced deadline. We will send a notification of acceptance for both selected and own proposals.

Each proposal should include:

  • Problem Statement (150 words)
  • Why is the problem interesting? (150 words; only own proposal)
  • Sketch of the proposed solution (Initial manual drawing + short description)
  • What are the merits of your solution? (250 words)
  • Related Work (max. 5 references; only own proposal)
  • Plan of work including milestones
  • Name and skills of each group member

Deadline: postponed to October 31 2009


Suggested Projects (can be accessed within the campus network only)


(a) Information Visualization

  • Heterogeneous Data/Wikipedia
    Wikipedia is an example of a large and semi-structured data set. How can visualization help to explore this data space?
  • Million Data Items
    Today's data sets usually have millions of data items. How can millions of data items be visualized on screens with limited visual capacity?
  • High Dimensional Data Spaces
    Since graphical displays are composed of two spatial variables and a limited number of visual variables, the maximum number of dimensions that can be shown in any one view is roughly 3-8. How can potentially interesting features of n-D data spaces be visualized faithfully in low-dimensional views?
  • Space and Time
    Space and Time are ubiquitous. They can be visualized with space-time cubes and classical visualization techniques, but when such data sets are large, more clever techniques are needed to visualize and interact with them.
  • Mobile/Small Displays
    Small displays offer too little space for showing information with lots of detail. What could a solution to this problem look like?
  • Value of Visualization
    Visualization can be seen as an image synthesis process. Since many parameters are involved (mapping to visual variables and properties of the data), the process usually creates a huge space of possible visualizations of the data. The challenge is: Which visualization represents the data best?
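
As a rough illustration of the "Million Data Items" problem above, one common starting point is to aggregate points into screen-sized bins and display counts instead of individual items. The sketch below is not part of the course material: the function names bin_points and render_ascii, the grid size, the character ramp, and the synthetic Gaussian data are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def bin_points(points, width, height):
    """Count (x, y) points per cell of a width x height grid.

    Coordinates are assumed to lie in [0, 1); values outside that
    range are clamped to the nearest border cell.
    """
    counts = Counter()
    for x, y in points:
        col = min(max(int(x * width), 0), width - 1)
        row = min(max(int(y * height), 0), height - 1)
        counts[(col, row)] += 1
    return counts


def render_ascii(counts, width, height, ramp=" .:*#"):
    """Map each cell count to a character; denser cells get darker glyphs."""
    peak = max(counts.values(), default=1)
    lines = []
    for row in range(height):
        chars = []
        for col in range(width):
            c = counts.get((col, row), 0)
            # Empty cells use ramp[0]; non-empty cells use ramp[1:].
            idx = 0 if c == 0 else 1 + (c * (len(ramp) - 2)) // peak
            chars.append(ramp[idx])
        lines.append("".join(chars))
    return "\n".join(lines)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the picture is reproducible
    # Hypothetical data: 200,000 points clustered around the center.
    points = [(rng.gauss(0.5, 0.15), rng.gauss(0.5, 0.15))
              for _ in range(200_000)]
    print(render_ascii(bin_points(points, 60, 20), 60, 20))
```

The output has a fixed size of width x height characters no matter how many points go in, which is the point of the exercise; a project would still have to decide how to handle outliers, color scales, and interaction on top of such an aggregation.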


(b) Interaction Techniques


(c) Exploratory Data Analysis


Vis Competition

In addition to the final exam, students may optionally participate in a visualization competition. While grades for the projects are based mainly on "usability merit", entries in the competition will be judged on technical merit, compelling visual encoding and originality. The winner will be selected by a panel of visualization experts from industry and academia. The competition will take place at the end of the semester.


Schedule

Data used in the classes: data.tgz (76MB), data generation code (2.2MB)
