Exploratory Data Analysis and Interactive Visualizations – Winter Semester 2009/2010
Mike Sips (VIS), Arturas Mazeika (Data Mining)
Turning Prefuse into a Web Graph Exploration System
Prefuse is a set of software tools for creating interactive data
visualizations. Prefuse supports a rich set of features for data modeling,
visualization, and interaction. It provides optimized data structures for
tables, graphs, and trees, a host of layout and visual encoding techniques,
and support for animation, dynamic queries, integrated search, and database
connectivity. Examples of graph visual templates include force-directed
graphs (Figure 1(a)), tree visualizations (Figure 1(b)), radial
graph visualizations (Figure 1(c)), and treemaps (Figure 1(d)). Flow-map
visualizations (Figure 1(e)) have also been proposed for prefuse.
Prefuse also provides general-purpose visual templates such as scatterplots
(Figure 1(f)). Because prefuse is a general-purpose visualization toolkit,
extensions with additional visualization templates, as well as integrated
visualizations that support the analytical process, have been proposed.
Figure 1: Visualization templates in prefuse: (a) Force-directed,
(b) Tree, (c) Radial, (d) Treemap, (e) Flow-map, (f) Scatterplots
The goal of the project is to develop an interactive data
visualization and analysis tool for archives of web sites. The
development of the tool should be based on the following principles:
- Interaction with the data. The web archive data is modeled primarily
with two database tables: the t_pages table (recording information
about each page, including its url, size, last-modified and visited
timestamps, status code, crawl week, etc.) and the t_links table
(primarily recording the from_url and to_url of each link). The tool
should offer visualizations that are based on t_pages (scatterplots),
t_links (graph visualizations), or both tables. The user should be able
to interactively select a subset of the data to visualize through the
GUI (for example, a certain crawl week and site). The data should be
pre-fetched if needed to support interactive visualizations.
- Visualizations and interaction between them. The tool should
offer a number of standard prefuse interactive visualizations to
visualize the data. Visualizations should be linked, i.e., a change in
the display of one visualization should trigger an update of the other
visualizations. The tool should offer at least one graph visualization
(for example, a radial visualization) and at least one non-graph
visualization (for example, scatterplots).
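The subset selection described in the first principle could be sketched as follows. This is only an illustration in plain Java, not prefuse code: the classes PageRow, LinkRow, and WebArchiveSubset are hypothetical, only a few of the t_pages columns are mirrored, the crawl week is assumed to be an integer, and the "site" of a page is assumed to be the host part of its url. The sketch keeps the pages of one crawl week and site, and then only the links whose two endpoints are both among the kept pages, so that the scatterplot and the graph visualization work on the same subset.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory mirror of one t_pages row (only a few columns).
final class PageRow {
    final String url;
    final int crawlWeek; // assumption: crawl week stored as an integer
    final long size;     // assumption: page size in bytes

    PageRow(String url, int crawlWeek, long size) {
        this.url = url;
        this.crawlWeek = crawlWeek;
        this.size = size;
    }
}

// Hypothetical in-memory mirror of one t_links row.
final class LinkRow {
    final String fromUrl;
    final String toUrl;

    LinkRow(String fromUrl, String toUrl) {
        this.fromUrl = fromUrl;
        this.toUrl = toUrl;
    }
}

// A consistent subset of pages and links for one crawl week and one site.
final class WebArchiveSubset {
    final List<PageRow> pages = new ArrayList<>();
    final List<LinkRow> links = new ArrayList<>();

    // Assumption: the "site" of a page is the host part of its url.
    static String hostOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return host == null ? "" : host;
        } catch (IllegalArgumentException e) {
            return ""; // malformed url: treat as belonging to no site
        }
    }

    static WebArchiveSubset select(List<PageRow> allPages, List<LinkRow> allLinks,
                                   int crawlWeek, String site) {
        WebArchiveSubset subset = new WebArchiveSubset();
        Set<String> keptUrls = new HashSet<>();
        // Keep the pages that match the requested crawl week and site.
        for (PageRow page : allPages) {
            if (page.crawlWeek == crawlWeek && site.equals(hostOf(page.url))) {
                subset.pages.add(page);
                keptUrls.add(page.url);
            }
        }
        // Keep only links whose source and target are both kept pages.
        for (LinkRow link : allLinks) {
            if (keptUrls.contains(link.fromUrl) && keptUrls.contains(link.toUrl)) {
                subset.links.add(link);
            }
        }
        return subset;
    }
}
```

In a real tool the same restriction would more likely be pushed into the database query, with the result pre-fetched in the background; filtering in memory is used here only to keep the sketch self-contained.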
Suggested milestones and objectives:
- Install prefuse and related extensions
- Play with the data that comes with prefuse. Try to apply the same
data to different visualizations. Think about how different
visualizations can be used together to better explore the data
- Visualize web graph data with different visualizations
- Think about what kinds of subsets of the web data the tool should
support; implement a GUI to query the data for the designed subsets
- Turn prefuse into a data exploration tool: let the explorer show
visualizations; the visualizations should be linked, so that an update
of one visualization triggers an update of the other visualizations.
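One possible way to link the visualizations is a shared selection model that every view listens to. The sketch below is plain Java and does not use the prefuse API; SelectionListener, SharedSelection, and RecordingView are hypothetical names, and pages are assumed to be identified by their url. A real implementation would replace RecordingView with the actual graph and scatterplot views.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical callback implemented by every linked view.
interface SelectionListener {
    void selectionChanged(Set<String> selectedUrls);
}

// Shared model: one view changes the selection, all views are notified.
final class SharedSelection {
    private final Set<String> selected = new LinkedHashSet<>();
    private final List<SelectionListener> listeners = new ArrayList<>();

    void addListener(SelectionListener listener) {
        listeners.add(listener);
    }

    void setSelection(Set<String> urls) {
        if (selected.equals(urls)) {
            return; // unchanged selection: skip notification to avoid update loops
        }
        selected.clear();
        selected.addAll(urls);
        // Hand out a read-only copy so that views cannot modify the model.
        Set<String> snapshot =
            Collections.unmodifiableSet(new LinkedHashSet<>(selected));
        for (SelectionListener listener : listeners) {
            listener.selectionChanged(snapshot);
        }
    }
}

// Stand-in for a real view (for example a radial graph or a scatterplot);
// it only records what it was asked to show.
final class RecordingView implements SelectionListener {
    final String name;
    int updates = 0;
    Set<String> shown = Collections.emptySet();

    RecordingView(String name) {
        this.name = name;
    }

    @Override
    public void selectionChanged(Set<String> selectedUrls) {
        shown = selectedUrls;
        updates++;
    }
}
```

Registering both views with the same SharedSelection means that a selection made in either one reaches the other through setSelection. The equality check at the top of setSelection is a deliberate choice: it stops two views from re-triggering each other indefinitely when each reacts to an update by setting the same selection again.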