Summer 2003: Algorithm Library Design
Projects
The course requires a project for grading (40%). The project is
chosen on mutual agreement with the lecturers. The student can propose
a project or the student can choose from the suggested projects
below. The project could be, for example, redesigning some part of an
existing system or another project (thesis work etc.) into a library
module. Some projects are intentionally "open end" and can be extended
to a FoPra, bachelor, master or diploma thesis at the end of the
course.
After four weeks (Tuesday, 20.05.03), a short presentation of the
project is expected in class. The student presents the problem, the
objectives that the student wishes to address with the design, and the
first plans for the design and its realization. At the end of the term
(Thursday, 24.07.03) a final presentation is expected. We plan to add
further milestones for the project to ensure a steady progress.
The projects have to be realized in C++. We suggest teamwork, and
if several students pick the same project, we insist on teams. Teams
can consist of two or three students. Two teams can share the same
project. Students that come later than the first two teams of the same
project have to select a different project.
Suggestions for Projects
No previous knowledge in the domain is required.
-
Exact String Matching
[Kärkkäinen]
-
Compact Container for Graphs [Pion]
-
C++ allocators based on the design
of vmalloc
[Kärkkäinen]
-
Polygon Triangulation [Kettner]
-
Bucketing/Grid Data Structure [Pion,Kettner]
-
Binary Space Partition (BSP) Tree Data Structure [Pion,Kettner]
-
Priority Queues: generative programming with different requirements
on the priority queue, such as monotone priority queue, existence of
a decrease-key function etc. [Kärkkäinen]
-
Radix Sorting: various algorithms that all work on a generic type,
e.g. integers or doubles. [Kärkkäinen]
-
Bit-vector: targeted for bit-parallel algorithms, for example realized
with machine-word sized integers. Specializations for fixed number of bits.
[Kärkkäinen,Pion]
-
Generic CGAL Kernel, for example, 2D projection of 3D geometry [Pion]
-
Implement R.J. Lipton, R.E. Tarjan. A Separator Theorem for planar
graphs. SIAM J. Appl. Math. 36 (1979), pp. 177-189, as a generic
graph algorithm to work on graphs in the Boost
Graph Library, on the Halfedge Data Structure in
CGAL, and on graphs in LEDA.
[Kettner]
Previous knowledge in the domain is required!
-
Generic Mesh Algorithms [Kettner]
-
Point Location in 2D Triangulations and Planar Maps [Pion]
-
Image Processing Library: 2d-iterator, filter algorithms. [Kettner,Pion]
-
Generic compression algorithms. Huffman, Lempel-Ziv, arithmetic coding,
Burrows-Wheeler transform (bzip2). [Kärkkäinen,Pion,Kettner]
Shared with FoPra: An Algorithm Library for Massive Data Sets
Projects can also be shared with the FoPra An
Algorithm Library for Massive Data Sets. Note that this of
course adds an additional twist on the FoPra to be a project in this
class. Amonge the possible projects are (details similar to above
projects will be worked out on demand):
- Implement simple STL containers (queues, vectors,...), iterators, and
algorithms (find, for_each,...). How should simple STL components be
refined? For example giving hints for the kind of accesses in the future
so that the library can do prefetching.
- B-Trees: THE external memory search tree data structure.
Can be used to implement the STL sorted container classes (multi)map and (multi)set
- Priority Queues.
- Sorting Strings This
is algorithmically interesting because it is not quite obvious how
to achieve time O(N + n log n) for sorting n strings of total length N.
It is also interesting as an example of an algorithm that works with variable length
items. How can our library elegantly support that?
- Suffix Array Construction.
The key to many string applications: Compression using Burrows-Wheeler transform,
full text indexing, various Bioinformatics applications. Can our library
support efficient implementation of this nontrivial algorithm using
its built in primitives for sorting etc.?
- An elegent implementation of pipelining: Often the output of one
operation on a sequence of elements (filtering, transforming, sorting,...)
is the input for other operations. We can save I/Os by offering a mechanism
similar to the unix pipe ("|") operator. How can this be implemented elegantly
in an STL-like library?
- Implement a class Relation that supports the typical operation of
a relational database (select, join, project).
- External memory numerics: Sparse matrices, solving partial differential equations,...
- External memory minimum spanning trees
- External memory graph partitioning for planar graphs
- Geometric data structures and algorithms
- Frequent Item Set Data Mining
Lutz Kettner
(
<surname>@mpi-inf.mpg.de).
Last modified on Tuesday, 17-Jan-2006 17:53:40 MET.