As the computing landscape is being reshaped by the dramatic shift towards ubiquitous parallelism, and by the sheer scale of data, extracting performance from existing applications gives rise to formidable challenges. In digital geometry processing, this problem gets amplified by the unstructured nature of data (e.g. meshes) and by the serial nature of prevalent algorithmic solutions. As a result, the high performance promise of modern hardware seems elusive.
The thrust behind our research effort is to rethink solution development with parallelism built-in from the ground up as opposed to the current trend of looking at parallelization as a mere post processing step left for "Ninja coders" or automatic optimization software. We examine the whole solution development pipeline (problem abstraction, data structures, mathematical modeling, and implementation), with the goal of producing simple yet streamlined solutions capable of unleashing the computational power afforded by modern hardware.
We seek to make high performance in digital geometry processing accessible at a high level of abstraction without requiring practitioners to delve into the low level intricacies of parallelism. To facilitate deployment, our solutions need to be independent of the underlying granularity, i.e. the same code can be operated on clusters as well as on single graphics cards (GPUs). Our efforts are guided by parsimonious memory usage, improved memory access patterns, and cache performance. To avoid data redundancy, we target seamless transition between data traversal and numerics as it is a valuable asset for coding convenience and enhanced performance.
The research within the group targets three complementary areas: data structures, numerical kernels, and applications.
For unstructured meshes, we have introduced the mesh matrix, a lean sparse matrix representation which readily brings forward memory layout. We have endowed this representation with action maps, which symbolically augment standard sparse linear algebra kernels to perform both traversal and numerical computations on meshes. For dynamic scenarios, we have developed a novel hierarchical sparse matrix data structure and we have introduced a fully-dynamic graph data structure capable of intensive edge and vertex updates within the limited memory available on the Graphics Processing Unit (GPU).
Although efficient sparse matrix linear algebra kernels exist for different hardware architectures, e.g Intel MKL and Nvidia cuSPARSE, these closed source libraries focus on standard algebra primitives. Highly efficient mesh operations based on matrices will also make use of these primitives, but they are not sufficient for carrying advanced mesh processing tasks efficiently. In our formalization, mesh operations require a sequence of matrix operations, often operate locally on data, and change mesh connectivity. These operations pose new challenges for sparse matrix linear algebra formulations as high performance requires the fusion of successive matrix operations, considering cache locality and fast on-chip memory when performing mesh-local operations, and considering alternative data representations that allow for efficient changes to the matrix and thus the mesh structure. Our starting point is the development of sparse efficient matrix-vector and sparse matrix-matrix multiplication for the GPU. Our recent results on the entire University of Florida Sparse Matrix Collection, comparing against vendor provided implementations and state-of-the-art approaches are highly favorable.
So far, we have targeted some of the most commonly used and far reaching applications across various fields. All of which are believed to be hard to parallelize. In computer graphics, we addressed the problem of subdivision surfaces, a widely used modeling paradigm where an extremely simple polygonal shape representation is enriched throughout a combination of vertex insertions and smoothing iterations, which yield visually pleasant meshes suitable for use in production. In meshing, we have targeted the generation of Voronoi diagrams on surfaces. Voronoi Diagrams are a common theme across several disciplines and are arguably one the most studied problems in computational geometry. Their extension to the surface setting further complicates vectorization as metric estimation poses serious challenges. In numerical simulation, we focus on the whole pipeline from finite elements assembly to numerical solution strategies. Numerical evidence suggests that the assembly problem weighs heavily on performance and impedes scalability. Across all these application, we broke the high performance deadlock throughout novel geometric and numerical abstractions and we have shown that it is possible to eliminate the unpleasant idle-time modelers and engineers spend waiting before seeing the outcome of their modeling and simulation efforts.