GPU Scheduling and Parallel Computing in Rendering

Leader of the Group: Dr.techn. Markus Steinberger




Vision and Research Strategy

We investigate novel algorithms for scheduling dynamic workloads on the GPU, including algorithms for efficient task scheduling, memory management, and dynamic workload prioritization. Our research enables algorithms that previously did not seem suited to GPU execution to harness the full power of the GPU. Applying these insights to computer graphics, we are interested in highly efficient on-the-fly generation of procedural content, prioritized image synthesis for deadline-driven foveated rendering, and alternative rendering pipeline designs with new views on rasterization.



Research Areas and Achievements

Real-time Scheduling

While the modern graphics processing unit (GPU) offers massive parallel compute power, the ability to influence how these immense resources are scheduled is severely limited. The GPU is therefore widely considered suitable only as an externally controlled co-processor for homogeneous workloads, which greatly restricts the potential applications of GPU computing. To address this issue, we present a new method to achieve fine-grained priority scheduling on the GPU: hierarchical bucket queuing (CGF’17).
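The core idea can be illustrated with a minimal CPU-side sketch: tasks are sorted into FIFO buckets by priority level, and a summary bitmask (one level of the hierarchy) lets a consumer jump straight to the highest-priority non-empty bucket. This is an illustrative simplification, not the paper's GPU implementation, where each bucket would be a concurrent queue drained by worker blocks.

```python
from collections import deque

class BucketQueue:
    """Priority queue as an array of FIFO buckets, one per priority level.

    CPU-side sketch of bucket queuing: `occupied` keeps one summary bit
    per bucket, so a consumer finds the highest-priority non-empty bucket
    with a single bit scan instead of walking all levels.
    """

    def __init__(self, num_levels):
        self.buckets = [deque() for _ in range(num_levels)]
        self.occupied = 0  # bit i set <=> bucket i is non-empty

    def enqueue(self, priority, task):
        # Smaller index = higher priority.
        self.buckets[priority].append(task)
        self.occupied |= 1 << priority

    def dequeue(self):
        if not self.occupied:
            return None
        # Lowest set bit = highest-priority non-empty bucket.
        level = (self.occupied & -self.occupied).bit_length() - 1
        task = self.buckets[level].popleft()
        if not self.buckets[level]:
            self.occupied &= ~(1 << level)
        return task
```

On the GPU, the append and the bit updates would be implemented with atomic operations so that thousands of producer and consumer threads can operate on the queue concurrently.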


Efficient Linear Algebra

Sparse matrix-vector multiplication (SpMV) is the workhorse for a wide range of linear algebra computations. In a serial setting, naive implementations of direct and transposed multiplication achieve very competitive performance. In parallel settings, especially on graphics hardware, it is widely believed that naive implementations cannot reach the performance of highly tuned parallel implementations with complex data formats. Relying on recent advances in GPU hardware, such as fast hardware-supported atomic operations and improved cache performance, we show that a naive implementation can reach the performance of state-of-the-art SpMV implementations (HPEC’16). Building on these efficient linear algebra primitives, we investigated geometry processing strategies on top of linear algebra formulations. In this way, geometric computations and topological modifications translate into concise linear algebra operations, which can be executed efficiently on the GPU (EG’17).
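To make the "naive" baseline concrete, the following sketch shows both products for a matrix in compressed sparse row (CSR) form. In y = Ax, each row is independent, so one thread per row needs no synchronization; in y = Aᵀx, each nonzero scatters into an output entry, and on the GPU the `+=` would become an atomicAdd, which is exactly the operation that recent hardware made fast. This is a plain-Python illustration of the access pattern, not the benchmarked GPU code.

```python
def spmv(values, col_idx, row_ptr, x):
    """Naive y = A x for a CSR matrix.

    One 'thread' per row: each row gathers its nonzeros independently,
    so no synchronization between rows is required.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y

def spmv_transposed(values, col_idx, row_ptr, x, n_cols):
    """Naive y = A^T x on the same CSR data, without building A^T.

    Each nonzero scatters into y[col]; on the GPU the `+=` below would
    be an atomicAdd, since many rows may hit the same output entry.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_cols
    for row in range(n_rows):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[col_idx[k]] += values[k] * x[row]
    return y
```

The appeal of this formulation is that one data format (CSR) serves both products, avoiding the format conversions and auxiliary index structures that tuned implementations typically require.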


Procedural Generation

To advance high-performance procedural generation on the graphics processing unit (GPU), we present the concept of operator graph scheduling. The operator graph forms an intermediate representation that describes all possible operations and objects that can arise during a specific procedural generation. Using the operator graph, we show that procedural generation on the GPU can be automatically optimized, leading to speedups of 8 to 30x over the previous state of the art (SIGGRAPH Asia 2016). Building on our efficient procedural generation framework, we show that genetic algorithms (GAs) can be used to control the output of procedural modeling algorithms, efficiently generating models that fulfill high-level constraints (EG’17).
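A key scheduling decision that the operator graph enables can be sketched as follows: pending derivation tasks are grouped by operator, so that each group models one homogeneous kernel launch instead of interleaving divergent operators within a warp. The toy operators, function names, and the round-based loop below are illustrative assumptions, not the paper's actual system.

```python
from collections import defaultdict

def run_derivation(axiom_tasks, operators):
    """Batch-by-operator sketch of procedural derivation.

    Each round, pending tasks are grouped by operator; one group models
    one homogeneous GPU kernel launch. An operator maps its argument to
    (new_tasks, finished_shapes); new tasks are queued for a later round.
    """
    pending = list(axiom_tasks)
    finished = []
    while pending:
        batches = defaultdict(list)
        for op, arg in pending:
            batches[op].append(arg)
        pending = []
        for op, args in batches.items():  # one "kernel launch" per operator
            for arg in args:
                new_tasks, done = operators[op](arg)
                pending.extend(new_tasks)
                finished.extend(done)
    return finished

# Toy grammar: 'split' halves an extent until it is <= 1, then hands it
# to 'emit', which outputs the finished shape.
ops = {
    "split": lambda w: ((("split", w / 2), ("split", w / 2)), ()) if w > 1
             else ((("emit", w),), ()),
    "emit":  lambda w: ((), (w,)),
}
```

Because the set of operators is known ahead of time from the operator graph, such batching decisions, along with which intermediate objects to keep in fast memory, can be made automatically at compile time rather than at run time.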