
The focus is on both high performance and high productivity. We enhance the FEM solver package FEAST with
GPU functionality in a minimally invasive fashion, with less than 1% of the code base being affected [1].
Moreover, applications based on FEAST can benefit from the GPU acceleration without any code changes. We
explore the large-scale scalability [2] and the practical benefits and limits [3] of this approach in detail.

1. Bandwidth in a typical GPU node. Algorithms executing on the GPU cluster must tolerate the enormous discrepancy between the bandwidth on the co-processor board and the bandwidth from board to board, because board-to-board traffic has to pass through the main memory of the hosts.
2. Displacements and von Mises stress of an object under load, computed with FeastSolid on a heterogeneous 16-node cluster using GPUs as scientific co-processors; no code changes, equal accuracy, 2.6x speedup.


