Bulletin of the American Physical Society, Indiana, United States of America, 18-22 November 2022, pp. 1-2
Conference Paper / Summary Text
Middle East Technical University Affiliated:
Heterogeneous computing architectures have become an integral feature of modern supercomputers. In this work, we present performance benchmarking results for a discontinuous Galerkin spectral-element solver for the compressible Navier-Stokes equations on different GPU-based computing platforms. The solver uses OCCA, an open-source library that provides a portability layer for offloading targeted kernels to different architectures and vendor platforms, making the application itself portable. We profile the solver and identify the most computationally expensive kernels. A mini-app called cnsBench, derived from the full solver, is developed to benchmark the core kernels and investigate their performance characteristics. Kernel performance metrics will be presented and compared across GPUs from different vendors (NVIDIA, Intel, and AMD) and across programming models such as CUDA, OpenCL, and SYCL. The kernel algorithms and memory access patterns are analyzed to provide insight into computational bottlenecks and into approaches for further optimizing these kernels. These efforts will guide future development of compressible flow applications that can leverage the full potential of next-generation exascale supercomputers and beyond.
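As an illustration of the kind of kernel performance metrics mentioned above, the following minimal sketch derives achieved bandwidth, achieved throughput, and arithmetic intensity from a timed kernel run, and compares them against a simple roofline bound. It is not taken from cnsBench: the function names, the `KernelMetrics` type, and all input numbers are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class KernelMetrics:
    """Derived metrics for one timed kernel launch (hypothetical helper type)."""
    bandwidth_gbs: float  # achieved memory bandwidth, GB/s
    gflops: float         # achieved throughput, GFLOP/s
    intensity: float      # arithmetic intensity, FLOP per byte


def kernel_metrics(bytes_moved: float, flops: float, elapsed_s: float) -> KernelMetrics:
    """Convert byte count, FLOP count, and wall time into rate metrics."""
    if elapsed_s <= 0 or bytes_moved <= 0:
        raise ValueError("elapsed_s and bytes_moved must be positive")
    return KernelMetrics(
        bandwidth_gbs=bytes_moved / elapsed_s / 1e9,
        gflops=flops / elapsed_s / 1e9,
        intensity=flops / bytes_moved,
    )


def roofline_bound_gflops(intensity: float, peak_gflops: float,
                          peak_bandwidth_gbs: float) -> float:
    """Simple roofline model: throughput is capped either by peak compute
    or by memory bandwidth times arithmetic intensity, whichever is lower."""
    return min(peak_gflops, intensity * peak_bandwidth_gbs)


# Made-up numbers for illustration only, not measurements from any GPU.
m = kernel_metrics(bytes_moved=8e9, flops=2e9, elapsed_s=0.5)
bound = roofline_bound_gflops(m.intensity, peak_gflops=100.0, peak_bandwidth_gbs=40.0)
print(m, bound)
```

A kernel whose achieved throughput sits close to the bandwidth-limited part of the bound is memory bound, which is one reason memory access patterns are analyzed alongside the kernel algorithms.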