Parallel convolution gridding for Radio Astronomy applications running on KNL and GPU
7 Keble Road, Oxford, OX1 3QG
We present results from recent joint work with the radio astronomy department at Oxford University. The work is relevant to the Square Kilometre Array (SKA). Convolution gridding is one of the first stages in processing raw radio astronomy observations (visibilities) into usable images. The algorithm is entirely memory-bandwidth bound and, when parallelised, full of stochastic race conditions. We present optimised implementations of the algorithm for multicore x86, KNL and P100 GPUs. By tiling the grid, bucket sorting the data and keeping local data in cache or registers, we obtain satisfactory speedups over the original serial code on all platforms. The KNL is consistently the slowest of the three platforms, which we attribute to its small shared caches, while the P100 is consistently the fastest.
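The combination of tiling, bucket sorting and local accumulation can be illustrated with a minimal serial sketch in Python. It is not the implementation discussed in this work. The function names, the `(u, v, value)` visibility layout, the integer grid coordinates and the small fixed kernel are all assumptions made for illustration; production gridders use fractional coordinates and oversampled kernels.

```python
from collections import defaultdict


def grid_naive(vis, kernel, n):
    """Reference gridder: scatter each visibility's footprint straight into the grid."""
    w = len(kernel) // 2  # kernel half-width (kernel assumed square, odd-sized)
    grid = [[0j] * n for _ in range(n)]
    for u, v, val in vis:
        for y in range(max(v - w, 0), min(v + w + 1, n)):
            for x in range(max(u - w, 0), min(u + w + 1, n)):
                grid[y][x] += val * kernel[y - v + w][x - u + w]
    return grid


def grid_tiled(vis, kernel, n, tile):
    """Tiled gridder: bucket visibilities by tile, accumulate locally, write once."""
    w = len(kernel) // 2

    # Bucket sort: a visibility goes into every tile its footprint overlaps.
    buckets = defaultdict(list)
    for u, v, val in vis:
        ty0, ty1 = max(v - w, 0) // tile, min(v + w, n - 1) // tile
        tx0, tx1 = max(u - w, 0) // tile, min(u + w, n - 1) // tile
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                buckets[(ty, tx)].append((u, v, val))

    grid = [[0j] * n for _ in range(n)]
    # Each tile is independent of the others, so in a parallel version one
    # worker could own one tile and no two workers would write the same cell.
    for (ty, tx), items in buckets.items():
        y0, x0 = ty * tile, tx * tile
        y1, x1 = min(y0 + tile, n), min(x0 + tile, n)
        # Small local buffer, standing in for cache- or register-resident data.
        local = [[0j] * (x1 - x0) for _ in range(y1 - y0)]
        for u, v, val in items:
            # Clip the kernel footprint to this tile.
            for y in range(max(v - w, y0), min(v + w + 1, y1)):
                for x in range(max(u - w, x0), min(u + w + 1, x1)):
                    local[y - y0][x - x0] += val * kernel[y - v + w][x - u + w]
        # Single write-back of the finished tile to the global grid.
        for y in range(y0, y1):
            grid[y][x0:x1] = local[y - y0]
    return grid
```

In this sketch a visibility near a tile boundary is duplicated into each tile it touches, which trades some extra reads for the absence of write conflicts. Whether that trade-off matches the choices made on each of the three platforms is not stated here.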