2021 journal article

PBBFMM3D: a parallel black-box algorithm for kernel matrix-vector multiplication

Journal of Parallel and Distributed Computing, 154, 64–73.

TL;DR: A parallel black-box method for computing kernel matrix-vector multiplication, where the underlying kernel is a non-oscillatory function in three dimensions, which reduces the cost to $O(N)$ work. (via Semantic Scholar)
Source: ORCID
Added: November 1, 2023

Kernel matrix-vector product is ubiquitous in many science and engineering applications. However, a naive method requires $O(N^2)$ operations, which becomes prohibitive for large-scale problems. We introduce a parallel method that provably requires $O(N)$ operations to reduce the computation cost. The distinct feature of our method is that it requires only the ability to evaluate the kernel function, offering a black-box interface to users. Our parallel approach targets multi-core shared-memory machines and is implemented using OpenMP. Numerical results demonstrate up to $19\times$ speedup on 32 cores. We also present a real-world application in geostatistics, where our parallel method was used to deliver fast principle component analysis of covariance matrices.