A simple optimization strategy for computing 3D finite-differencing kernels on many-core architectures is proposed. The 3D finite-differencing computation is split direction by direction and exploits two levels of parallelism: in-core vectorization and multi-threaded shared-memory parallelization. The main application of this method is to accelerate the high-order stencil computations in numerical relativity codes.
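As an illustration only, the direction-by-direction splitting with two levels of parallelism might be sketched as follows. This is a minimal sketch, not the authors' implementation. It assumes a fourth-order centered first derivative, a grid stored with x as the fastest index, and OpenMP pragmas for both threading and vectorization. The function name `fd_deriv_dir` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): 4th-order centered first
 * derivative of u along one direction, dir = 0, 1, 2 for x, y, z.
 * Assumed layout: index = (k*ny + j)*nx + i, so x is unit stride.
 * Points within two cells of the boundary in the chosen direction
 * are left untouched in du. */
void fd_deriv_dir(const double *u, double *du,
                  int nx, int ny, int nz, int dir, double h)
{
    assert(dir >= 0 && dir <= 2);
    const int n[3] = { nx, ny, nz };
    assert(n[dir] >= 5);            /* the stencil needs 5 points */

    /* Memory stride between neighbours in each direction. */
    const ptrdiff_t stride[3] = { 1, nx, (ptrdiff_t)nx * ny };
    const ptrdiff_t s = stride[dir];
    const double c = 1.0 / (12.0 * h);

    /* Loop bounds: shrink by the stencil half-width only along dir. */
    const int ilo = (dir == 0) ? 2 : 0, ihi = (dir == 0) ? nx - 2 : nx;
    const int jlo = (dir == 1) ? 2 : 0, jhi = (dir == 1) ? ny - 2 : ny;
    const int klo = (dir == 2) ? 2 : 0, khi = (dir == 2) ? nz - 2 : nz;

    /* Level 1: shared-memory threads over the outermost index. */
    #pragma omp parallel for
    for (int k = klo; k < khi; ++k) {
        for (int j = jlo; j < jhi; ++j) {
            const ptrdiff_t row = ((ptrdiff_t)k * ny + j) * nx;
            /* Level 2: in-core vectorization over the unit-stride
             * index, whatever the differencing direction is. */
            #pragma omp simd
            for (int i = ilo; i < ihi; ++i) {
                const ptrdiff_t p = row + i;
                du[p] = c * (u[p - 2 * s] - 8.0 * u[p - s]
                             + 8.0 * u[p + s] - u[p + 2 * s]);
            }
        }
    }
}
```

Calling the routine once per direction yields the three partial derivatives. Keeping the innermost loop on the unit-stride index in all three cases is one way to let the compiler vectorize each pass; compiled without OpenMP support, the pragmas are ignored and the code runs serially.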