Kokkos provides an abstract interface for portable, performant shared-memory programming.
It is a C++ library that offers parallel_for, parallel_reduce and similar functions
for describing the pattern of the parallel tasks. The execution policy determines how the
threads are executed. For example, it influences the sizes of thread blocks and whether
static or dynamic scheduling is used. The library abstracts the kernel as a function
object whose operator() cannot take any user-defined parameters. Arguments have
to be stored in members of the function object, which couples algorithm and data. Kokkos
provides abstractions for both the parallel execution of code and data management.
Multidimensional arrays with neutral indexing and an architecture-dependent layout are
available. They can be used, for example, to abstract the underlying hardware's preferred
memory access scheme, which could be row-major, column-major or even blocked.
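The two ideas described for Kokkos, a kernel written as a function object and an array whose memory layout is hidden behind neutral indexing, can be illustrated with a small standard-C++ sketch. This is not the Kokkos API: View2D, LayoutRight, LayoutLeft, RowSumFunctor, serial_reduce and sum_with_layout are hypothetical names chosen for illustration, and the reduce pattern is executed serially as a stand-in for a real parallel back-end.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout policies: map a logical (i, j) index to a linear offset.
struct LayoutRight {  // row-major
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t /*rows*/, std::size_t cols) {
        return i * cols + j;
    }
};
struct LayoutLeft {  // column-major
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t rows, std::size_t /*cols*/) {
        return j * rows + i;
    }
};

// Hypothetical 2D array: callers always write v(i, j); only the Layout
// template parameter decides where the element lives in memory.
template <typename T, typename Layout>
class View2D {
public:
    View2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return data_[Layout::offset(i, j, rows_, cols_)];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return data_[Layout::offset(i, j, rows_, cols_)];
    }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
private:
    std::size_t rows_, cols_;
    std::vector<T> data_;
};

// Kernel as a function object: operator() receives only the work index and
// the reduction accumulator. The array it reads is stored as a member.
template <typename ViewType>
struct RowSumFunctor {
    const ViewType* view;  // kernel argument kept as a member
    void operator()(std::size_t i, double& update) const {
        for (std::size_t j = 0; j < view->cols(); ++j) update += (*view)(i, j);
    }
};

// Serial stand-in for a reduce pattern. A real back-end would distribute the
// index range over threads and combine per-thread partial results.
template <typename Functor>
double serial_reduce(std::size_t n, const Functor& f) {
    double result = 0.0;
    for (std::size_t i = 0; i < n; ++i) f(i, result);
    return result;
}

// Fills a rows x cols array with 0, 1, 2, ... and sums it via the functor.
template <typename Layout>
double sum_with_layout(std::size_t rows, std::size_t cols) {
    View2D<double, Layout> v(rows, cols);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            v(i, j) = static_cast<double>(i * cols + j);
    return serial_reduce(rows, RowSumFunctor<View2D<double, Layout>>{&v});
}
```

Calling sum_with_layout<LayoutRight>(2, 3) and sum_with_layout<LayoutLeft>(2, 3) runs the same functor over differently laid-out storage. The kernel code does not change, which is the point of the neutral indexing.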
Thrust is a parallel algorithms library resembling the C++ Standard Template Library (STL).
It allows selecting either the CUDA, TBB or OpenMP back-end at compile time. Because it is
based on generic host_vector and device_vector container objects, it tightly couples
the data structure and the parallelization strategy. There exist many similar libraries such
as ArrayFire (CUDA, OpenCL, native C++),
VexCL (OpenCL, CUDA),
ViennaCL (OpenCL, CUDA, OpenMP) and
hemi (CUDA, native C++).
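The coupling between data structure and parallelization strategy noted above for Thrust can be sketched in plain C++ with tag dispatch. This is an illustration under stated assumptions, not Thrust's implementation: tagged_vector, host_vec, device_vec, Backend, transform_impl and sketch_transform are hypothetical names, and the "device" path is a placeholder that still runs on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical back-end tags and an enum used to report which path ran.
struct host_tag {};
struct device_tag {};
enum class Backend { Host, Device };

// Hypothetical container: the back-end is part of the container's type.
template <typename T, typename Tag>
struct tagged_vector {
    using backend = Tag;
    std::vector<T> storage;
};
template <typename T> using host_vec = tagged_vector<T, host_tag>;
template <typename T> using device_vec = tagged_vector<T, device_tag>;

// Host path: a plain sequential loop.
template <typename T, typename Op>
Backend transform_impl(host_tag, const std::vector<T>& in,
                       std::vector<T>& out, Op op) {
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = op(in[i]);
    return Backend::Host;
}

// "Device" path: placeholder only. A real library would launch an
// accelerator kernel here; this sketch runs the same loop on the host.
template <typename T, typename Op>
Backend transform_impl(device_tag, const std::vector<T>& in,
                       std::vector<T>& out, Op op) {
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = op(in[i]);
    return Backend::Device;
}

// The algorithm never takes an explicit execution parameter: the container
// type alone selects the implementation at compile time.
template <typename Vec, typename Op>
Backend sketch_transform(const Vec& in, Vec& out, Op op) {
    assert(in.storage.size() == out.storage.size());
    return transform_impl(typename Vec::backend{}, in.storage, out.storage, op);
}
```

Because the back-end is derived from the container type, switching the execution strategy in this sketch means switching the data structure, which is the coupling the text describes.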
Phalanx is very similar to alpaka in the way it abstracts the accelerators. Its C++ interface provides CUDA, OpenMP, and GASNet back-ends.