Sparse matrix-matrix multiplication (SpMM) is a crucial kernel in various applications, including sparse deep neural networks [1]–[6], graph analytics [7], triangle counting [8], and linear algebra ...
Abstract: We consider the distributed memory parallel multiplication of a sparse matrix by a dense matrix (SpMM). The dense matrix is often a collection of dense vectors. Standard implementations will ...
Using a grid, the system designs a set of rectangular silicon structures filled with tiny pores. The system continually adjusts each pixel in the grid until it arrives at the desired mathematical ...
We took this version of HeCBench and are modifying it to build the CUDA and OMP codes to gather their roofline performance data. So far we have a large portion of the CUDA and OMP codes building ...
Apache 2.0. See LICENSE. This project uses third-party packages (e.g. PyTorch, NVIDIA cuDSS, Cython) and optional libraries (e.g. nvmath). See NOTICE for attributions and a pointer to their licenses.