Pacific Northwest National Laboratory (PNNL) Richland, United States of America
Tensor algebra is an important computational kernel in many applications across different areas. Tensors representing real-world data are usually large and sparse, and sparse tensors are usually stored in a compressed form. Many efforts focus on improving the performance of sparse tensor computation, which poses several challenges. First, there are many storage formats for sparse tensors and no single format performs best in all cases; users must choose a format suited to the characteristics of their tensors. Second, optimizing sparse computation is difficult: it involves many indirect memory accesses and write dependencies, and the computation kernels vary with the tensor expression and the storage format, so different kernels require different optimizations. Third, many hardware platforms are available as back ends today, and different platforms require different code optimizations to achieve high performance.
To address these challenges, we propose a compiler-based approach: we build a sparse tensor compiler on the Multi-Level Intermediate Representation (MLIR) framework. Building on the MLIR infrastructure allows our compiler to target multiple hardware platforms.
Our sparse tensor compiler supports several storage formats, including COO and CSF. Based on the proposed internal tensor storage, we present an automatic code generation algorithm that produces the computation kernel code, and we apply various code optimizations to achieve high performance. We believe this approach is a promising way to deliver high performance in a more general and flexible manner.
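To make the two named formats concrete, the sketch below builds both a COO representation (a flat list of coordinates and values) and CSF-style per-level `fids`/`fptr` arrays (a compressed fiber tree) for a small 3-D tensor. This is an illustrative stand-alone helper, not the compiler's internal storage API; it assumes the nonzero coordinates are unique.

```python
def coo_to_csf(coords, vals):
    """Compress COO coordinates into CSF-style level arrays.

    Returns (fids, fptr, vals): fids[l] lists the index values of the
    nodes at tree level l (one node per distinct coordinate prefix of
    length l+1, in lexicographic order); fptr[l] gives each level-l
    node's child range in fids[l+1]. Assumes unique coordinates.
    """
    order = sorted(range(len(vals)), key=lambda n: coords[n])
    coords = [coords[n] for n in order]
    vals = [vals[n] for n in order]
    nmodes = len(coords[0])
    # One tree node per distinct prefix, kept in lexicographic order.
    prefixes = [sorted({c[:l + 1] for c in coords}) for l in range(nmodes)]
    fids = [[p[-1] for p in level] for level in prefixes]
    fptr = []
    for l in range(nmodes - 1):
        ptr, child = [0], 0
        for parent in prefixes[l]:
            # Children of `parent` are contiguous because both levels
            # are sorted; advance past them to find the range end.
            while (child < len(prefixes[l + 1])
                   and prefixes[l + 1][child][:l + 1] == parent):
                child += 1
            ptr.append(child)
        fptr.append(ptr)
    return fids, fptr, vals


# COO form of a sparse 2x2x2 tensor: coordinates plus values.
coords = [(0, 0, 0), (0, 1, 1), (1, 0, 1)]
vals = [1.0, 2.0, 3.0]
fids, fptr, csf_vals = coo_to_csf(coords, vals)
```

For this tensor the CSF tree has two root fibers (`fids[0] == [0, 1]`), and `fptr[0] == [0, 2, 3]` records that mode-0 index 0 owns the first two mode-1 entries while index 1 owns the third, so shared prefixes are stored once instead of repeated per nonzero as in COO.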