Graph pattern matching, which aims to discover structural patterns in graphs, is considered one of the most fundamental graph mining problems in real applications. Despite previous efforts, existing systems face two main challenges. First, the inherent symmetry existing in patterns can introduce a large amount of redundant computation. Second, different matching orders for a pattern have significant performance differences.
To address these challenges, we propose GraphPi, a high performance distributed pattern matching system. GraphPi utilizes a new algorithm based on 2-cycles that outputs multiple sets of asymmetric restrictions; each set eliminates redundancy completely. We also design an accurate performance model to select the optimal matching order and asymmetric restriction set for efficient pattern matching. We evaluate GraphPi on Tianhe-2A supercomputer. Results show that GraphPi outperforms the state-of-the-art system by up to 105x with 6 real-world graph datasets on a single node. We also scale GraphPi to 1024 computing nodes.