Oak Ridge National Laboratory (ORNL), United States of America
We are motivated by novel methods for mining large-scale corpora of scholarly publications, such as PubMed, that contain tens of millions of papers. In this setting, analysts seek to discover how concepts relate to one another. They construct graph representations from annotated text, then formulate the relationship-mining problem as computing all-pairs shortest paths (APSP), which becomes a significant bottleneck. We present DSNAPSHOT (Distributed Accelerated Semiring APSP), a new high-performance Floyd-Warshall algorithm for GPU-accelerated distributed-memory parallel computers. In our largest experiments, we ran DSNAPSHOT on a connected input graph with millions of vertices using 4,096 nodes (24,576 GPUs) of Oak Ridge National Laboratory’s Summit supercomputer. DSNAPSHOT achieves a sustained performance of 136×10^15 floating-point operations per second (136 petaflops) at a parallel efficiency of 90% under weak scaling. In absolute terms, this is 70% of the best possible performance for our computation, which is carried out in the single-precision tropical semiring (the “min-plus” algebra).
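For readers unfamiliar with the min-plus formulation, the following is a minimal single-process sketch of the classic Floyd-Warshall recurrence, in which semiring "addition" is min and semiring "multiplication" is ordinary +. It is not the distributed, GPU-accelerated DSNAPSHOT implementation. The function name, the example graph, and the assumption of nonnegative edge weights are illustrative choices that do not come from the paper.

```python
import math

INF = math.inf  # "no edge": the additive identity of the min-plus semiring


def floyd_warshall_min_plus(dist):
    """Return the all-pairs shortest-path matrix for a dense distance matrix.

    dist[i][j] is the weight of edge i -> j, INF if the edge is absent,
    and 0 on the diagonal. Edge weights are assumed nonnegative (assumption).
    The input is left unmodified.
    """
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):            # allow vertex k as an intermediate hop
        row_k = d[k]
        for i in range(n):
            d_ik = d[i][k]
            if d_ik == INF:       # no path i -> k, so nothing can improve
                continue
            row_i = d[i]
            for j in range(n):
                # min-plus update: d[i][j] = min(d[i][j], d[i][k] + d[k][j])
                cand = d_ik + row_k[j]
                if cand < row_i[j]:
                    row_i[j] = cand
    return d


# Hypothetical 4-vertex directed graph: 0->1 (3), 0->2 (7), 1->2 (1), 2->3 (2)
example = [
    [0,   3,   7,   INF],
    [INF, 0,   1,   INF],
    [INF, INF, 0,   2],
    [INF, INF, INF, 0],
]
result = floyd_warshall_min_plus(example)
```

In this example the path 0 -> 1 -> 2 (cost 4) replaces the direct edge 0 -> 2 (cost 7). The triple loop performs on the order of n^3 min-plus updates, which is why APSP becomes a bottleneck for graphs with millions of vertices.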