Lawrence Livermore National Laboratory, United States of America
Current and emerging scientific workflows at the Lawrence Livermore National Laboratory (LLNL) require the integration of cloud technologies with traditional HPC to make discoveries. In this talk, we present prominent workflow examples, trends in these converged workflows, and the gaps they face at one of the world's largest computing centers. Based on application examples, we will describe successful workflow patterns that make use of loose convergence between HPC clusters and on-premises container orchestration clusters. While the converged approach is making significant strides, we still find critical gaps, such as a lack of integration with resource and job management software, that keep it from realizing its full potential. We will discuss how LLNL is co-designing its critical software infrastructure with workflow teams, the computing facility, and industry partners. Finally, we will highlight some of the key techniques we use to address outstanding challenges in resource expression and scheduling in a converged environment.