Approximately 30% of human cancers are caused by interactions of a mutated RAS protein with the downstream RAF near cell membranes. The DoE Cancer Pilot 2 campaign conducted numerous molecular dynamics (MD) simulations modeling RAS proteins in contact with the lipid bilayer of a cell membrane. In this work, we present a new deep learning (DL)-based technique to explore temporal correlations between the protein and the 14 lipid species. Our early experiments have already led to several key findings.
We investigate the application of transformer models, such as BERT, in which a deep neural network is trained to predict past and future changes in RAS conformation given a sequence of simulation frames as input. Given an ordered sequence of frames of the lipid concentrations around the RAS protein, the models identify periods where the RAS conformation is stable or likely to change state with 60-70% accuracy. Models trained on sequences of frames of the distance between the RAS protein and the cell membrane achieve 85-90% accuracy. We show that model accuracy improves with increasing sequence length and that the models utilize time-based information in the input sequences.
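As a rough illustration of the input described above, the following sketch slices a trajectory into ordered, fixed-length sequences of frames and labels each one as stable or changing. It is not the pipeline used in this work: the function name `make_windows`, the parameters `seq_len` and `stride`, and the labeling rule (a window counts as "changing" if the conformational state differs between any two consecutive frames in it) are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    states: Sequence[int],
    seq_len: int,
    stride: int = 1,
) -> List[Tuple[List[List[float]], int]]:
    """Slice a trajectory into ordered windows of frames.

    features: one feature vector per frame (e.g. lipid concentrations).
    states:   one discrete conformational state id per frame.
    Returns (window, label) pairs; label is 1 if the state changes
    inside the window, else 0. This labeling rule is an assumption.
    """
    if len(features) != len(states):
        raise ValueError("features and states need one entry per frame")
    if seq_len < 2:
        raise ValueError("seq_len must be at least 2")
    if stride < 1:
        raise ValueError("stride must be at least 1")

    windows = []
    for start in range(0, len(features) - seq_len + 1, stride):
        end = start + seq_len
        window_states = states[start:end]
        # A change is any pair of consecutive frames with different states.
        changed = any(a != b for a, b in zip(window_states, window_states[1:]))
        window = [list(frame) for frame in features[start:end]]
        windows.append((window, int(changed)))
    return windows


# Toy trajectory: 6 frames, 2 hypothetical lipid features per frame.
toy_features = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7],
                [0.6, 0.4], [0.7, 0.3], [0.8, 0.2]]
toy_states = [0, 0, 0, 1, 1, 1]
toy_windows = make_windows(toy_features, toy_states, seq_len=3)
toy_labels = [label for _, label in toy_windows]
```

Keeping the frames in order inside each window is what allows a sequence model to exploit time-based information; varying `seq_len` is one simple way to probe how accuracy depends on sequence length.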
Our initial experiments indicate that temporal correlations can be extracted from MD simulations using appropriate DL models. Our work paves the way for a new form of analysis of MD trajectories that focuses on predicting events of interest. Many open questions remain, foremost among them how to distill the correlations to understand the order of events and assess causation.