In this workshop, we will demonstrate the use of Narratives and Apps in the Department of Energy Systems Biology Knowledgebase (KBase) to create reproducible, flexible workflows for the analysis of next-generation sequencing (NGS) data in both model plants and crops. We will present a case study using public data from the Sequence Read Archive (SRA) and also lead a hands-on session in which users can create their own Narratives, import data, run Apps, and analyze the results.
Case Study: Plant metabolism is reconfigured in response to invasion by pathogens. When mediated by transcriptional changes, this metabolic reconfiguration can be detected by comparing genome-wide mRNA sequence (mRNA-seq) data between infected and control plants. By integrating transcript abundances with a model of primary metabolism, we can then identify biochemical reactions where transcription is altered upon infection in both model and crop species. This approach can be extended to study the responses of plant metabolism to a variety of environmental conditions.
KBase: The Department of Energy Systems Biology Knowledgebase (KBase) is a knowledge creation and discovery environment designed for both biologists and bioinformaticians. KBase integrates a large variety of public data and analysis tools into Narratives, an easy-to-use graphical user interface that leverages DOE computational infrastructure to perform sophisticated systems biology analyses.
Narratives and Apps: Narratives are dynamic, interactive documents that promote collaboration and reproducibility of the scientific results generated by Apps. Apps are the analysis tools available in KBase, and they interoperate seamlessly within a Narrative to enable a range of scientific workflows. A finished Narrative is persistent and represents a complete record of the methods for a computational experiment. Public Narratives serve as resources for the user community by capturing valuable datasets, associated computational analyses, and scientific context describing the rationale behind a scientific study. The Narrative structure simplifies the re-purposing, re-application, and extension of scientific techniques, thereby supporting reproducible and transparent research in KBase.
Presentation: The KBase tools featured in our case study are broadly applicable for profiling plant gene expression using NGS data. We will demonstrate the following workflow: In a KBase Narrative, we import public mRNA-seq data from susceptible Arabidopsis thaliana and Solanum lycopersicum (tomato) samples treated with either the bacterial pathogen Pseudomonas syringae DC3000 or sterile buffer as a control. Using KBase Apps, we process the Illumina reads to obtain consistent length, depth, and quality across datasets. We align the reads to reference genomes and assemble transcripts using a customized combination of KBase Apps running published, open-source software. We then generate tables with normalized, average transcript abundances for infection and control conditions.
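To make the last step of this workflow concrete, the following is a minimal Python sketch of how normalized abundances from replicate samples might be averaged per condition. It is an illustration only, not the KBase implementation: the function name, the gene and sample identifiers, and the abundance values are all hypothetical, and the input is assumed to be already normalized.

```python
from statistics import mean


def average_by_condition(abundances, sample_conditions):
    """Average normalized abundances over the replicates of each condition.

    abundances: {gene_id: {sample_id: normalized_value}}
    sample_conditions: {sample_id: condition_label}
    Returns {gene_id: {condition_label: mean_value}}.
    """
    table = {}
    for gene, per_sample in abundances.items():
        grouped = {}
        for sample, value in per_sample.items():
            # Collect replicate values under their condition label.
            grouped.setdefault(sample_conditions[sample], []).append(value)
        table[gene] = {cond: mean(vals) for cond, vals in grouped.items()}
    return table


# Hypothetical sample sheet and values, for illustration only.
sample_conditions = {
    "ctrl_rep1": "control",
    "ctrl_rep2": "control",
    "inf_rep1": "infected",
    "inf_rep2": "infected",
}
abundances = {
    "gene_a": {"ctrl_rep1": 10.0, "ctrl_rep2": 14.0, "inf_rep1": 30.0, "inf_rep2": 34.0},
    "gene_b": {"ctrl_rep1": 5.0, "ctrl_rep2": 7.0, "inf_rep1": 6.0, "inf_rep2": 6.0},
}

averaged = average_by_condition(abundances, sample_conditions)
for gene, means in averaged.items():
    print(gene, means)
```

The resulting table has one row per gene and one column per condition, which is the shape needed for comparing infection with control.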
Our case study also features user-friendly tools for building metabolic models from plant genomes in KBase. Transcript abundances are integrated with genome-scale networks of primary metabolism to find the reactions with the largest expression changes between conditions.
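The idea of ranking reactions by expression change can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the method used by the KBase Apps: the gene-to-reaction mapping, the identifiers, and the values are hypothetical; a reaction's expression is assumed to be the plain mean of its associated genes; and a pseudocount of 1 is an arbitrary choice to avoid division by zero.

```python
import math


def reaction_log2_changes(mean_abundance, reaction_genes, pseudocount=1.0):
    """Compute a log2 fold change (infected vs. control) for each reaction.

    mean_abundance: {gene_id: {"control": value, "infected": value}}
    reaction_genes: {reaction_id: [gene_id, ...]}
    Reactions with no measured genes are skipped.
    """
    changes = {}
    for reaction, genes in reaction_genes.items():
        present = [g for g in genes if g in mean_abundance]
        if not present:
            continue
        # Assumption: reaction expression is the mean over its genes.
        control = sum(mean_abundance[g]["control"] for g in present) / len(present)
        infected = sum(mean_abundance[g]["infected"] for g in present) / len(present)
        changes[reaction] = math.log2((infected + pseudocount) / (control + pseudocount))
    return changes


def rank_reactions(changes):
    """Sort reactions by absolute log2 fold change, largest first."""
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical averaged abundances and gene-to-reaction mapping.
mean_abundance = {
    "gene_a": {"control": 3.0, "infected": 15.0},
    "gene_b": {"control": 7.0, "infected": 7.0},
    "gene_c": {"control": 15.0, "infected": 3.0},
}
reaction_genes = {
    "rxn1": ["gene_a"],
    "rxn2": ["gene_b"],
    "rxn3": ["gene_c"],
    "rxn4": ["gene_not_measured"],
}

changes = reaction_log2_changes(mean_abundance, reaction_genes)
ranked = rank_reactions(changes)
for reaction, log2_change in ranked:
    print(reaction, round(log2_change, 2))
```

Ranking by absolute value keeps both up-regulated and down-regulated reactions near the top of the list; a real analysis would also need to account for replicate variance and for the logic of gene-reaction associations.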
Hands-on session: In the second part of the workshop, users will receive guidance on importing their own NGS data into a KBase Narrative and selecting from among the Apps described above to analyze gene expression or construct metabolic models.