Preprocessing and analysis files related to the Strongyloides RNAseq Browser

HallemLab/Bryant-DeMarco-Hallem-2021

Strongyloides RNA-seq Browser Preprocessing and Analysis

Preprocessing and analysis related to the Strongyloides RNA-seq Browser, a web-based Shiny App for browsing and on-demand analysis of Strongyloides spp. RNA-seq datasets.

Table of Contents

  1. General Information
  2. App Access
  3. Sources
  4. License
  5. Authors
  6. Update Notes

General Information

This repository contains non-responsive code for the preprocessing and analysis of Strongyloides spp. RNA-seq datasets. The preprocessed data are used as inputs for the Strongyloides RNA-seq Browser, a Shiny Web App for on-demand browsing and analysis of published bulk Strongyloides RNA-seq data. The repository also contains the analysis code used to generate the results discussed in Bryant, DeMarco, and Hallem (2021). A preprint version of the manuscript is available on bioRxiv. The final version of the manuscript is published in G3.

The sections below describe the contents of the primary subfolders within this repository.

Preprocessing

This folder contains one RMarkdown file per species. Each file contains code for the alignment of raw reads, data filtering and normalization, voom variance stabilization of count data, and collection of gene annotation information. The RMarkdown files generate outputs that serve as essential inputs to the Strongyloides RNA-seq Browser App. They are also knitted into PDF files; see those PDF files for plots illustrating the effect of filtering and normalization on the raw data inputs. See the readme file located in the subfolder for additional details. This folder also contains data files used in offline data preprocessing, including study design files and raw transcripts per million datasets for each species, as well as an Ensembl Compara database of parasite gene sets.
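As a rough illustration of the kind of low-expression filtering mentioned above, the following sketch converts raw counts to counts per million (CPM) and keeps only genes that exceed a CPM threshold in a minimum number of samples. It is written in plain Python rather than the R used in this repository. The function names, the toy data, the threshold of 1 CPM, and the minimum sample count are all assumptions made for the example and are not taken from the preprocessing files.

```python
def counts_per_million(counts):
    """Convert a {gene: [count per sample]} table to counts per million."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts in each sample (column sums).
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * row[j] / library_sizes[j] for j in range(n_samples)]
        for gene, row in counts.items()
    }


def filter_low_expression(counts, min_cpm=1.0, min_samples=2):
    """Keep genes with CPM above min_cpm in at least min_samples samples.

    Both thresholds are illustrative assumptions, not values from this repository.
    """
    cpm = counts_per_million(counts)
    return {
        gene: counts[gene]
        for gene, values in cpm.items()
        if sum(v > min_cpm for v in values) >= min_samples
    }


# Hypothetical toy data: three genes measured in three samples.
toy_counts = {
    "geneA": [500_000, 400_000, 600_000],
    "geneB": [0, 2, 0],
    "geneC": [500_000, 599_998, 400_000],
}

kept = filter_low_expression(toy_counts)
print(sorted(kept))
```

Filtering on CPM rather than raw counts keeps the threshold comparable across samples with different sequencing depths.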

Analysis

This folder contains one RMarkdown file per species, in which we present example analyses. These analyses include hierarchical clustering and principal component analyses of species samples, limma-voom differential expression (example results echoing analyses completed by the Shiny App), benchmarking of Browser datasets and differential expression analyses against previously published datasets, and an example gene set enrichment analysis. The RMarkdown files are knitted into HTML files, and both PCA and benchmarking plots are saved in a Plot subfolder in the Outputs folder. See the readme file located in the subfolder for additional details.

Supplemental Materials

This folder contains supplemental figures and files from the manuscript of Bryant, DeMarco, and Hallem (2021). See subfolder README for details.

Time_course_DGE

This folder contains an RMarkdown file, with its cache, that tests for differential expression in longitudinal data using the ImpulseDE2 package. Unlike differential expression algorithms such as limma, which treat time points independently, ImpulseDE2 seeks to identify genes that display specific trajectories of gene expression over time. The impulse model is designed to capture four expression trajectories: monotonic decrease, monotonic increase, transient decrease (valley), and transient increase (peak). The analysis in this folder tests an implementation of ImpulseDE2 on a subset of the S. stercoralis dataset: life stages FLF, PF, iL3, and iL3a.
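To make the four trajectories concrete, the sketch below evaluates the general impulse function (a product of two sigmoids connecting an initial level h0, an intermediate level h1, and a final level h2) for four hypothetical parameter sets. It is plain Python and does not call the ImpulseDE2 package or reproduce its fitting procedure. The function names, parameter values, and the simple shape classifier are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def impulse(t, h0, h1, h2, t1, t2, beta):
    """Impulse function: h0 before t1, h1 between t1 and t2, h2 after t2.

    beta controls how steep the two transitions are.
    """
    first = h0 + (h1 - h0) * sigmoid(beta * (t - t1))
    second = h2 + (h1 - h2) * sigmoid(-beta * (t - t2))
    return first * second / h1


def describe(values):
    """Label a sampled trajectory with one of the four shapes (illustrative helper)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d >= 0 for d in diffs):
        return "monotonic increase"
    if all(d <= 0 for d in diffs):
        return "monotonic decrease"
    # A non-monotonic curve whose maximum is in the interior is a peak.
    interior_max = values.index(max(values)) not in (0, len(values) - 1)
    return "transient increase (peak)" if interior_max else "transient decrease (valley)"


# Hypothetical (h0, h1, h2) levels; transition times and slope are fixed.
examples = {
    "increase": (1.0, 5.0, 10.0),
    "decrease": (10.0, 5.0, 1.0),
    "peak": (1.0, 10.0, 1.0),
    "valley": (10.0, 1.0, 10.0),
}
times = list(range(11))
labels = {}
for name, (h0, h1, h2) in examples.items():
    values = [impulse(t, h0, h1, h2, t1=3.0, t2=7.0, beta=2.0) for t in times]
    labels[name] = describe(values)
    print(name, "->", labels[name])
```

Because a single function family covers all four shapes, one fit per gene can be compared against a constant model, rather than testing each pair of time points separately.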

App Access

To access a stable deployment of the Strongyloides RNA-seq Browser Web App, please visit: hallemlab.shinyapps.io/strongyloides_rnaseq_browser/

To view full source code for the Strongyloides RNA-seq Browser, please visit the app repository.

Sources

License

This project is licensed under the MIT License.

Authors

Update Notes

Updated by A.S.B. on approximately 11-3-21 to include S. stercoralis free-living male RNA-seq data published by Gonzalez Akimori et al. (2021).