kmuench/16p_resource

16p_resource

Introduction | Getting Started | Additional Info | Versioning | Authors | Acknowledgments

Introduction

This repo contains the code used to perform the analyses and generate the figures for Roth, Muench et al. The paper describes a new resource of patient-derived iPSCs bearing a 16p11.2 copy number variant, explores the potential utility of these clones, and describes the possible impact of clonal integration on iPSC-derived tissue models. I wrote this README with other biologists in mind who might want to follow up on our analyses or investigate their own integration effects.

The repo is divided into two sections. The section names are somewhat misleading; they are left over from an earlier revision:

  • "figure5": contains a differential expression analysis of the integration-negative clones, with reads aligned with STAR and counted with htseq-count.
  • "figure6": contains an independent bioinformatic comparison of integration-negative and integration-positive clones, with reads aligned using kallisto.

Getting Started

Data

The data will be made available on GEO (under embargo during revisions as of October 12, 2020).

Dependencies

Figure 5

setup.Rmd

DESeq.Rmd

  • In addition to the packages required for Setup, install the following

Figure 6

tximport_Setup.Rmd

deseq.Rmd

heatmaps.Rmd

barPlots.Rmd

  • In addition to the packages required for Setup, install the following

GSEA.Rmd

How to run

1. Fill out userVars.csv.

I thought it might be easier to import and document variables using this spreadsheet rather than using a .bashrc file.

2. Run the Rmd files.

Within each figure directory, the code has been broken up into several parts. You should run the code in this order:

Figure 5

  1. setup.Rmd
  2. deseq.Rmd

Figure 6

  1. tximport_setup.Rmd
  2. deseq.Rmd
  3. barPlots.Rmd OR heatmaps.Rmd OR GSEA.Rmd
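The run order above can also be scripted from a terminal instead of knitting each file by hand in RStudio. The following is only a minimal sketch, not part of the repo: it assumes `Rscript` and the `rmarkdown` R package are installed, that `userVars.csv` is already filled out, and that the directory and file names match your checkout. The helper name `render_in_order` and the `DRY_RUN` switch are hypothetical.

```shell
# Sketch: render one figure directory's notebooks in the documented order.
# Assumptions (not taken from the repo): Rscript and the rmarkdown R package
# are installed, and the directory and file names match your checkout.
set -eu

# Hypothetical helper: first argument is the figure directory,
# remaining arguments are the Rmd files in the order they should run.
render_in_order() {
  dir="$1"
  shift
  for rmd in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: print the plan without starting R.
      echo "would render ${dir}/${rmd}"
    else
      # set -e aborts on the first failed render, so a later notebook
      # never runs against missing upstream output.
      Rscript -e "rmarkdown::render('${dir}/${rmd}')"
    fi
  done
}
```

For example, `DRY_RUN=1 render_in_order figure5 setup.Rmd deseq.Rmd` prints the plan for Figure 5, and `render_in_order figure6 tximport_setup.Rmd deseq.Rmd heatmaps.Rmd` would render the Figure 6 notebooks, with heatmaps.Rmd chosen here as the third step.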

The code writes separate output for each distinct run date, where the run date is defined in userVars.csv. This way, you can keep copies of all output as you make small tweaks to the code.

Additional Info

For the alignment and counting steps, I used one of two different aligners: STAR (with htseq-count for counting) for figure5, and kallisto for figure6.

I performed both of these on the Stanford Center for Personalized Medicine Cluster. I recommend running STAR on a cluster. In theory, you should be able to run kallisto on a laptop.
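If you want to try the kallisto route on a laptop, the general shape of a run is roughly as follows. This is a hedged sketch rather than the commands used for the paper: it assumes paired-end FASTQ files, and the index name, output directory, and read file names are all hypothetical. The helper `quantify_sample` and the `DRY_RUN` switch are also illustrative only.

```shell
# Sketch: quantify one paired-end sample with kallisto.
# Assumptions (not from the repo): paired-end reads, and an index built
# beforehand with something like:
#   kallisto index -i transcripts.idx transcripts.fa
# All file and directory names here are hypothetical.
set -eu

quantify_sample() {
  index="$1"
  out_dir="$2"
  reads_1="$3"
  reads_2="$4"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run: show the command without needing kallisto installed.
    echo "kallisto quant -i ${index} -o ${out_dir} ${reads_1} ${reads_2}"
  else
    # Writes abundance.tsv, abundance.h5 and run_info.json into out_dir.
    kallisto quant -i "${index}" -o "${out_dir}" "${reads_1}" "${reads_2}"
  fi
}
```

Calling `quantify_sample transcripts.idx quant/sample1 sample1_R1.fastq.gz sample1_R2.fastq.gz` once per sample leaves one output directory per sample, which is the layout tximport expects when reading kallisto results.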

I performed subsequent analyses using R and RStudio.

Versioning

For the versions available, see the tags on this repository.

Authors

Acknowledgments

  • Thank you to PurpleBooth for the README template
  • Thank you to the Bader Lab for their GSEA tutorial.
  • Thank you to John Hanks at the SCPGM cluster and the team at the Stanford Functional Genomics Facility for their help supporting this work.
