This repository collects pipelines, codes, and some intermediate results for the study of mosaic SNV/Indels for sperm, blood, saliva samples of a transmission genomic study.

shishenyxx/Sperm_transmission_mosaicism

Sperm transmission mosaicism

This repository collects pipelines, code, and some intermediate results for the study of mosaic SNV/indels in sperm, blood, and saliva samples of a small cohort. Raw WGS data of this study are available here. Genotyping results for each sample are provided here. A re-analysis of the transmission of paternal mosaic variants from the ASD trios is provided here. A 300x WGS panel of normals is available here.

The pipelines and analyses are derived from our recent large-scale sperm study, the code for which can be found here.


1. Pipelines for processing whole-genome sequencing data

1.1 Pipelines for WGS data processing and quality control

Pipelines for pre-processing of the BAM files.

Code for depth-of-coverage and insert-size distributions.

1.2 Code for the population origin analysis

Pipeline for population analysis, and code for plotting.

1.3 Pipelines for mosaic SNV/indel calling and variant annotations

Pipelines for MuTect2 (paired mode) and Strelka2 (somatic mode) variant calling from WGS data.

Pipelines for MuTect2 (single mode); the "Full Panel of Normal" version is used. The MuTect2 (single mode) results are then processed with MosaicForecast and the variant annotation pipeline.


2. Pipelines for processing Massive Parallel Amplicon Sequencing (MPAS) data

2.1 Pipelines for MPAS data alignment and processing

Pipelines for alignment, processing, and germline variant calling of MPAS reads.

2.2 Pipelines for AF quantification and variant annotations

Pipelines for AF quantification and variant annotations.

Code to filter and annotate MPAS data.
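As a rough illustration of what AF quantification involves, the sketch below computes an allele fraction (AF) from alternate and total read counts, together with a confidence interval. It is not taken from the pipelines in this repository: the function name `allele_fraction_wilson` and the choice of the Wilson score interval are assumptions made for illustration, and the actual pipelines may use a different interval.

```python
import math


def allele_fraction_wilson(alt_reads, total_reads, z=1.96):
    """Return (af, lower, upper) for one variant.

    Hypothetical helper: the Wilson score interval is an assumption,
    not necessarily the method used by the MPAS pipelines.
    z=1.96 corresponds to an approximate 95% interval.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")

    af = alt_reads / total_reads
    denom = 1 + z * z / total_reads
    center = (af + z * z / (2 * total_reads)) / denom
    half_width = (
        z
        * math.sqrt(af * (1 - af) / total_reads + z * z / (4 * total_reads ** 2))
        / denom
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return af, max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    af, lower, upper = allele_fraction_wilson(alt_reads=50, total_reads=1000)
    print(f"AF={af:.4f} CI=({lower:.4f}, {upper:.4f})")
```

A variant could then be compared against a control sample by checking whether the two intervals overlap; any such threshold is a design choice that this sketch leaves open.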


3. Pipelines for the data analysis, variant filtering, comprehensive annotations, and statistical analysis

3.1 Pipelines for mosaic variant determination, annotations, and plotting

After variant calling with the different strategies, variants were annotated and filtered by a Python script. Positive mosaic variants were then annotated with their transmission to multiple samples and with additional information.

3.2 Pipelines for statistical analysis and the related plotting

Code for the estimation of expected transmissions, assuming independent transmission, via a dynamic programming algorithm.
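One way to picture a dynamic programming estimate under independence is the sketch below. It assumes, hypothetically, that each variant has its own transmission probability (for example, derived from its sperm AF) and that variants transmit independently, so the number of transmitted variants follows a Poisson-binomial distribution. The function names and the input format are illustrative and are not taken from the repository code.

```python
def transmission_count_distribution(probs):
    """Return dist where dist[k] = P(exactly k variants are transmitted).

    Assumption: each variant i transmits independently with probability
    probs[i]. The table is built one variant at a time (O(n^2) overall).
    """
    dist = [1.0]  # with zero variants, zero transmissions is certain
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        updated = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            updated[k] += mass * (1.0 - p)  # this variant is not transmitted
            updated[k + 1] += mass * p      # this variant is transmitted
        dist = updated
    return dist


def expected_transmissions(probs):
    """Expected number of transmitted variants under the same assumption."""
    dist = transmission_count_distribution(probs)
    return sum(k * mass for k, mass in enumerate(dist))


if __name__ == "__main__":
    # Hypothetical per-variant transmission probabilities.
    example_probs = [0.05, 0.10, 0.02, 0.20]
    print(transmission_count_distribution(example_probs))
    print(expected_transmissions(example_probs))
```

The full distribution, rather than only its mean, makes it possible to ask how surprising an observed number of transmissions would be under independence.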

Code and example data for the permutation analysis to estimate the independence of transmission in each family.
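The general idea of a permutation analysis for independence can be sketched as follows. This is an assumption-laden illustration, not the repository's method: the input is a hypothetical 0/1 matrix with one row per variant and one column per child, the test statistic (number of variant pairs carried by the same child) is chosen for simplicity, and each variant's row is shuffled across children to break any co-transmission while keeping its transmission count.

```python
import random


def cotransmission_score(matrix):
    """Count variant pairs that were transmitted to the same child.

    matrix[i][c] is 1 if variant i was transmitted to child c, else 0
    (hypothetical input format).
    """
    if not matrix:
        return 0
    score = 0
    for child in range(len(matrix[0])):
        carried = sum(row[child] for row in matrix)
        score += carried * (carried - 1) // 2
    return score


def permutation_p_value(matrix, n_perm=1000, seed=0):
    """One-sided p-value for excess co-transmission within a family.

    Each variant's row is shuffled independently across children, which
    preserves how often the variant was transmitted but not to whom.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = cotransmission_score(matrix)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(row, len(row)) for row in matrix]
        if cotransmission_score(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical family: five variants, four children, all to child 0.
    family = [[1, 0, 0, 0] for _ in range(5)]
    print(permutation_p_value(family))
```

A small p-value here would suggest that variants travel together more often than independent transmission predicts; other statistics could be substituted without changing the permutation scheme.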

Code and data for the re-analysis of transmission in 8 ASD families previously analyzed in the first and second studies.


4. Contact:

📧 Martin Breuss: martin.breuss@cuanschutz.edu

📧 Xiaoxu Yang: xiy010@health.ucsd.edu, yangxiaoxu-shishen@hotmail.com

📧 Joseph Gleeson: jogleeson@health.ucsd.edu, or the Gleeson lab gleesonlab@health.ucsd.edu


5. Cite the code

Breuss MW, Yang X, et al., Gleeson JG. Unbiased mosaic variant assessment in sperm: a cohort study to test predictability of transmission. 2022. (eLife, DOI:10.7554/eLife.78459, PMID:35787314)
