Testdata validation

Code used to validate allele frequency estimates from our poolSeq data by comparing them with estimates from individual sequence (indSeq) data from the same individuals, and to validate megaSNPs as likely regions where paralogs misalign and cause false-positive SNPs in our data.

Usage

If you use or are inspired by code from this repo, please cite the related manuscript and data:

Data

NCBI Bioproject PRJNA744263 - https://www.ncbi.nlm.nih.gov/bioproject/PRJNA744263/
Dryad - contains all filtered SNP calls, pipeline config files, and metadata for SRA and Biosamples -
Zenodo - contains an archived release of this repository -

Lind et al. (in press) Haploid, diploid, and pooled exome capture recapitulate features of biology and paralogy in two non-model tree species. Accepted to Molecular Ecology Resources. Available on bioRxiv https://doi.org/10.1101/2020.10.07.329961

Repository structure

Below are the descriptions of notebooks in this repo. Notebooks can be viewed in the repository but are best viewed at https://nbviewer.jupyter.org (hyperlinks below). Notebooks 002 and 003 contain figures found in the main and supplemental texts.

Full repository

001_testdata_explore.ipynb

Explore the data and isolate the set of SNPs shared by both baseline-filtered datasets (indSeq and poolSeq) for both Doug-fir and Jack pine.
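The intersection step can be illustrated with a minimal sketch. This is not the notebook's code: it assumes each dataset has been reduced to a collection of (chromosome, position) keys, and the names `shared_loci`, `indseq`, and `poolseq` are hypothetical, as are the toy values.

```python
def shared_loci(indseq_loci, poolseq_loci):
    """Return loci present in both call sets, sorted by (chrom, pos)."""
    return sorted(set(indseq_loci) & set(poolseq_loci))


# Toy loci (hypothetical values, not taken from the real data)
indseq = [("chr1", 100), ("chr1", 250), ("chr2", 40)]
poolseq = [("chr1", 250), ("chr2", 40), ("chr3", 7)]

common = shared_loci(indseq, poolseq)
print(common)
```

Keying on both chromosome and position avoids treating the same coordinate on different contigs as one SNP.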

002_testdata_compare_AFs.ipynb

This notebook takes the SNPs shared by the indSeq and poolSeq methods for Doug-fir and Jack pine (from 001_testdata_explore.ipynb) and investigates filtering methods that improve agreement between indSeq and poolSeq allele frequency estimates.
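One way to quantify agreement between the two sets of estimates is sketched below. It is an illustration only: the depth filter, the record fields (`ind_af`, `pool_af`, `depth`), the function names, and the toy values are all assumptions, and the notebook may use different filters and statistics.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def agreement(records, min_depth=0):
    """Summarize indSeq vs poolSeq agreement for SNPs passing a depth filter."""
    kept = [r for r in records if r["depth"] >= min_depth]
    ind = [r["ind_af"] for r in kept]
    pool = [r["pool_af"] for r in kept]
    return {
        "n": len(kept),
        "r": pearson_r(ind, pool),
        "mean_abs_diff": mean(abs(a - b) for a, b in zip(ind, pool)),
    }


# Toy SNP records (hypothetical); the last one has low depth and disagrees
records = [
    {"ind_af": 0.10, "pool_af": 0.12, "depth": 80},
    {"ind_af": 0.50, "pool_af": 0.45, "depth": 60},
    {"ind_af": 0.90, "pool_af": 0.88, "depth": 100},
    {"ind_af": 0.30, "pool_af": 0.70, "depth": 8},
]

before = agreement(records)
after = agreement(records, min_depth=20)
print(before)
print(after)
```

Comparing the summary before and after a candidate filter shows whether the filter removes SNPs that disagree between methods, at the cost of a smaller SNP set.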

003_testdata_validate_megaSNPs.ipynb

Validate sites that are called heterozygous in haploid tissue as sites potentially within a region subject to paralog misalignment.
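The idea of flagging heterozygous calls from haploid tissue can be sketched as follows. This is a hedged illustration, not the notebook's method: it assumes genotypes are stored as VCF-style strings such as `0/1`, and the function names, the threshold parameter, and the toy data are hypothetical.

```python
def het_fraction(genotypes):
    """Fraction of called genotypes that are heterozygous (e.g. '0/1')."""
    called = [g for g in genotypes if g not in ("./.", ".")]
    if not called:
        return 0.0
    het = sum(
        1 for g in called
        if len(set(g.replace("|", "/").split("/"))) > 1
    )
    return het / len(called)


def flag_candidate_sites(site_genotypes, min_het_fraction=0.0):
    """Return sites whose heterozygous fraction exceeds the threshold."""
    return [
        site for site, gts in site_genotypes.items()
        if het_fraction(gts) > min_het_fraction
    ]


# Toy genotype calls from haploid samples (hypothetical values)
calls = {
    "chr1:100": ["0/0", "1/1", "0/0"],
    "chr1:250": ["0/1", "0/1", "1/1"],
    "chr2:40": ["./.", "0/0"],
}

flagged = flag_candidate_sites(calls)
print(flagged)
```

Because haploid tissue carries a single allele per locus, any heterozygous call there is unexpected, which is why such sites are candidates for paralog misalignment.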

004_misc_suppmat.ipynb

Calculate values reported in tables and the Supplemental Material.

005_transfer_to_SRA.ipynb

Code to create SRA metadata and BioSample metadata, and to upload fastq files to the NCBI Sequence Read Archive (SRA) FTP server with Python.
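An FTP upload from Python can be sketched with the standard library's `ftplib`. This is a generic illustration under stated assumptions, not the notebook's code: the host, credentials, and remote directory are placeholders that NCBI provides per submission, and the function names are hypothetical.

```python
from ftplib import FTP
from pathlib import Path


def upload_fastqs(ftp, fastq_paths, remote_dir):
    """Upload each file to remote_dir over an open FTP connection."""
    ftp.cwd(remote_dir)
    uploaded = []
    for path in map(Path, fastq_paths):
        with path.open("rb") as handle:
            # STOR writes the file under its base name in the current directory
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded


def upload_via_ftp(host, user, password, fastq_paths, remote_dir):
    """Open a connection, log in, and upload (all arguments are placeholders)."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return upload_fastqs(ftp, fastq_paths, remote_dir)
```

Separating the transfer loop from the connection setup makes it possible to exercise the loop against a stand-in object without contacting a server.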

The pythonimports code used in the notebooks can be found here: https://github.com/brandonlind/pythonimports

brandonlind/testdata_validation