document how to generate sample list file #5

kdaily · 2018-08-22T20:36:40Z

No description provided.

bintriz · 2018-09-14T19:33:15Z

#!/bin/bash

# Query Synapse for all FASTQ files, skip the first line of output, drop the
# first three columns, and reorder the rest into:
# group, sample_id, file, synapse_id, assay, processingKit.
# sample_id is built by joining four fields with hyphens.
synapse query "select name, id, sample_id_biorepository, sample_id_original, experiment_id, grant, group, assay, processingKit from syn7871084 where fileFormat='fastq'" \
    |tail -n+2 \
    |cut -f4- \
    |awk -F"\t" '{print $7"\t"$3"-"$4"-"$5"-"$6"\t"$1"\t"$2"\t"$8"\t"$9}' \
    |sort > tmp.fastq.txt

# Header line shared by every sample list.
printf "group\tsample_id\tfile\tsynapse_id\tassay\tprocessingKit\n" > tmp.header.txt

# Split the FASTQ records into one sample list per data type.
# Records containing -535- or -797- go to the shallow WGS list and are
# excluded from the regular WGS list.
{ cat tmp.header.txt; grep 10X tmp.fastq.txt; } > Samples.10X_WGS_fastq.txt
{ cat tmp.header.txt; grep wholeGenomeSeq tmp.fastq.txt |grep -v -e 10X -e '-535-' -e '-797-'; } > Samples.regular_WGS_fastq.txt
{ cat tmp.header.txt; grep -e '-535-' -e '-797-' tmp.fastq.txt; } > Samples.shallow_WGS_fastq.txt
{ cat tmp.header.txt; grep exomeSeq tmp.fastq.txt; } > Samples.WES_fastq.txt
{ cat tmp.header.txt; grep targetedSeq tmp.fastq.txt; } > Samples.Targeted_fastq.txt

rm tmp.fastq.txt

# Same query for the Vaccarino group's BAM files, excluding 10X records.
{ cat tmp.header.txt
synapse query "select name, id, sample_id_biorepository, sample_id_original, experiment_id, grant, group, assay, processingKit from syn7871084 where group='Vaccarino' and fileFormat='bam'" \
    |tail -n+2 \
    |cut -f4- \
    |awk -F"\t" '{print $7"\t"$3"-"$4"-"$5"-"$6"\t"$1"\t"$2"\t"$8"\t"$9}' \
    |sort \
    |grep -v 10X
} > Samples.regular_WGS_bam.txt

rm tmp.header.txt

This shell script is what I used to generate the sample lists for the BSMN ref brain data. Of the columns, my pipeline only uses sample_id, file, and synapse_id, and the order of the columns doesn't matter. Of course, this pipeline is pretty specific to the BSMN ref brain samples.
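Since only sample_id, file, and synapse_id are used and column order doesn't matter, one way to check a sample list is to pull those three columns out by header name. The following is a minimal sketch under that assumption, not the pipeline's actual parsing code; the function name select_sample_columns is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper (not part of the original pipeline): print the
# sample_id, file, and synapse_id columns of a tab-separated sample list.
# Columns are located by header name, so their order does not matter.
select_sample_columns() {
    awk -F"\t" -v OFS="\t" '
        NR == 1 {
            # Map each header name to its column index.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("sample_id" in col) || !("file" in col) || !("synapse_id" in col)) {
                print "missing required column" > "/dev/stderr"
                exit 1
            }
            next
        }
        # Emit the three required fields for every data row.
        { print $(col["sample_id"]), $(col["file"]), $(col["synapse_id"]) }
    ' "$1"
}
```

For example, select_sample_columns Samples.WES_fastq.txt prints one tab-separated line per record and exits with a non-zero status if any of the three headers is missing.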

kdaily · 2019-01-16T22:19:31Z

Can you put this in an executable script in this repository, and document it in the README? Then we can close.

kdaily · 2019-01-16T22:19:53Z

@attilagk it would be great if you can verify for @bintriz that this is sufficient.

kdaily self-assigned this Sep 12, 2018

kdaily added this to the Sprint 1 milestone Sep 12, 2018

kdaily modified the milestones: Sprint 1, Get current pipeline to work for another user Oct 5, 2018

kdaily assigned bintriz and unassigned kdaily Jan 16, 2019

kdaily assigned attilagk Jan 16, 2019
