GitHub - NorwegianVeterinaryInstitute/DemultiplexRawSequenceData: A workflow automation script: demultiplex the library sequence, run quality checks, deliver to archiving and processing afterwards

demultiplex_script.py

Demutliplex a MiSEQ or NextSEQ run, perform QC using FastQC and MultiQC and deliver files either to VIGASP for analysis or NIRD for archiving

Replace with relevant run id. Example : "190912_M06578_0001_000000000-CNNTP". RunID breaks down like this (date +%y%m%d/yymmdd_MACHINE-SERIAL-NUMBER_AUTOINCREASING-NUMBER-OF-RUN_000000000-FlowcellID-used-for-this-run .

Note: don't bother with enforcing ISO dates for the directory name. It is an Illumina standard and they do not care.

Software requirements

Python > v3.9
bcl2fastq ( from https://emea.support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software/downloads.html )
FastQC    ( https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ )
MultiQC   ( pip3 install multiqc )

Directory structure on seqtech00

├── bin															binaries and symlinks of binaries live here
├── clarity 													exported Illumina Clarity directory
│   ├── gls_events	
│   ├── logs 													clarity logs go here
│   ├── miseq													miseq-clarity stopover directory
│   │   └── M06578												per serial number
│   │   	└── samplesheets									samplesheets for this serial number gohere
│   └── nextseq													nextseq-clarity stopover directory
│   	└── NB552450											per serial number
│   		└── samplesheets									samplesheets for this serial number gohere
├── demultiplex													demultiplexed data directory
├── for_transfer												data ready to be transfered over to NIRD or VIGASP
├── logs														all demultiplexing logs go here
├── rawdata														raw data directory, sequencers write here
│   ├── bad_runs												runs which are bad, or rejected
│   └── control_runs											water/other control runs
└── samplesheets												cummulative backups of all samplesheets
├── M06578 -> /data/clarity/miseq/M06578/samplesheets/			symlinmk to sample sheets for convinience
└── NB552450 -> /data/clarity/nextseq/NB552450/samplesheets/	symlinmk to sample sheets for convinience

Procedure

MiSeq writes as MiSEQ- to /data/scratch; shared folder Z:\ (alias rawdata) in MiSeq
Lab members modify an existing SampleSheet.csv file to include the new project data, then save the new file to the <RunId> folder in Z:\ and a copy within Z:\SampleSheets\ as <RunId>\SampleSheet.csv

Example:

Z:\190912_M06578_0001_000000000-CNNTP
        ├── SampleSheets.csv
Z:\SampleSheets
        ├── 190912_M06578_0001_000000000-CNNTP.csv

Cron job runs every 30 minutes if it finds a new run, RTAComplete.txt and SampleSheet.csv files within the run new, it starts the demultiplexing script
It can be manually started as below

/usr/bin/python3 /data/bin/demultiplex_script.py \<RunID\>

as the relevant user.

Name		Name	Last commit message	Last commit date
Latest commit History 362 Commits
ansible		ansible
.gitignore		.gitignore
CHANGELOG		CHANGELOG
LICENCE.md		LICENCE.md
README.md		README.md
count_and_list_unique_project_runs_in_2022.py		count_and_list_unique_project_runs_in_2022.py
cron_job.py		cron_job.py
crontab		crontab
current_demultiplex_script.py		current_demultiplex_script.py
demultiplex_script.py		demultiplex_script.py
test.sh		test.sh
transfer_to_NIRD.md		transfer_to_NIRD.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ansible

ansible

.gitignore

.gitignore

CHANGELOG

CHANGELOG

LICENCE.md

LICENCE.md

README.md

README.md

count_and_list_unique_project_runs_in_2022.py

count_and_list_unique_project_runs_in_2022.py

cron_job.py

cron_job.py

crontab

crontab

current_demultiplex_script.py

current_demultiplex_script.py

demultiplex_script.py

demultiplex_script.py

test.sh

test.sh

transfer_to_NIRD.md

transfer_to_NIRD.md

Repository files navigation

demultiplex_script.py

Directory structure on seqtech00

Procedure

About

Releases

Packages

Contributors 4

Languages

License

NorwegianVeterinaryInstitute/DemultiplexRawSequenceData

Folders and files

Latest commit

History

Repository files navigation

demultiplex_script.py

Directory structure on seqtech00

Procedure

About

Topics

Resources

License

Stars

Watchers

Forks

Languages