Sufficient information to run the pipeline #1

Open
Laolga opened this issue Feb 5, 2024 · 3 comments

Comments

@Laolga

Laolga commented Feb 5, 2024

Dear authors,
Please provide information needed to execute your pipeline:

  1. What is the format of the samplesheet?
  2. How can one know the barcodes before running any analysis?
  3. What is BARCODE_START_CYCLE?
  4. What is rc?
  5. What is ad?
@pas2182
Contributor

pas2182 commented Feb 6, 2024

  1. When you sequence with an Illumina sequencer, there is a standard file called a "sample sheet" that is required for demultiplexing by the Illumina Experiment Manager. It is also required for demultiplexing with Cell Ranger, whose documentation describes how to format this file: https://www.10xgenomics.com/support/software/cell-ranger/latest/analysis/inputs/cr-mkfastq.
  2. 10x Genomics uses a pre-defined set of barcode sequences for each kit. For example, for the 10x Multiome kit, you can find the details of the barcode sequences here: https://kb.10xgenomics.com/hc/en-us/articles/4412343032205-Where-can-I-find-the-barcode-whitelist-s-for-Single-Cell-Multiome-ATAC-GEX-product.
  3. BARCODE_START_CYCLE is the sequencing cycle, within the read that contains the cell-identifying barcode, at which the first base of that barcode is read.
  4. rc = reverse complement. Use this option if your cell-identifying barcode list contains the reverse complement of the barcodes read by the sequencer.
  5. ad = adapter. This is the adapter sequence to be trimmed from the end of short fragments.
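
To make points 3 to 5 concrete, here is a minimal Python sketch of what the three options mean. It is not taken from the dna10x code: the function names are hypothetical, the start cycle is assumed to be 1-based, the 16-base barcode length is an assumption, and the barcode shown is a toy sequence rather than an entry from a real whitelist.

```python
# Illustrative helpers only; none of these names come from the dna10x pipeline.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (what -rc refers to)."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_barcode(read: str, start_cycle: int, length: int = 16) -> str:
    """Slice the cell barcode out of the barcode-containing read.

    start_cycle is ASSUMED to be 1-based (cycle 1 = first base of the read).
    length=16 is ASSUMED; check the barcode list you actually use.
    """
    begin = start_cycle - 1
    return read[begin:begin + length]


def trim_adapter(fragment_read: str, adapter: str) -> str:
    """Cut the read at the first exact, full-length adapter match, if any."""
    pos = fragment_read.find(adapter)
    return fragment_read if pos == -1 else fragment_read[:pos]


if __name__ == "__main__":
    whitelist = {"AAACCCGGGTTTACGA"}  # toy whitelist, not a real 10x barcode
    # Read whose barcode starts at cycle 9 and is stored reverse-complemented.
    read = "NNNNNNNN" + reverse_complement("AAACCCGGGTTTACGA")
    observed = extract_barcode(read, start_cycle=9)
    print(observed in whitelist)                      # barcode as sequenced
    print(reverse_complement(observed) in whitelist)  # barcode after rc
    print(trim_adapter("ACGTACGTCTGTCTCTTATACACATCTGG", "CTGTCTCTTATACACATCT"))
```

A real trimmer such as cutadapt also handles partial adapter matches at the read end and sequencing errors; the exact-match version here only illustrates the idea.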

@mariaZig

Hello,

Thanks for the custom pipelines and for this nice protocol!

On a similar note, I'm having trouble understanding the exact input I should use to run the DNA-based pipeline.

Would it be possible to give me a specific example?

Please provide if possible an example samplesheet.csv file and also a specific value for the "--directory" parameter assuming that I already have my FASTQ files ready, so I don't need to run cellranger to produce them from the BCL files.

Thanks in advance,
Maria

@tro2104

tro2104 commented May 28, 2024

Hello Maria,

Here is an example of a few runs and how to set up the software. It assumes the FASTQ files are in the directory structure that bcl2fastq would create, so if that path does not exist yet, you need to create it and put your FASTQ files in it.

Create conda environment
conda create -n cutadapt -c bioconda -c conda-forge cutadapt python=3.9 bwa pysam samtools numpy
Download dna10x pipeline from github
Create sample sheet in directory with the pipeline
vim ss.csv
i
Lane,Sample,Index
*,PTO035,SI-NA-F1
<Esc>:wq
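
If you would rather not type the sample sheet by hand, the same three-column file can be written from Python. This is only a sketch: `write_samplesheet` is a hypothetical helper, not part of dna10x, and it assumes the pipeline needs nothing beyond the Lane,Sample,Index header and rows shown above.

```python
import csv
from pathlib import Path


def write_samplesheet(path, rows):
    """Write a Lane,Sample,Index sample sheet and return its path.

    rows is an iterable of (lane, sample, index) tuples.
    """
    path = Path(path)
    with path.open("w", newline="") as handle:
        # Plain "\n" line endings, matching what an editor like vim writes.
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["Lane", "Sample", "Index"])
        writer.writerows(rows)
    return path


# Example call reproducing the sheet from this comment:
# write_samplesheet("ss.csv", [("*", "PTO035", "SI-NA-F1")])
```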

Download reference or use 10x cellranger references

Make Directories
Within the dna10x directory that contains all the associated .py files, create the following path (-p creates the nested directories in one step)
mkdir -p PTO035/outs/fastq_path/PTO035/PTO035/
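
The same layout can be created and checked from Python before launching a run. This is a sketch under assumptions: the helper names are hypothetical, the mapping of path components to the -d value and the Sample column is a guess (all three are PTO035 in this example, so the source does not disambiguate them), and the `*.fastq.gz` pattern assumes gzipped FASTQ files.

```python
from pathlib import Path


def fastq_dir(pipeline_root, directory, sample):
    """Build <root>/<directory>/outs/fastq_path/<directory>/<sample>.

    ASSUMPTION: the first two components are the -d value and the last one
    is the Sample from the sample sheet.
    """
    return Path(pipeline_root) / directory / "outs" / "fastq_path" / directory / sample


def prepare_fastq_dir(pipeline_root, directory, sample):
    """Create the nested FASTQ directory if it does not exist yet."""
    target = fastq_dir(pipeline_root, directory, sample)
    target.mkdir(parents=True, exist_ok=True)
    return target


def list_fastqs(target):
    """Return the sorted names of gzipped FASTQ files found in target."""
    return sorted(p.name for p in Path(target).glob("*.fastq.gz"))


# Example, run from inside the dna10x directory:
# target = prepare_fastq_dir(".", "PTO035", "PTO035")
# print(target, list_fastqs(target))
```

An empty list from `list_fastqs` is a quick hint that the FASTQ files have not been copied into place yet.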

Run pipeline
<With existing FASTQs, assumed to be located in dna10x/PTO035/outs/fastq_path/PTO035/PTO035/>
nohup python dna10x.py --samplesheet ss.csv -d PTO035 -b /opt/cellranger-atac-2.0.0/lib/python/atac/barcodes/737K-arc-v1.txt -t 16 -r /opt/refdata-cellranger-arc-GRCh38-2020-A-2.0.0/fasta/genome.fa -i 1000 -m 0.9 -c -sf -p 1 -rc -ad CTGTCTCTTATACACATCT &

<With BCL to fastq, may need to install bcl2fastq>
nohup python dna10x.py --bcl ~/230407_NB551203_0654_AH5LM2BGXT --samplesheet ss.csv -d PTO035 -b /opt/cellranger-atac-2.0.0/lib/python/atac/barcodes/737K-arc-v1.txt -t 16 -r /opt/refdata-cellranger-arc-GRCh38-2020-A-2.0.0/fasta/genome.fa -i 1000 -m 0.9 -p 1 -rc -ad CTGTCTCTTATACACATCT -c &

Tim
