additional sequences #373

Open
ljy-sys opened this issue Sep 15, 2023 · 4 comments


ljy-sys commented Sep 15, 2023

Thanks for this tool, which helps us analyze Smart-seq3 data. However, when I use the parameter “additional_files” to add an exogenous sequence, the result without this sequence is not a umicount of 0, and the expression of the other genes is consistent with a run without this sequence. I sincerely hope to get your help, thank you.

this is my .yaml file:

project: Q2Q4
sequence_files:
  file1:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.R1.fastq.gz
    base_definition:
      - cDNA(23-150)
      - UMI(12-19)
    find_pattern: ATTGCGCAATG
  file2:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.R2.fastq.gz
    base_definition:
      - cDNA(1-150)
  file3:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.I1.fastq.gz
    base_definition:
      - BC(1-8)
  file4:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.I2.fastq.gz
    base_definition:
      - BC(1-8)
reference:
  STAR_index: /storage1/project_raw/auto/test_with_CAR/reference/reference/star
  GTF_file: /storage1/project_raw/auto/test_with_CAR/reference/genes.gtf
  additional_STAR_params: '--limitSjdbInsertNsj 2000000 --clip3pAdapterSeq CTGTCTCTTATACACATCT'
  additional_files: /storage1/project_raw/auto/test_with_CAR/reference/CAR.fa
out_dir: /storage1/project_raw/auto/test_with_CAR/zUMIs/smartseq3/Q2Q4
num_threads: 32
mem_limit: 100
filter_cutoffs:
  BC_filter:
    num_bases: 3
    phred: 20
  UMI_filter:
    num_bases: 3
    phred: 20
barcodes:
  barcode_num: 96
  barcode_file: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/barcode.txt
  automatic: no
  BarcodeBinning: 1
  nReadsperCell: 100
counting_opts:
  introns: yes
  downsampling: '0'
  strand: 0
  Ham_Dist: 1
  velocyto: no
  primaryHit: yes
  twoPass: no
make_stats: yes
which_Stage: Filtering
Rscript_exec: Rscript
STAR_exec: STAR
pigz_exec: pigz
samtools_exec: samtools

ljy-sys commented Sep 15, 2023


I also got the file "additional_sequence_annot.gtf", and the genes.gtf contains the additional information.
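
One way to double-check that statement is to compare the sequence names in the FASTA file with the names used in the first column (seqname) of the GTF. The following is only a minimal sketch, not part of zUMIs: the function names and the example contents are made up for illustration, and it assumes the annotation refers to the added sequence by its FASTA header name.

```python
def fasta_names(fasta_text):
    """Return sequence names: the first whitespace-delimited token after '>'."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.append(tokens[0])
    return names


def gtf_seqnames(gtf_text):
    """Return the set of seqnames (column 1) found in GTF text."""
    names = set()
    for line in gtf_text.splitlines():
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t")[0])
    return names


def missing_from_gtf(fasta_text, gtf_text):
    """List FASTA sequence names that never occur as a GTF seqname."""
    annotated = gtf_seqnames(gtf_text)
    return [name for name in fasta_names(fasta_text) if name not in annotated]


# Hypothetical file contents, for illustration only.
example_fasta = ">CAR exogenous construct\nACGT\n>GFP\nTTGA\n"
example_gtf = '#comment\nCAR\tuser\texon\t1\t4\t.\t+\t.\tgene_id "CAR";\n'
print(missing_from_gtf(example_fasta, example_gtf))
```

With real data, the two texts could be read from CAR.fa and the GTF via `open(path).read()`. An empty list would suggest every added sequence has at least one annotation line.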

@cziegenhain
Collaborator

Hi,

Sorry, I do not understand the issue. What do you mean by "the result without this sequence is not a umicount of 0"?

ljy-sys commented Sep 20, 2023

I mean that the inserted sequence does not appear in the umicount statistics, and the gene_names.txt file does not contain the gene either.

@cziegenhain
Collaborator

Could you confirm that the BAM file has any aligned reads for the added sequence from the fasta file?
You could, e.g., run samtools idxstats on the .filtered.Aligned.GeneTagged.sorted.bam file (which needs its .bai index) for this.
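
As a minimal sketch of that check (not part of zUMIs): `samtools idxstats` prints one tab-separated line per reference sequence, giving its name, length, mapped read count, and unmapped read count. The Python helper below parses that text and returns the mapped count for one sequence. The function name `mapped_reads`, the sequence name `CAR`, and the numbers in the example are assumptions for illustration; use the header name from your own CAR.fa.

```python
def mapped_reads(idxstats_text, seq_name):
    """Return the mapped-read count for seq_name in `samtools idxstats` output,
    or None if the sequence is not among the references listed."""
    for line in idxstats_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, _length, mapped, _unmapped = fields[:4]
        if name == seq_name:
            return int(mapped)
    return None


# Hypothetical idxstats output, for illustration only.
example = "chr1\t248956422\t1203456\t0\nCAR\t1500\t0\t0\n*\t0\t0\t4521\n"
print(mapped_reads(example, "CAR"))
```

To feed it real data, the text could be captured with `subprocess.run(["samtools", "idxstats", bam_path], capture_output=True, text=True).stdout`. A result of `None` would suggest the sequence was not in the reference used for mapping, while `0` would suggest it is in the reference but no reads aligned to it.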
