low align rate conflicting with bismark align rate #81

Open
yinzhuo0322 opened this issue Dec 16, 2022 · 4 comments

Comments

@yinzhuo0322

yinzhuo0322 commented Dec 16, 2022

Hello, we ran into a problem when using methylpy to align the data from SRR6911777. methylpy gave an alignment rate of less than 1%, whereas bismark gave an alignment rate of about 35% on the same sample. Other samples behaved the same way as SRR6911777. This is confusing, and we hope you can give us some suggestions. Our methylpy command is as follows:
methylpy paired-end-pipeline \
    --read1-files $Read1 \
    --read2-files $Read2 \
    --sample $sampleid \
    --forward-ref ${ref_methylpy_mm10}/mm10.genome_f \
    --reverse-ref ${ref_methylpy_mm10}/mm10.genome_r \
    --ref-fasta ${ref_methylpy_mm10}/mm10.genome.fa \
    --path-to-output ${outdir}/methypy_${species} \
    --num-procs $threads \
    --path-to-picard $picard \
    --path-to-samtools $samtools \
    --path-to-cutadapt $cutadapt \
    --remove-chr-prefix False \
    --keep-temp-files True \
    --trim-reads True

@yupenghe
Owner

What was the command you used to run bismark? Can you take a quick look at the base composition of read1 and read2 fastq (you can use fastqc too)? Do the reads in read1 fastq have mostly A, T and G? Knowing these would be helpful for troubleshooting.
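If fastqc is not at hand, a rough base-composition check can be done with a few lines of shell. This is only a sketch: `base_composition` is a hypothetical helper name, and it assumes uncompressed FASTQ with the standard four-line record layout arriving on stdin.

```shell
# Print the percentage of A, C, G, T and N among the bases of a FASTQ stream.
# Hypothetical helper: reads uncompressed FASTQ on stdin and assumes each
# record is exactly four lines (header, sequence, "+", qualities).
base_composition() {
    awk '
        NR % 4 == 2 {                          # sequence line of each record
            seq = toupper($0)
            for (i = 1; i <= length(seq); i++)
                count[substr(seq, i, 1)]++
            total += length(seq)
        }
        END {
            if (total == 0) { print "no bases read"; exit 1 }
            n = split("A C G T N", bases, " ")
            for (i = 1; i <= n; i++)
                printf "%s\t%.1f%%\n", bases[i], 100 * count[bases[i]] / total
        }
    '
}

# Example (assumes gzipped input; drop "gzip -dc" for plain FASTQ):
#   gzip -dc "$Read1" | head -n 400000 | base_composition
#   gzip -dc "$Read2" | head -n 400000 | base_composition
```

Running it on both files and comparing the C share of read1 with that of read2 answers the question above: whichever file has very little C is the one with mostly A, T and G.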

@yinzhuo0322
Author

Thanks a lot for your quick reply! Here is my bismark command:
bismark --multicore $cores \
    --fastq --non_directional --unmapped \
    --nucleotide_coverage \
    --path_to_bowtie $bowtie2 --bowtie2 \
    --genome_folder $ref_bis \
    --output_dir $outdir/bismark_${species} \
    --temp_dir $outdir/bismark_${species}/temp_bismark_${sampleid} \
    -1 $trim_Read1 -2 $trim_Read2
Here are the fastqc results for read1 and read2. It seems that the reads in the read2 fastq have mostly A, T and G:
read1 fastqc:
Screenshot 2022-12-16 16 57 54
read2 fastqc:
Screenshot 2022-12-16 16 59 45

@yupenghe
Owner

In your case, the data show the opposite pattern from what bismark and methylpy expect for a directional library. For methylpy you will need to use the --pbat option. Bismark also has a --pbat option for handling this kind of data. You were using --non_directional, which can work but is not ideal.
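For methylpy, the change might look like the sketch below: the command from the first post with one flag added. Passing `True` as the value of `--pbat` is an assumption based on how the other boolean flags in that command are written, so it is worth confirming against `methylpy paired-end-pipeline --help`. The function name and the `METHYLPY` override are hypothetical and exist only so the command can be previewed without running it.

```shell
# Hypothetical wrapper around the methylpy command from the first post.
# The only change is the final --pbat flag; the value "True" is an assumption.
# METHYLPY is a made-up override: set METHYLPY=echo to print the command
# instead of running it.
run_methylpy_pbat() {
    "${METHYLPY:-methylpy}" paired-end-pipeline \
        --read1-files "$Read1" \
        --read2-files "$Read2" \
        --sample "$sampleid" \
        --forward-ref "${ref_methylpy_mm10}/mm10.genome_f" \
        --reverse-ref "${ref_methylpy_mm10}/mm10.genome_r" \
        --ref-fasta "${ref_methylpy_mm10}/mm10.genome.fa" \
        --path-to-output "${outdir}/methypy_${species}" \
        --num-procs "$threads" \
        --path-to-picard "$picard" \
        --path-to-samtools "$samtools" \
        --path-to-cutadapt "$cutadapt" \
        --remove-chr-prefix False \
        --keep-temp-files True \
        --trim-reads True \
        --pbat True
}
```

With the same variables set as in the original script, `METHYLPY=echo; run_methylpy_pbat` prints the full command for inspection, and unsetting `METHYLPY` before calling the function runs the pipeline.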

@yinzhuo0322
Author

yinzhuo0322 commented Dec 19, 2022

Thanks for your expert suggestions! After adding the --pbat option, the alignment rates look much more normal. We very much appreciate your wonderful pipeline and advice. May everything go well with you!
