Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how to acquire or calculate the mapping efficiency using methylpy? #82

Open
InfiniteLaugh opened this issue Feb 13, 2023 · 3 comments
Open

Comments

@InfiniteLaugh
Copy link

Hello, as a new learner of methylpy I am curious about how to acquire mapping efficiency, since it seems not shown in the output result in total. I used to use bismark, and the mapping efficiency is about 40-60%, which is quite low when compared with other plant species. I have looked through bowtie2 report but it seems different from this.
Therefore i try to replace it this methylpy.

my code is as follows:
methylpy paired-end-pipeline --read1-files ../samplename_clean.1.fq.gz --read2-files ../samplename_clean.2.fq.gz --sample samplename --forward-ref ../methypy_ref/genome_f --reverse-ref ../methypy_ref/genome_r --ref-fasta ../fasta --num-procs 20 --trim-reads False --compress-output True --remove-clonal True --path-to-picard /direction_to_picard/software/

my result report is as follows:

Begin splitting reads for libA
Mon Feb 13 00:52:04 2023

No trimming applied on reads
Mon Feb 13 01:08:03 2023

Begin converting reads for libA
Mon Feb 13 01:08:03 2023

Begin Running Bowtie2 for libA
Mon Feb 13 01:09:03 2023

109022310 reads; of these:
  109022310 (100.00%) were paired; of these:
    52732411 (48.37%) aligned concordantly 0 times
    30919854 (28.36%) aligned concordantly exactly 1 time
    25370045 (23.27%) aligned concordantly >1 times
51.63% overall alignment rate
Processing forward strand hits
Mon Feb 13 02:53:36 2023

109022310 reads; of these:
  109022310 (100.00%) were paired; of these:
    52872328 (48.50%) aligned concordantly 0 times
    30919684 (28.36%) aligned concordantly exactly 1 time
    25230298 (23.14%) aligned concordantly >1 times
51.50% overall alignment rate
Processing reverse strand hits
Mon Feb 13 04:42:43 2023

Bismark result is as follows:

Final Alignment report
======================
Sequence pairs analysed in total:	109022310
Number of paired-end alignments with a unique best hit:	50998056
Mapping efficiency:	46.8% 
Sequence pairs with no alignments under any condition:	46756311
Sequence pairs did not map uniquely:	11267943
Sequence pairs which were discarded because genomic sequence could not be extracted:	101

Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT:	25516794	((converted) top strand)
GA/CT/CT:	0	(complementary to (converted) top strand)
GA/CT/GA:	0	(complementary to (converted) bottom strand)
CT/GA/GA:	25481161	((converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total:	0

Thank you and looking forward to the anawer.

@yupenghe
Copy link
Owner

it is in the output file. For example:

There are 93996 total input reads
Mon May  3 22:15:54 2021

There are 88982 uniquely mapping reads, 94.6657304566 percent remaining
Mon May  3 22:15:54 2021

@InfiniteLaugh
Copy link
Author

Thank you so much for your answer. May i ask is there any way to output these information to a certain file like Bismark. For I have many sample to analyze, and currently these information were output to one file named nohup.out. Will > sample.name.txt after each line of code works?

@yupenghe
Copy link
Owner

Yes you can do something like METHYLPY COMMAND >sample_name.out.txt 2> sample_name.err.txt to get the output message and output error for each sample.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants