Mapq #7

Open
jelber2 opened this issue Sep 14, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@jelber2

jelber2 commented Sep 14, 2022

For proteins mapping to multiple contigs/chromosomes, how might one deduce the equivalent of mapping quality with miniprot? My guess is that one could use the AS and as scores, although I am seeing ms rather than as in the resulting PAF files.

+----+------+---------------------------------------------------+
|Tag | Type |                    Description                    |
+----+------+---------------------------------------------------+
| AS |  i   | Alignment score from dynamic programming          |
| as |  i   | Alignment score excluding introns                 |
| np |  i   | Number of amino acid matches with positive scores |
| da |  i   | Distance to the nearest start codon               |
| do |  i   | Distance to the nearest stop codon                |
| cg |  Z   | Protein CIGAR                                     |
| cs |  Z   | Difference string                                 |
+----+------+---------------------------------------------------+
@lh3 lh3 added the enhancement New feature or request label Sep 14, 2022
@lh3
Owner

lh3 commented Sep 14, 2022

I will add mapping quality in the future. Miniprot doesn't have it now because mapping quality is not very important for cross-species alignment.

The "as" tag in the manpage has been renamed to "ms". It is roughly equivalent to the ms tag reported by minimap2. Please use this tag to estimate mapping uniqueness. AS sometimes favors pseudogenes.
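
As a rough illustration of estimating uniqueness from ms (a sketch, not something miniprot itself provides), the snippet below groups PAF records by protein name and compares the best ms score with the second-best one. The function names and the second-best/best ratio are hypothetical choices made for this example; the only assumptions about the input are the standard PAF layout (protein name in column 1, optional tags from column 13 onward) and the ms:i: tag described above.

```python
from collections import defaultdict


def parse_paf_tags(fields):
    """Return the optional TAG:TYPE:VALUE fields of one PAF record as a dict."""
    tags = {}
    for field in fields[12:]:
        parts = field.split(":", 2)
        if len(parts) != 3:
            continue  # skip anything that is not a well-formed tag
        key, typ, value = parts
        tags[key] = int(value) if typ == "i" else value
    return tags


def ms_uniqueness(paf_lines):
    """Map each protein name to (best_ms, second_best_ms, ratio).

    ratio = second_best_ms / best_ms. This is a hypothetical heuristic:
    values near 0 suggest a unique hit, values near 1 an ambiguous one.
    """
    scores = defaultdict(list)
    for line in paf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment-style lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete PAF record
        tags = parse_paf_tags(fields)
        if "ms" in tags:
            scores[fields[0]].append(tags["ms"])

    result = {}
    for name, values in scores.items():
        values.sort(reverse=True)
        best = values[0]
        second = values[1] if len(values) > 1 else 0
        # A non-positive best score gives no evidence of uniqueness.
        ratio = second / best if best > 0 else 1.0
        result[name] = (best, second, ratio)
    return result
```

Calling `ms_uniqueness(open("aln.paf"))` (the file name is a placeholder) returns one tuple per protein. Where to draw the line between "unique" and "ambiguous" is left to the user, since no threshold is suggested in this thread.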

@jelber2
Author

jelber2 commented Sep 14, 2022

Thank you!

@jelber2 jelber2 closed this as completed Sep 14, 2022
@lh3
Owner

lh3 commented Sep 14, 2022

I will keep this issue open as a reminder to myself. BTW, I have just updated the manpage to replace "as" with "ms".

@lh3 lh3 reopened this Sep 14, 2022
@conchoecia

Just wanted to join in to say MAPQ would be a very nice addition. For example, I am working with sponges and have ~50 sponge transcriptomes that I am mapping to a new species that I am trying to annotate. For each locus in the genome, it would be nice to be able to filter out poor matches based on MAPQ in the PAF line. Thanks for writing this nice piece of software, @lh3. I had been using a tblastn pipeline to perform a similar function before this.

@lh3
Owner

lh3 commented Sep 17, 2022

MAPQ won't be very useful for filtering poor matches. You should look at score, identity and positives instead.
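
A minimal sketch of such a filter is shown below. It is not part of miniprot, and it rests on several assumptions: the score is taken from the ms tag, identity is computed as PAF column 10 divided by column 11 (matching residues over alignment length, as in standard PAF), and the positive fraction is the np tag divided by the aligned protein length (column 4 minus column 3). The function names and all thresholds are hypothetical and would need tuning per dataset.

```python
def alignment_metrics(line):
    """Compute score, identity and positive fraction for one PAF line."""
    fields = line.rstrip("\n").split("\t")

    # Collect integer tags (e.g. ms:i:..., np:i:...) from column 13 onward.
    tags = {}
    for field in fields[12:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[1] == "i":
            tags[parts[0]] = int(parts[2])

    q_start, q_end = int(fields[2]), int(fields[3])
    n_match, block_len = int(fields[9]), int(fields[10])
    aligned_aa = q_end - q_start

    return {
        "ms": tags.get("ms", 0),
        # Assumption: column 10 / column 11 approximates identity.
        "identity": n_match / block_len if block_len > 0 else 0.0,
        # Assumption: np counts positive-scoring residues over aligned_aa.
        "positive": tags.get("np", 0) / aligned_aa if aligned_aa > 0 else 0.0,
    }


def keep_alignment(line, min_ms=0, min_identity=0.5, min_positive=0.6):
    """Return True if the alignment passes all three (hypothetical) cutoffs."""
    m = alignment_metrics(line)
    return (
        m["ms"] >= min_ms
        and m["identity"] >= min_identity
        and m["positive"] >= min_positive
    )
```

Applied to a PAF file, `[l for l in open("aln.paf") if l.strip() and not l.startswith("#") and keep_alignment(l)]` would keep only the records that pass; the file name here is a placeholder.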
