sequence paired identity large than 100% #35

Open
Chenglin20170390 opened this issue Sep 17, 2022 · 2 comments
Chenglin20170390 commented Sep 17, 2022

Hi, I tested with the following data, but the sequence identities in the matrix are larger than 1. I don't know how to explain the result...
./famsa -dist_export -pid -square_matrix test.fa pid.csv
`>P
MMMMMRRRRR
>T
MMMRRRRRRR
>E
RRRRRRRRRR
>F
RRRRRRRRRR`

Output:

,P,T,E,F
P,10000.000000,2.000000,0.500000,0.500000
T,2.000000,10000.000000,1.166667,1.166667
E,0.500000,1.166667,10000.000000,10000.000000
F,0.500000,1.166667,10000.000000,10000.000000
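
The problem in the matrix above can be checked mechanically. The following is a minimal sketch, assuming the exported file has the same layout as the output pasted here (an empty first header cell, then one named row per sequence). The function name `out_of_range_entries` is hypothetical and is not part of FAMSA.

```python
import csv
import io


def out_of_range_entries(csv_text, low=0.0, high=1.0):
    """Return (row_name, col_name, value) for every matrix cell outside [low, high]."""
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    col_names = rows[0][1:]  # the first header cell is empty
    bad = []
    for row in rows[1:]:
        row_name, cells = row[0], row[1:]
        for col_name, cell in zip(col_names, cells):
            value = float(cell)
            if not (low <= value <= high):
                bad.append((row_name, col_name, value))
    return bad


# The matrix reported in this issue, copied verbatim.
PID_CSV = """
,P,T,E,F
P,10000.000000,2.000000,0.500000,0.500000
T,2.000000,10000.000000,1.166667,1.166667
E,0.500000,1.166667,10000.000000,10000.000000
F,0.500000,1.166667,10000.000000,10000.000000
"""

for row_name, col_name, value in out_of_range_entries(PID_CSV):
    print(f"{row_name} vs {col_name}: {value} is not a valid identity")
```

Only the P/E and P/F cells (0.5) pass the check. Every other cell, including the diagonal and the identical pair E/F, is above 1.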
agudys self-assigned this Sep 21, 2022
agudys (Member) commented Sep 21, 2022

@Chenglin20170390
Thank you for reporting the issue. Indeed, in -pid mode the matrix contains the inverse of the dissimilarities, which may result in such strange values. We will fix it in the next release to make sure that identities fall within the [0, 1] interval.

agudys added a commit that referenced this issue Oct 5, 2022
…matching residues divided by the shorter sequence length (#35)
agudys (Member) commented Oct 5, 2022

@Chenglin20170390
In the latest version (2.2.1), pairwise identity is calculated as the number of matching residues divided by the length of the shorter sequence. Please let me know if everything works as expected.

Adam
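
For readers who want to sanity-check values against the definition above, here is a minimal sketch of "matching residues divided by the length of the shorter sequence". It assumes the two sequences are already aligned rows of equal length with `-` as the gap character, and that a match is an identical non-gap residue in the same column. How FAMSA counts matching residues internally is not stated in this thread, so the function `pairwise_identity` is a hypothetical illustration, not FAMSA's implementation.

```python
def pairwise_identity(aligned_a, aligned_b, gap="-"):
    """Matching residues divided by the ungapped length of the shorter sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")

    # A column counts as a match only if both rows hold the same non-gap residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Sequence lengths are measured without gap characters.
    len_a = len(aligned_a) - aligned_a.count(gap)
    len_b = len(aligned_b) - aligned_b.count(gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        return 0.0  # assumption: an empty sequence has identity 0 with anything
    return matches / shorter


# Sequences from the original report (equal length, so no gaps are needed).
seqs = {
    "P": "MMMMMRRRRR",
    "T": "MMMRRRRRRR",
    "E": "RRRRRRRRRR",
    "F": "RRRRRRRRRR",
}

for a in seqs:
    print(a, [round(pairwise_identity(seqs[a], seqs[b]), 3) for b in seqs])
```

Because the number of matches can never exceed the length of the shorter sequence, the result always lies in [0, 1]. The numbers printed here come from this sketch under the stated assumptions and are not output from FAMSA 2.2.1.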
