Can't you just provide ranking in a table? #437
Comments
It's sort of hard to rank though – you would have to make some choices about what recall is most important etc |
I understand that, Erik, but if you look anywhere, people doing
benchmarking ultimately provide a leaderboard that acts as a TL;DR, e.g. see the Hugging
Face leaderboard, GPU leaderboard, big-ann-benchmarking leaderboard etc.
That's why I said choose whatever criteria you like, but provide a
leaderboard with an unambiguous ranking. Anyway, just my 2c.
|
The other reason I asked for a leaderboard is that I tried looking at the
graphs. There is so much overlap among the curves that it's hard to read
the graphs and make out which model is better.
|
yeah I hear you – I've been thinking about this for a long time too, I just can't come up with something obvious. But if you have any thoughts let me know! Separately agree on the graphs being too messy. I'll try to clean that up |
One possibility is to compute the AUC (area under the curve) and use that
as the metric for the leaderboard. We can have 2 leaderboards: one for
angular and another for euclidean. Choose any dataset you like.
One thing I don't understand is why the https://github.com/WPJiang/HWTL_SDU-ANNS
library comes out at the top in the graphs. It has only 7 GitHub
stars and it relies on the ngt/qbg libraries to do the work. So why
don't ngt/qbg rank equal to or better than
https://github.com/WPJiang/HWTL_SDU-ANNS? Is it because of parameter tuning?
I just came across this benchmarking suite yesterday, so pardon me if I am
asking noob questions.
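
To make the AUC suggestion concrete, here is a minimal sketch of how such a ranking could be computed. It is not part of the benchmarking suite: the function names, the use of log10(QPS) on the vertical axis, and the sample numbers are all assumptions made for illustration. Each algorithm is represented by a list of (recall, queries-per-second) points, one per parameter setting. The sketch keeps only the Pareto-optimal points and integrates with the trapezoidal rule.

```python
import math


def pareto_frontier(points):
    """Keep only the (recall, qps) points that no other point dominates."""
    frontier = []
    best_qps = float("-inf")
    # Walk from highest recall to lowest; a point survives only if it is
    # strictly faster than every point with higher (or equal) recall.
    for recall, qps in sorted(points, key=lambda p: (-p[0], -p[1])):
        if qps > best_qps:
            frontier.append((recall, qps))
            best_qps = qps
    return sorted(frontier)


def auc_score(points):
    """Trapezoidal area under log10(QPS) as a function of recall."""
    frontier = pareto_frontier(points)
    area = 0.0
    for (r0, q0), (r1, q1) in zip(frontier, frontier[1:]):
        area += (r1 - r0) * (math.log10(q0) + math.log10(q1)) / 2.0
    return area


def leaderboard(results):
    """Rank algorithms by AUC, best first."""
    scores = {name: auc_score(pts) for name, pts in results.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical numbers, NOT real benchmark results.
results = {
    "algo_a": [(0.5, 10000.0), (0.9, 1000.0), (0.99, 100.0)],
    "algo_b": [(0.5, 5000.0), (0.9, 100.0), (0.8, 50.0)],
}

for rank, (name, score) in enumerate(leaderboard(results), start=1):
    print(f"{rank}. {name}: {score:.3f}")
```

One caveat with this sketch: the area depends on the recall range each algorithm's runs happen to cover, so a fair leaderboard would probably need a fixed recall window (say 0.5 to 1.0, again an assumption). That is one of the choices that makes a single ranking hard to define.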
|
ok, I didn't realize that – maybe we should remove HWTL_SDU-ANNS then? we definitely shouldn't include wrappers. I've been quite liberal accepting contributions |
This project is awesome, but I was wondering: couldn't you also provide a table with final rankings? Use whatever criteria you like, but it would be good to have a tabular scorecard.
On another note, it seems the best-performing algorithm is qsgngt, which has a mere 7 stars on GitHub. How does one explain that?