Issues: ScandEval/ScandEval

Issues list

[BENCHMARK DATASET REQUEST] NorBench benchmark dataset request Request to add a new benchmark dataset
#435 opened May 15, 2024 by Mikeriess
1 of 8 tasks
[MODEL EVALUATION REQUEST] lightonai/alfred-40b-1023 model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#434 opened May 9, 2024 by TheLounger
4 of 8 tasks
[BUG] Generation does not terminate on single newline bug Something isn't working
#432 opened May 6, 2024 by iPieter
[MODEL EVALUATION REQUEST] Phi-3-mini-4k-instruct-dansk model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#427 opened May 2, 2024 by emillykkejensen
1 of 8 tasks
[BENCHMARK DATASET REQUEST] dutch-cola benchmark dataset request Request to add a new benchmark dataset
#419 opened Apr 24, 2024 by BramVanroy
1 of 8 tasks
[FEATURE REQUEST] Support seq-to-seq architectures enhancement New feature or request
#418 opened Apr 24, 2024 by saattrupdan
[BUG] Outlines version clash with vLLM bug Something isn't working
#414 opened Apr 23, 2024 by saattrupdan
[MODEL EVALUATION REQUEST] allenai/OLMo-1.7-7B-hf model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#407 opened Apr 19, 2024 by saattrupdan
8 tasks done
Add human evaluations model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#395 opened Apr 16, 2024 by saattrupdan
1 of 57 tasks
Support Czech benchmark dataset request Request to add a new benchmark dataset
#393 opened Apr 16, 2024 by saattrupdan
[BENCHMARK DATASET REQUEST] Einbuergerungstest benchmark dataset request Request to add a new benchmark dataset
#392 opened Apr 15, 2024 by saattrupdan
1 of 8 tasks
[BENCHMARK DATASET REQUEST] Inburgeringsexamen benchmark dataset request Request to add a new benchmark dataset
#391 opened Apr 15, 2024 by saattrupdan
1 of 8 tasks
[MODEL EVALUATION REQUEST] intfloat/multilingual-e5-large-instruct model evaluation request Request to evaluate a model and add it to the leaderboard(s) small model (<7B) This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#387 opened Apr 14, 2024 by KennethEnevoldsen
8 tasks done
Support Spanish benchmark dataset request Request to add a new benchmark dataset
#386 opened Apr 12, 2024 by saattrupdan
Support French benchmark dataset request Request to add a new benchmark dataset
#385 opened Apr 12, 2024 by saattrupdan
[MODEL EVALUATION REQUEST] TheBloke/Llama-2-13B-Chat-Dutch-AWQ large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#369 opened Apr 2, 2024 by saattrupdan
1 of 8 tasks
[MODEL EVALUATION REQUEST] TheBloke/mixtral-8x7b-v0.1-AWQ large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#368 opened Apr 2, 2024 by saattrupdan
8 tasks done
[MODEL EVALUATION REQUEST] Jamba large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#359 opened Mar 29, 2024 by KennethEnevoldsen
8 tasks done
[FEATURE REQUEST] Add finetuning of generative models enhancement New feature or request
#350 opened Mar 22, 2024 by saattrupdan
[FEATURE REQUEST] Include errors in logs enhancement New feature or request good first issue Good for newcomers
#349 opened Mar 22, 2024 by saattrupdan
[MODEL EVALUATION REQUEST] VAGOsolutions/SauerkrautLM-Gemma-7b large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#333 opened Mar 21, 2024 by saattrupdan
2 of 8 tasks
[MODEL EVALUATION REQUEST] HPLT/gpt-33b-nordic-prerelease large model (>7B) This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate. model evaluation request Request to evaluate a model and add it to the leaderboard(s)
#307 opened Mar 21, 2024 by saattrupdan
3 of 8 tasks