Issues: ScandEval/ScandEval
[BENCHMARK DATASET REQUEST] NorBench
benchmark dataset request
Request to add a new benchmark dataset
#435
opened May 15, 2024 by
Mikeriess
1 of 8 tasks
[MODEL EVALUATION REQUEST] lightonai/alfred-40b-1023
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#434
opened May 9, 2024 by
TheLounger
4 of 8 tasks
[BUG] Generation does not terminate on single newline
bug
Something isn't working
#432
opened May 6, 2024 by
iPieter
[MODEL EVALUATION REQUEST] Phi-3-mini-4k-instruct-dansk
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#427
opened May 2, 2024 by
emillykkejensen
1 of 8 tasks
[BENCHMARK DATASET REQUEST] dutch-cola
benchmark dataset request
Request to add a new benchmark dataset
#419
opened Apr 24, 2024 by
BramVanroy
1 of 8 tasks
[FEATURE REQUEST] Support seq-to-seq architectures
enhancement
New feature or request
#418
opened Apr 24, 2024 by
saattrupdan
[BUG] Outlines version clash with vLLM
bug
Something isn't working
#414
opened Apr 23, 2024 by
saattrupdan
[BUG] Memory leak when benchmarking multiple generative models with multiple GPUs
bug
Something isn't working
#413
opened Apr 23, 2024 by
saattrupdan
[MODEL EVALUATION REQUEST] allenai/OLMo-1.7-7B-hf
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#407
opened Apr 19, 2024 by
saattrupdan
8 tasks done
Add human evaluations
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#395
opened Apr 16, 2024 by
saattrupdan
1 of 57 tasks
Support Czech
benchmark dataset request
Request to add a new benchmark dataset
#393
opened Apr 16, 2024 by
saattrupdan
[BENCHMARK DATASET REQUEST] Einbuergerungstest
benchmark dataset request
Request to add a new benchmark dataset
#392
opened Apr 15, 2024 by
saattrupdan
1 of 8 tasks
[BENCHMARK DATASET REQUEST] Inburgeringsexamen
benchmark dataset request
Request to add a new benchmark dataset
#391
opened Apr 15, 2024 by
saattrupdan
1 of 8 tasks
[BUG] NaNs in model outputs for intfloat/multilingual-e5-large-instruct
bug
Something isn't working
#389
opened Apr 15, 2024 by
saattrupdan
[MODEL EVALUATION REQUEST] intfloat/multilingual-e5-large-instruct
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
small model (<7B)
This model has less than 7B parameters, so can be evaluated on an RTX 4090 GPU or smaller.
#387
opened Apr 14, 2024 by
KennethEnevoldsen
8 tasks done
Support Spanish
benchmark dataset request
Request to add a new benchmark dataset
#386
opened Apr 12, 2024 by
saattrupdan
Support French
benchmark dataset request
Request to add a new benchmark dataset
#385
opened Apr 12, 2024 by
saattrupdan
[MODEL EVALUATION REQUEST] TheBloke/Llama-2-13B-Chat-Dutch-AWQ
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#369
opened Apr 2, 2024 by
saattrupdan
1 of 8 tasks
[MODEL EVALUATION REQUEST] TheBloke/mixtral-8x7b-v0.1-AWQ
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#368
opened Apr 2, 2024 by
saattrupdan
8 tasks done
[MODEL EVALUATION REQUEST] Jamba
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#359
opened Mar 29, 2024 by
KennethEnevoldsen
8 tasks done
[FEATURE REQUEST] Add finetuning of generative models
enhancement
New feature or request
#350
opened Mar 22, 2024 by
saattrupdan
[FEATURE REQUEST] Include errors in logs
enhancement
New feature or request
good first issue
Good for newcomers
#349
opened Mar 22, 2024 by
saattrupdan
[MODEL EVALUATION REQUEST] VAGOsolutions/SauerkrautLM-Gemma-7b
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#333
opened Mar 21, 2024 by
saattrupdan
2 of 8 tasks
[MODEL EVALUATION REQUEST] HPLT/gpt-33b-nordic-prerelease
large model (>7B)
This model has more than 7B parameters, requiring more than an RTX 4090 GPU to evaluate.
model evaluation request
Request to evaluate a model and add it to the leaderboard(s)
#307
opened Mar 21, 2024 by
saattrupdan
3 of 8 tasks