Experimental Tag for benchmarks #171

Open
wants to merge 4 commits into
base: master

Conversation

nv-rborkar
Contributor

@nv-rborkar nv-rborkar commented Apr 2, 2024

The Agile tag is a way for MLPerf to stay agile and make early bets on upcoming hot benchmarks.
It allows adopting viral benchmarks while also providing a way to update, refresh, or tweak them as the ML landscape changes quickly, instead of getting locked in to a 2-year cadence.

Ideally, all benchmarks should have the agility to refresh if the landscape warrants faster change (e.g., updating the sequence length of existing LLM models or tweaking outdated architectures), but we should also balance the churn to reduce submitter burden and prolong investment !/$.

@nv-rborkar nv-rborkar requested a review from a team as a code owner April 2, 2024 21:02

github-actions bot commented Apr 2, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

Also added a minimum lifetime of 1 year for agile benchmarks, based on discussion in the WG.
@nv-rborkar
Contributor Author

4/25: The Training WG agrees with this proposal.

@itayhubara

itayhubara commented May 9, 2024

Two comments:

  1. I cannot find the "expected lifetime of 4 rounds" rule anywhere. @nv-rborkar, can you please point me to that line?
  2. I believe that changing the rule to allow one agile (instead of two) per round should be enough (we don't usually have two "real agile" models that we wish to add per year).
