
changing eval to include lm_harness args #74

Draft · wants to merge 4 commits into main

Conversation

veekaybee (Member)

What's changing

How to test it

Related Jira Ticket

Additional notes for reviewers

print(f"Received job configuration:\n {config.model_dump_json(indent=2)}")

if config.tracking is not None:
with wandb_init_from_config(
Contributor

We should still use our method wandb_init_from_config here, as it controls the naming convention for job types and the resume argument.

If this is performed outside of the WandbLogger class, it will use the active run internally: https://github.com/EleutherAI/lm-evaluation-harness/blob/b177c82cb3ef4a8f03ff77341970ab9d77ffadb4/lm_eval/logging_utils.py#L99

However, we should remove the parameters=config.evaluator argument since logging the eval parameters is handled internally by the WandbLogger.
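A minimal sketch of the suggested flow, assuming the helper signatures referenced in this PR's diff (wandb_init_from_config, config.tracking) and lm-eval's WandbLogger API; the job_type value and the run_harness_evaluation helper are placeholders:

```python
from lm_eval.logging_utils import WandbLogger

if config.tracking is not None:
    # Our helper controls run naming and the resume argument. Note there is
    # no parameters=config.evaluator here: WandbLogger logs the eval
    # parameters itself.
    with wandb_init_from_config(config.tracking, job_type="evaluation"):
        results = run_harness_evaluation(config)  # hypothetical helper
        # Because a run is already active, WandbLogger reuses it internally
        # instead of calling wandb.init() again.
        wandb_logger = WandbLogger()
        wandb_logger.post_init(results)
        wandb_logger.log_eval_result()
```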

```diff
@@ -82,7 +78,7 @@ def load_harness_model(
     raise ValueError(f"Unexpected model config type: {type(config.model)}")


-def load_and_evaluate(
+def evaluate(
```
Contributor

Can we revert the name here, or choose a different, more descriptive name? There is more happening in this method than just evaluation.

Also, I think the results returned by this method are different from the results expected by wandb_logger.post_init. We may not need the get_numeric_metrics helper method above anymore?
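For reference, a hedged sketch of the shape post_init expects, assuming lm-eval's 0.4.x API where simple_evaluate returns the full results dict; the model and task names are placeholders:

```python
import lm_eval
from lm_eval.logging_utils import WandbLogger
from lm_eval.models.huggingface import HFLM

# post_init expects the full dict returned by simple_evaluate (with keys
# like "results" and "configs"), not a flattened metrics dict such as the
# one get_numeric_metrics produced.
llm = HFLM(pretrained="gpt2")  # placeholder model
results = lm_eval.simple_evaluate(model=llm, tasks=["hellaswag"])

wandb_logger = WandbLogger()
wandb_logger.post_init(results)
wandb_logger.log_eval_result()
```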

Contributor

Edit: I see that you stopped calling get_numeric_metrics, so the helper method above can be deleted.

```diff
@@ -4,6 +4,7 @@
 import torch
 from lm_eval.models.huggingface import HFLM
 from lm_eval.models.openai_completions import OpenaiCompletionsLM
+from lm_eval.logging_utils import WandbLogger
```
Contributor

Do we need a version bump on the lm-eval package to include this? Should we update the pyproject.toml file?
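If a bump is needed, it would presumably look something like the following in pyproject.toml; the minimum version that first ships lm_eval.logging_utils.WandbLogger is an assumption here and should be verified:

```toml
# Hypothetical constraint; verify the first lm-eval release that includes
# lm_eval.logging_utils.WandbLogger before pinning.
[project]
dependencies = [
    "lm-eval>=0.4.1",
]
```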
