Add metric_for_best_model="loss" as default in interface and add note on default metric in model card #497

Open
MoritzLaurer opened this issue Feb 9, 2024 · 10 comments

Comments

@MoritzLaurer

Feature Request

  1. I'd suggest adding the argument metric_for_best_model="loss" as a default value in the hyperparameter interface, to make it clear that the default is the loss and to let people change it easily.

  2. I'd suggest adding an explicit note to the automatically generated model card stating which metric was used to choose the final model that's uploaded to the Hub. This makes sure that users with a less technical background, or who didn't check the logs, understand that the uploaded model is only the one with the lowest loss and might not be the most accurate/performant one.
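To make the second suggestion concrete, here is a minimal sketch of what such a note could look like. The helper name `selection_note` and the wording of the note are hypothetical; nothing here is taken from autotrain's actual model card template.

```python
def selection_note(metric: str, greater_is_better: bool) -> str:
    """Build a one-sentence model card note about checkpoint selection.

    Hypothetical helper for illustration; not part of autotrain.
    """
    direction = "highest" if greater_is_better else "lowest"
    return (
        f"The uploaded checkpoint is the one with the {direction} validation "
        f"`{metric}`; other checkpoints from the same run may score better "
        "on other metrics."
    )


# With the current default (loss), lower is better.
note = selection_note("loss", greater_is_better=False)
print(note)
```

A single sentence like this in the card would probably be enough, since the goal is only to flag the selection criterion to people who never open the logs.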

Motivation

I understand that loss is a good default given the many different possible tasks and models. At the same time, there are many tasks (like classification) where loss should not be the metric used to choose the model. I'm afraid that many users will not make the effort of looking into the logs to see that autotrain might actually have trained a better model on the relevant metrics. The same goes for the model cards: explicitly stating which metric was used to select the model makes people aware that autotrain might have produced other checkpoints with better scores on metrics other than loss.
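A small illustration of the concern, using made-up per-epoch validation numbers (they are not from any real run) and a hypothetical `select_best` helper: the checkpoint with the lowest loss and the checkpoint with the highest f1_macro can be different epochs.

```python
# Made-up per-epoch validation results, for illustration only.
history = [
    {"epoch": 1, "loss": 0.52, "f1_macro": 0.71},
    {"epoch": 2, "loss": 0.41, "f1_macro": 0.78},
    {"epoch": 3, "loss": 0.47, "f1_macro": 0.83},
]


def select_best(history, metric, greater_is_better):
    """Return the row that is best according to `metric`."""
    pick = max if greater_is_better else min
    return pick(history, key=lambda row: row[metric])


best_by_loss = select_best(history, "loss", greater_is_better=False)
best_by_f1 = select_best(history, "f1_macro", greater_is_better=True)
print(best_by_loss["epoch"], best_by_f1["epoch"])
```

In this toy history, selecting on loss keeps epoch 2 while selecting on f1_macro keeps epoch 3, which is the situation a user would only notice by reading the logs.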

Additional Context

No response

@MoritzLaurer
Author

FYI: I just did another training run and manually specified metric_for_best_model="f1_macro", and for some reason it still selected the model with the lowest loss. I'm not really sure why. Here are the training parameters I put in the UI:

{
  "lr": 2e-5,
  "epochs": 10,
  "max_seq_length": 256,
  "metric_for_best_model": "f1_macro",
  "batch_size": 16,
  "warmup_ratio": 0.1,
  "gradient_accumulation": 1,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0,
  "max_grad_norm": 1,
  "seed": 42,
  "logging_steps": -1,
  "auto_find_batch_size": false,
  "mixed_precision": "fp16",
  "save_strategy": "epoch",
  "save_total_limit": 2,
  "evaluation_strategy": "epoch"
}
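One thing worth ruling out when a custom selection metric seems to be ignored: a name like "f1_macro" can only be used for selection if the evaluation step reports a value under that exact key (the transformers Trainer, for example, looks the name up in the evaluation metrics, with or without the "eval_" prefix). The sketch below is a plain-Python macro F1 and a hypothetical `compute_metrics` function that exposes it under that key. It is an assumption about the general mechanism, not a description of how autotrain computes or names its metrics.

```python
def f1_macro(labels, preds):
    """Unweighted mean of per-class F1 over all classes seen in labels or preds."""
    classes = sorted(set(labels) | set(preds))
    scores = []
    for c in classes:
        tp = sum(1 for y, p in zip(labels, preds) if y == c and p == c)
        fp = sum(1 for y, p in zip(labels, preds) if y != c and p == c)
        fn = sum(1 for y, p in zip(labels, preds) if y == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def compute_metrics(labels, preds):
    """Hypothetical metrics function: the key must match the selection metric name."""
    return {"f1_macro": f1_macro(labels, preds)}


metrics = compute_metrics([0, 0, 1, 1], [0, 1, 1, 1])
print(metrics)
```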

@abhishekkrthakur
Member

Params that are not available in the backend cannot be used. I can work on adding this next week :)
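To illustrate the kind of behavior described here, below is a minimal sketch of a backend that keeps only the parameters it knows about. `BackendParams` and `build_params` are hypothetical names with a made-up field list, and this is only a guess at the mechanism, not autotrain's code. Returning the ignored keys is one way the UI could warn the user instead of dropping them silently.

```python
from dataclasses import dataclass, fields


@dataclass
class BackendParams:
    """Hypothetical parameter class; not autotrain's real one."""
    lr: float = 5e-5
    epochs: int = 3
    batch_size: int = 8


def build_params(ui_params: dict):
    """Keep only keys the backend knows and report the rest."""
    known = {f.name for f in fields(BackendParams)}
    ignored = sorted(set(ui_params) - known)
    kept = {k: v for k, v in ui_params.items() if k in known}
    return BackendParams(**kept), ignored


params, ignored = build_params(
    {"lr": 2e-5, "epochs": 10, "metric_for_best_model": "f1_macro"}
)
print(params, ignored)
```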

github-actions bot commented Mar 1, 2024

This issue is stale because it has been open for 15 days with no activity.

@github-actions github-actions bot added the stale label Mar 1, 2024
@geegee4iee

Hi @abhishekkrthakur, is there any update on this?

@abhishekkrthakur
Member

Hopefully in the next release

@github-actions github-actions bot removed the stale label Mar 5, 2024
@github-actions github-actions bot added the stale label Mar 25, 2024

github-actions bot commented Apr 4, 2024

This issue was closed because it has been inactive for 2 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 4, 2024
@MoritzLaurer
Author

reopening this, but no time-pressure / immediate need from my side @abhishekkrthakur

@MoritzLaurer MoritzLaurer reopened this Apr 4, 2024
@github-actions github-actions bot removed the stale label Apr 5, 2024
@github-actions github-actions bot added the stale label Apr 26, 2024
@abhishekkrthakur
Member

open

@github-actions github-actions bot removed the stale label Apr 27, 2024