
How to support new ICL task types in own codebase #848

Open · sanjari-orb opened this issue Jan 9, 2024 · 9 comments

@sanjari-orb

Hi, I want to add a custom ICL task type that corresponds to a new ICL metric in my codebase. Currently we maintain our own patch of the training entry point file [link].

I was considering monkey-patching the code that maps the new task type to a new metric for this method. However, I will also have to add support for a new DataLoader corresponding to the new ICL task.

Is there any recommendation on how to support new ICL task types? I would prefer not to maintain my own fork of the MosaicML repo if possible.

@dakinggg (Collaborator) commented Jan 9, 2024

Hey @sanjari-orb, admittedly this is not the most straightforward thing to do at the moment. We're working on some refactors that would make it easier.

You will need to make your own copy of train.py, but I don't think you need to fork anything else.

I think this is what is required:
(1) Copy over train.py.
(2) Add a line to add your metric to the model's eval metrics (see the sketch below).
(3) Copy the ICL evaluator building function into your modified train.py, and add code to create your Evaluator. To avoid a fork of Composer, you may need to copy a bit of code from Composer into that function as well.
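A minimal sketch of step (2), assuming the metric is a torchmetrics Metric; the class MyICLTaskMetric and the eval_metrics attribute access are assumptions, not an official llm-foundry API:

import torch
from torchmetrics import Metric


class MyICLTaskMetric(Metric):
    """Hypothetical metric for the new ICL task: fraction of exact matches."""

    def __init__(self):
        super().__init__()
        self.add_state('correct', default=torch.tensor(0.0), dist_reduce_fx='sum')
        self.add_state('total', default=torch.tensor(0.0), dist_reduce_fx='sum')

    def update(self, preds: torch.Tensor, labels: torch.Tensor) -> None:
        self.correct += (preds == labels).sum()
        self.total += labels.numel()

    def compute(self) -> torch.Tensor:
        return self.correct / self.total


# In the copied train.py, after the model is built and before the Trainer is
# constructed, register the metric under the name your Evaluator will reference.
# The dict attribute below is an assumption; check how your model class stores
# its eval metrics.
model.eval_metrics['MyICLTaskMetric'] = MyICLTaskMetric()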

@sanjari-orb (Author) commented Jan 9, 2024

However, I want to be able to specify a suite of evaluation tasks, like tasks_light.yaml, and add sections like

  # In-context learning evaluation
  icl_tasks:  <>
  icl_subset_num_batches: 100 # -1, or omit this key entirely, to evaluate on all batches
  eval_gauntlet: '<>'
  icl_seq_len: 1024

to my mcli finetuning YAML. Thus, I would like to reuse the preexisting datasets of tasks_light.yaml and add my own to them. I need third-party, dataset-specific evaluations (e.g., jeopardy, wikipedia), and this data can look very different from the train/validation split of the training loop. My understanding is that the model's eval metrics will not work on anything except the validation split provided to the trainer.
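
For reference, a hypothetical entry appended to a copy of tasks_light.yaml might look like the following (the label, dataset path, task type, and metric name are all made up):

  icl_tasks:
  - label: my_custom_task
    dataset_uri: eval/local_data/my_custom_task.jsonl  # hypothetical dataset path
    num_fewshot: [0, 5]
    icl_task_type: my_new_task_type                    # the new task type
    metric_names: ['MyICLTaskMetric']                   # the custom metric to compute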

@dakinggg (Collaborator) commented Jan 9, 2024

The eval metrics on the model are used for all Evaluators, and the ICL tasks are Evaluators.

@sanjari-orb (Author) commented Jan 9, 2024

But how will the evaluator know that the new metric I introduce should only be computed on the data corresponding to my new task, and not on all the ICL tasks? I thought that was the kind of mapping the following code was doing:

def _validate_cfg(icl_cfg: DictConfig):
    assert 'label' in icl_cfg
    assert 'dataset_uri' in icl_cfg and icl_cfg.dataset_uri is not None
    assert 'icl_task_type' in icl_cfg
    assert 'num_fewshot' in icl_cfg

    if 'metric_names' not in icl_cfg:
        if icl_cfg.icl_task_type == 'language_modeling':
            icl_cfg.metric_names = ['InContextLearningLMAccuracy']
        elif icl_cfg.icl_task_type == 'multiple_choice':
            icl_cfg.metric_names = [
                'InContextLearningMultipleChoiceAccuracy'
            ]
        elif icl_cfg.icl_task_type == 'schema':
            icl_cfg.metric_names = [
                'InContextLearningMultipleChoiceAccuracy'
            ]
        elif icl_cfg.icl_task_type == 'question_answering':
            icl_cfg.metric_names = ['InContextLearningQAAccuracy']
        elif icl_cfg.icl_task_type == 'code_evaluation':
            icl_cfg.metric_names = ['InContextLearningCodeEvalAccuracy']
        else:
            raise ValueError(
                f'No metric_names defined, unable to build default metrics for icl_task_type={icl_cfg.icl_task_type}.'
            )

@dakinggg (Collaborator) commented Jan 9, 2024

Ah, in (3) from my original message, you'll be constructing the evaluator yourself, so you can pass the metric name you want.
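
A minimal sketch of that, assuming a dataloader has been built for the custom task (build_my_icl_dataloader and the other names below are placeholders, not llm-foundry functions):

from composer.core import Evaluator

# Dataloader for the new ICL task, built by the code adapted in step (3).
my_dataloader = build_my_icl_dataloader(dataset_uri='...', num_fewshot=0)

# metric_names controls which of the model's eval metrics are computed on this
# Evaluator's data, so the custom metric only runs on the custom task.
my_evaluator = Evaluator(
    label='my_custom_task/0-shot',
    dataloader=my_dataloader,
    metric_names=['MyICLTaskMetric'],
)

# Pass my_evaluator to the Trainer's eval_dataloader alongside the evaluators
# built for the standard ICL tasks.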

@sanjari-orb (Author)

I see, got it. I was patching the evaluator building code but wanted to verify whether there was another way. Thank you for confirming!

@sanjari-orb (Author)

"We're working on some refactors that would make it easier."

Is there any ETA on when that refactoring will be merged, by the way? It would be great to have native support for adding new kinds of evaluation tasks.

@dakinggg (Collaborator) commented Jan 9, 2024

Yeah, I was just suggesting that instead of monkey-patching the library itself, you copy the relevant code into your train.py script, so that you only have to make changes to the launcher script.

Something like:

def custom_build_icl_evaluators(icl_tasks, tokenizer, **kwargs):
    evaluators, logger_keys = [], []
    for icl_cfg in icl_tasks:
        if icl_cfg.label in my_task_names:
            # call your custom code
            evaluators.extend(build_my_custom_evaluators(icl_cfg, tokenizer))
        else:
            # call the original function from the library (build_icl_evaluators)
            evs, keys = build_icl_evaluators([icl_cfg], tokenizer, **kwargs)
            evaluators.extend(evs)
            logger_keys.extend(keys)
    return evaluators, logger_keys
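
In the copied train.py, the call site would then use the wrapper instead of the library builder; the keyword and config names below are assumptions based on the existing builder:

evaluators, logger_keys = custom_build_icl_evaluators(
    icl_tasks_cfg,                                  # the icl_tasks section of the YAML
    tokenizer,
    default_max_seq_len=cfg.icl_seq_len,            # assumed config keys
    default_batch_size=cfg.device_eval_batch_size,
)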

And sorry, I can't give an ETA on that right now.

@sanjari-orb (Author)

Understood. Thanks a lot!
