
Issues introduced by refactoring and other miscellaneous #14

Open
molereddy opened this issue Mar 16, 2024 · 9 comments

Comments


molereddy commented Mar 16, 2024

I wanted to flag some issues that came up after merging with the latest code.

  1. natsort needs to be added to requirements.txt (it became a dependency after the refactor).
  2. For the forgetting command in the README, it should be just python forget.py --config-name=forget.yaml, or that command plus parameters like split, model, and lr.

molereddy commented Mar 16, 2024

Also, in several parts of CustomTrainerForgetting's evaluate(), self.accelerator is called without being wrapped in an if self.is_deepspeed_enabled: check.

I don't understand deepspeed usage and haven't been able to get deepspeed running on my machine, so this might not be an actual issue.
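
A minimal sketch of the guard pattern being suggested, assuming the standard HF Trainer attributes (is_deepspeed_enabled and accelerator both exist on Trainer); what exactly belongs inside the guard depends on the repo's code:

    # hypothetical guard: only touch accelerator/deepspeed-specific state
    # when deepspeed is actually enabled
    if self.is_deepspeed_enabled:
        prepared_model = self.accelerator.prepare(self.model)
    else:
        prepared_model = self.model  # plain single-/multi-GPU path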


molereddy commented Mar 16, 2024

Another issue introduced during the refactor: in dataloader.py, the case of world_size == 1 is not handled, so aggregated_eval_logs remains empty when the world size is 1. This leads to an empty log dump and then an error while extracting forget quality:

  File "/tofu/dataloader.py", line 297, in evaluate
    forget_quality = get_forget_quality(aggregated_eval_logs, retain_result)
  File "/tofu/utils.py", line 144, in get_forget_quality
    unlearn_forget_result = unlearn_result['eval_log_forget.json']
KeyError: 'eval_log_forget.json'

Things worked before this line was commented out:

    aggregated_eval_logs = interleave_eval_result_dict(aggregated_eval_logs, forget_rate, large_bsz=eval_cfg.batch_size, num_processes=world_size)

The world_size == 1 case can be handled in an else block, as sketched below.
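
A minimal sketch of that structure, reusing the names from the snippet above (the else body here is hypothetical; a concrete version appears in a later comment):

    if world_size > 1:
        # multi-process: merge the per-process logs by interleaving
        aggregated_eval_logs = interleave_eval_result_dict(
            aggregated_eval_logs, forget_rate,
            large_bsz=eval_cfg.batch_size, num_processes=world_size)
    else:
        # world_size == 1: each task's log can be used as-is
        aggregated_eval_logs[f'{eval_task}.json'] = eval_logs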

molereddy changed the title from "Minor fixes" to "Issues introduced by refactoring and other miscellaneous" on Mar 16, 2024

molereddy commented Mar 18, 2024

In forget.py, local_rank is introduced on line 44 inside an if os.environ.get('LOCAL_RANK') is not None: block.

It is then referenced outside any such block on line 182: if local_rank == 0:
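
A minimal sketch of one possible fix (assumed, not the repo's actual code): give local_rank a default before the conditional, so the later check can never raise NameError:

    import os

    # default for non-distributed runs; overridden when LOCAL_RANK is set
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    if os.environ.get("LOCAL_RANK") is not None:
        ...  # distributed-only setup from the original block (line 44)

    if local_rank == 0:  # the check at line 182 now always sees a defined name
        ...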


zhilif commented Mar 18, 2024

@molereddy Thank you so much for the reports! We are working on the fix now. I have some comments here:

  1. python forget.py --config-name=forget.yaml just tells hydra to use the forget.yaml config. You can use forget.yaml together with extra args that override its default values (see the override example below).
  2. deepspeed is just what we use for sharding; frankly, we have not tried other sharding frameworks. What type of sharding do you usually use? We could add support for it.
  3. We have mostly abandoned the evaluate() method in dataloader.py, since we found that the HF trainer executes evaluation much more slowly than a direct call. We haven't found the root cause yet; for now we just call python evaluate_util.py.

Let us know if you have other questions!
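
For illustration, hydra overrides use key=value syntax after the config name; the keys follow the split/model/lr parameters mentioned above, but the exact names and values here (e.g. forget10, 1e-5) are assumptions, not necessarily the repo's config names:

    python forget.py --config-name=forget.yaml split=forget10 lr=1e-5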

molereddy commented

I only know of deepspeed as well. It's just that it hasn't been straightforward to get deepspeed set up on the machines I have available. I'm putting that issue on the back burner for now; I will report back with any errors on that front.

molereddy commented

Does the recent addition of an index to the dataset object change anything about the evaluation results, or is it just to make debugging easier?


zhilif commented Mar 29, 2024

Does the recent addition of an index to the dataset object change anything about the evaluation results, or is it just to make debugging easier?

It should make everything easier. Previously the interleaving logic only worked for a very specific batch size and number of GPUs, and we had a bug which caused some data duplication. The error shouldn't change anything from a statistical point of view, but it may alter the numbers. I would stay away from interleaving from now on.
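
A minimal sketch (hypothetical, not the repo's code) of why a per-example index helps: a distributed sampler pads ranks to equal length by repeating examples, and carrying the index through lets the gathered results be deduplicated without any interleaving logic:

    from torch.utils.data import Dataset

    class IndexedDataset(Dataset):
        """Wraps a dataset so each item also reports its own index."""
        def __init__(self, base):
            self.base = base
        def __len__(self):
            return len(self.base)
        def __getitem__(self, idx):
            item = self.base[idx]  # assumed to be a dict of fields
            return {**item, "index": idx}

    # toy gathered output: rank padding duplicated the example with index 0
    gathered_results = [{"index": 0, "loss": 1.2},
                        {"index": 1, "loss": 0.9},
                        {"index": 0, "loss": 1.2}]
    seen, deduped = set(), []
    for row in gathered_results:
        if row["index"] not in seen:
            seen.add(row["index"])
            deduped.append(row)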


molereddy commented Apr 24, 2024

I understand it is currently deprecated for being slow, but updating the evaluate() method would be very helpful.

  1. We have mostly abandoned the evaluate() method in dataloader.py, since we found that the HF trainer executes evaluation much more slowly than a direct call. We haven't found the root cause yet; for now we just call python evaluate_util.py.

The evaluate() in dataloader.py is quite convenient: you get results for each epoch directly alongside training, instead of saving a checkpoint per epoch and evaluating them all after training. This is helpful when we run many experiments and don't have the space to store all the checkpoints before evaluating.

I might have missed it if this evaluation functionality already exists.

I think this fix would take care of the world_size == 1 bug in dataloader's evaluate():

                else:
                    # world_size == 1: load each task's log directly and aggregate it
                    eval_log_path = os.path.join(curr_save_dir, f"{eval_task}.json")
                    with open(eval_log_path) as f:
                        eval_logs = json.load(f)
                    aggregated_eval_logs[f'{eval_task}.json'] = eval_logs
                    # rewrite the same file, pretty-printed
                    with open(eval_log_path, "w") as f:
                        json.dump(eval_logs, f, indent=4)

molereddy commented

Also, a reminder to fix the local_rank bug in forget.py mentioned in the earlier comment: #14 (comment)
