Remove unnecessary pad token #2428

vigneshwaran · 2023-08-12T07:19:30Z

What does this PR do?

Instead of padding every sequence to max_seq_len, pad only to the length of the longest sequence in the batch.

I have provided a simple and concise solution.
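
A minimal sketch of the idea, not the actual composer code. The helper name `pad_to_batch_max`, the use of plain Python lists instead of tensors, right-side padding, and the cap at `max_seq_len` are all assumptions made for illustration.

```python
# Hypothetical helper for illustration; not part of composer.
def pad_to_batch_max(sequences, pad_token_id, max_seq_len):
    """Pad each sequence to the longest length in the batch.

    Assumptions: padding goes on the right, and sequences longer than
    max_seq_len are truncated to max_seq_len.
    """
    # Target length is the batch maximum, never more than max_seq_len.
    target_len = min(max(len(seq) for seq in sequences), max_seq_len)
    padded = []
    for seq in sequences:
        seq = seq[:target_len]
        padded.append(seq + [pad_token_id] * (target_len - len(seq)))
    return padded


batch = [[5, 6, 7], [8, 9], [10]]
padded = pad_to_batch_max(batch, pad_token_id=0, max_seq_len=2048)
# Every row now has length 3 (the batch maximum) instead of 2048.
```

With short evaluation prompts, this keeps the batch width at the longest example rather than the model's full context length.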

What issue(s) does this change relate to?

Fixes #2421 (Fix evaluation code to improve performance)

dakinggg

Thanks for the PR! I'll need to double-check that this works OK with FSDP and produces the same eval results as before this PR. I left one initial comment related to one of the test failures.

dakinggg · 2023-08-19T01:50:40Z

composer/datasets/in_context_learning_evaluation.py

        batch = {
-            'input_ids': torch.stack(inputs),
+            'input_ids': input_ids,


I believe the test error you are getting occurs because input_ids and labels should not be the same tensor: labels gets modified to hold a -100 when the labels are rolled so that they are aligned for the next-token objective.
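
A small sketch of the aliasing problem described above, assuming plain Python lists stand in for tensors. `shift_labels_in_place` is a hypothetical stand-in for the label rolling step, not the real composer or model code. With torch tensors, the equivalent fix would typically be to build labels from a copy such as `input_ids.clone()`.

```python
IGNORE_INDEX = -100  # value written into labels, as mentioned in the comment above


def shift_labels_in_place(labels):
    """Hypothetical stand-in: roll labels left by one and mask the last slot."""
    labels[:-1] = labels[1:]
    labels[-1] = IGNORE_INDEX
    return labels


# Case 1: labels is the same object as input_ids, so the shift corrupts the inputs.
input_ids = [11, 12, 13, 14]
aliased_labels = input_ids
shift_labels_in_place(aliased_labels)
aliased_inputs_after = list(input_ids)  # snapshot of the now-modified inputs

# Case 2: labels is a copy, so input_ids is left untouched.
input_ids = [11, 12, 13, 14]
copied_labels = list(input_ids)
shift_labels_in_place(copied_labels)
```

In the first case the inputs end up containing -100, which is not a valid token id; in the second case only the labels change.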

Remove unnecessary pad token

4435d4b

vigneshwaran requested a review from a team as a code owner August 12, 2023 07:19

Merge branch 'mosaicml:dev' into remove_unnecessary_pad

0173678

vigneshwaran mentioned this pull request Aug 14, 2023

Fix evaluation code to improve performance #2421

Open

mvpatel2000 requested a review from dakinggg August 14, 2023 22:23

dakinggg reviewed Aug 19, 2023

vigneshwaran-nv-10329 added 2 commits September 8, 2023 15:06

Merge branch 'dev' into remove_unnecessary_pad

6a5882f

Merge branch 'remove_unnecessary_pad' of https://github.com/vigneshwa…

8f364f2

…ran/composer into remove_unnecessary_pad
