
[BUG] CausalLanguageModeling does not mask last input item #765

Open
sungho-ham opened this issue Jan 5, 2024 · 3 comments
Labels
bug Something isn't working status/needs-triage

Comments

@sungho-ham

Bug description

The CLM masking for the last-item-only case does not mask the last item in the input. As a result, the embedding of the label is used instead of the mask embedding.

I think the following code needs to be fixed:

mask_labels = item_ids != self.padding_idx

Steps/Code to reproduce bug

import torch
from transformers4rec.torch import masking

item_ids = torch.tensor([[1, 2, 0], ])
mask = masking.CausalLanguageModeling(hidden_size=10, train_on_last_item_seq_only=True)
masking_info = mask.compute_masked_targets(item_ids, training=True)
print(masking_info)
# Output: MaskingInfo(schema=tensor([[ True,  True, False]]), targets=tensor([[2, 0, 0]]))

Expected behavior

MaskingInfo(schema=tensor([[ True,  False, False]]), targets=tensor([[2, 0, 0]]))

Environment details

  • Transformers4Rec version: 23.08.00

Additional context
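
For reference, here is a minimal hand-derivation of the expected schema and targets (illustrative only, not the library code; it assumes padding_idx=0 and the next-item shift used for CLM targets):

import torch

item_ids = torch.tensor([[1, 2, 0]])

# targets are the next-item ids; with train_on_last_item_seq_only only the
# last non-padded target should be kept, and the schema should follow from it
targets = torch.cat([item_ids[:, 1:], torch.zeros_like(item_ids[:, :1])], dim=1)
target_mask = targets != 0                     # tensor([[ True, False, False]])

rows = torch.arange(item_ids.size(0))
last_pos = target_mask.sum(dim=1) - 1          # position of the last valid target
last_only = torch.zeros_like(targets)
last_only[rows, last_pos] = targets[rows, last_pos]

schema = last_only != 0
print(schema)     # tensor([[ True, False, False]])
print(last_only)  # tensor([[2, 0, 0]])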

@sungho-ham added the bug and status/needs-triage labels on Jan 5, 2024
@dcy0577

dcy0577 commented Jan 24, 2024

I think this line of code needs to be removed:

mask_labels = item_ids != self.padding_idx

As a solution, just use the mask_labels from predict_all() above.
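
For the example from the issue, the difference between the two schemas looks like this (a quick sketch assuming padding_idx=0; the shifted labels below stand in for what predict_all() returns):

import torch

item_ids = torch.tensor([[1, 2, 0]])
# next-item labels, roughly what predict_all() produces: [[2, 0, 0]]
shifted_labels = torch.cat([item_ids[:, 1:], torch.zeros_like(item_ids[:, :1])], dim=1)

current_schema = item_ids != 0            # tensor([[ True,  True, False]])  exposes the label
predict_all_schema = shifted_labels != 0  # tensor([[ True, False, False]])
print(current_schema)
print(predict_all_schema)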

And I think the reason why the current code somehow works is this part:

# shift sequence of interaction embeddings
pos_emb_inp = inputs[:, :-1]
# Adding a masked item in the sequence to return to the initial sequence.
pos_emb_inp = torch.cat(  # type: ignore
    [
        pos_emb_inp,
        torch.zeros(
            (pos_emb_inp.shape[0], 1, pos_emb_inp.shape[2]),
            dtype=pos_emb_inp.dtype,
        ).to(inputs.device),
    ],
    axis=1,
)
# Replacing the inputs corresponding to padded items with a trainable embedding
pos_emb_inp = torch.where(
    mask_schema.unsqueeze(-1).bool(),
    pos_emb_inp,
    self.masked_item_embedding.to(pos_emb_inp.dtype),
)
return pos_emb_inp

Given an input sequence without padding, [1, 2, 3], the mask schema generated by the current code during evaluation will be [True, True, True], which exposes the last item. However, apply_mask_to_inputs replaces the last item with a 0 embedding, and since the schema is all True, no mask embedding is applied to the input. In this case the 0 embedding sort of plays the role of a mask.
However, when the input has padding, like [1, 2, 3, 0, 0], the current mask schema will be [True, True, True, False, False]. Because the last position is a padding item, apply_mask_to_inputs merely replaces that padding with a 0 embedding. The mask schema then masks only the last two padding items, keeping items 1, 2, 3 visible to the transformer.
I think that's why people encounter issues when testing CLM: if there is always padding in the input data, the evaluation metrics will be unrealistically high.
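
Here is a toy reconstruction of the two cases (not the library code): each "embedding" is a single scalar equal to its item id, and -1 stands in for the trainable mask embedding, so it is easy to see what the transformer ends up seeing.

import torch

def toy_apply_mask_to_inputs(inputs, mask_schema, masked_item_embedding):
    # Same steps as the snippet quoted above, stripped down for illustration.
    pos_emb_inp = inputs[:, :-1]
    pos_emb_inp = torch.cat(
        [pos_emb_inp, torch.zeros((pos_emb_inp.shape[0], 1, pos_emb_inp.shape[2]))],
        dim=1,
    )
    return torch.where(
        mask_schema.unsqueeze(-1).bool(), pos_emb_inp, masked_item_embedding
    )

mask_emb = torch.tensor(-1.0)  # stand-in for the trainable mask embedding

# No padding: schema [True, True, True]; the appended zero column hides item 3.
no_pad = torch.tensor([[[1.0], [2.0], [3.0]]])
print(toy_apply_mask_to_inputs(no_pad, torch.tensor([[True, True, True]]), mask_emb))
# -> [[1.], [2.], [0.]]  the label (item 3) is hidden by the zeros

# With padding: schema [True, True, True, False, False]; the zero column lands on
# a padding slot, so item 3 stays visible while it is also the evaluation target.
padded = torch.tensor([[[1.0], [2.0], [3.0], [0.0], [0.0]]])
print(toy_apply_mask_to_inputs(
    padded, torch.tensor([[True, True, True, False, False]]), mask_emb))
# -> [[1.], [2.], [3.], [-1.], [-1.]]  the label leaks into the input

In the padded case the schema never hides item 3, which matches the inflated evaluation metrics described above.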

@jian-mo

jian-mo commented Jan 30, 2024

I also noticed this bug. After the fix, recall is about 20% lower.

@zhouyu5

zhouyu5 commented Apr 2, 2024

Any further updates? It seems #723 still does not solve this bug.
