Following #155, I added 100 new tokens to the vocabulary, along with the corresponding embeddings. I am trying to finetune only these new embeddings while keeping the raw tokens fixed at their pretrained weights. Here is my code:
```python
model_engine.backward(total_loss)

if args.nums_token and args.mulgpu_numtoken and args.token_detach:
    # Mask over the embedding matrix: rows multiplied by 0 are frozen.
    # Note: as written the mask is all zeros, since the line that would
    # mark the new-token rows as trainable is commented out.
    textembeds_masks = torch.zeros_like(
        model_engine.in_adaptor.text_embed.weight
    ).to(device=model_engine.local_rank)
    # textembeds_masks[VOCAB_SIZE_SRC + 1, :] = 1
    with torch.no_grad():
        for p_name, param in model_engine.named_parameters():
            # Zero out the gradients of the LLM input embeddings ...
            if "llm_model.base_model.model.model.embed_tokens.weight" in p_name:
                if param.grad is not None:
                    param.grad.copy_(param.grad.data * textembeds_masks)
            # ... and of the adaptor's text embeddings.
            if "in_adaptor.text_embed.weight" in p_name:
                if param.grad is not None:
                    param.grad.copy_(param.grad.data * textembeds_masks)

model_engine.step()
```
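As an aside on the snippet above: an alternative to rewriting `param.grad` after `backward()` is a gradient hook, which applies the mask as the gradient is computed, so the optimizer never sees non-zero gradients for the pretrained rows. A minimal sketch with hypothetical sizes, assuming `VOCAB_SIZE_SRC` is the pretrained vocabulary size and the 100 new tokens occupy the rows from that index onward:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: VOCAB_SIZE_SRC pretrained rows plus 100 new ones.
VOCAB_SIZE_SRC, NUM_NEW_TOKENS, EMBED_DIM = 32000, 100, 4096
embed = nn.Embedding(VOCAB_SIZE_SRC + NUM_NEW_TOKENS, EMBED_DIM)

# 1 for the newly added rows, 0 for the pretrained rows. The slice
# [VOCAB_SIZE_SRC:, :] covers all 100 new rows, whereas an index like
# [VOCAB_SIZE_SRC + 1, :] would mark only a single row as trainable.
grad_mask = torch.zeros_like(embed.weight)
grad_mask[VOCAB_SIZE_SRC:, :] = 1.0

# The hook multiplies the gradient by the mask during backward.
embed.weight.register_hook(lambda grad: grad * grad_mask)
```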
`in_adaptor.text_embed.weight` is initialized from `llm_model.base_model.model.model.embed_tokens.weight`. After a few checkpoints, however, the raw tokens' weights in `in_adaptor.text_embed.weight` differ between checkpoints. How should I change my code so that the raw tokens' weights stay fixed and only the new tokens' weights are finetuned?
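One likely reason the pretrained rows drift even though their gradients are zeroed is that optimizers such as Adam/AdamW can still move them through weight decay and accumulated momentum, neither of which depends on the current gradient. A robust workaround is to snapshot the pretrained rows once and copy them back after every `model_engine.step()`. A minimal sketch, using the same hypothetical sizes as above and a stand-in `embed` for `in_adaptor.text_embed`:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: VOCAB_SIZE_SRC pretrained rows plus 100 new ones.
VOCAB_SIZE_SRC, NUM_NEW_TOKENS, EMBED_DIM = 32000, 100, 4096
embed = nn.Embedding(VOCAB_SIZE_SRC + NUM_NEW_TOKENS, EMBED_DIM)

# Snapshot the pretrained rows once, before training starts.
pretrained_rows = embed.weight[:VOCAB_SIZE_SRC].detach().clone()

def restore_frozen_rows():
    # Call after every model_engine.step(): undoes any drift the
    # optimizer introduced on the frozen rows (weight decay, momentum).
    with torch.no_grad():
        embed.weight[:VOCAB_SIZE_SRC].copy_(pretrained_rows)
```

If a ZeRO stage that partitions parameters is in use, the weight may need to be gathered first (e.g. under `deepspeed.zero.GatheredParameters`) before the in-place copy.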