
How to add new tokens to the vocabulary and only finetune these embeddings? #383

Open
CongHan0808 opened this issue Mar 5, 2024 · 0 comments

Following #155, I added 100 new tokens to the vocabulary along with their corresponding embeddings. I want to finetune only these new embeddings while keeping the original tokens fixed at their pretrained weights. Here is my code:

model_engine.backward(total_loss)
if args.nums_token and args.mulgpu_numtoken and args.token_detach:
    # Mask intended to be 1 for the rows of the newly added tokens and 0 for the
    # pretrained rows; the line that sets the new-token rows is currently commented
    # out, so the mask is all zeros here.
    textembeds_masks = torch.zeros_like(model_engine.in_adaptor.text_embed.weight).to(device=model_engine.local_rank)
    # textembeds_masks[VOCAB_SIZE_SRC + 1, :] = 1
    with torch.no_grad():
        for p_name, param in model_engine.named_parameters():
            # Zero out the gradients of the pretrained rows in both embedding tables.
            if "llm_model.base_model.model.model.embed_tokens.weight" in p_name:
                if param.grad is not None:
                    param.grad.copy_(param.grad.data * textembeds_masks)
            if "in_adaptor.text_embed.weight" in p_name:
                if param.grad is not None:
                    param.grad.copy_(param.grad.data * textembeds_masks)
model_engine.step()

in_adaptor.text_embed.weight is initialized from llm_model.base_model.model.model.embed_tokens.weight. After training for a while, the weights of the original (raw) tokens in in_adaptor.text_embed.weight differ between checkpoints. How should I change my code so that the original tokens' weights stay unchanged and only the new tokens' embeddings are finetuned?
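For reference, a minimal sketch of one common workaround in plain PyTorch (not tested with DeepSpeed/ZeRO, where param.grad may not be the buffer the optimizer actually reads). Masking gradients alone may not be enough: optimizers such as Adam keep momentum and can apply weight decay, so rows may still move even when their gradient is zero. The sketch therefore also copies the pretrained rows back after every optimizer step. The names freeze_original_rows, restore_rows, embed, and vocab_size_src below are illustrative, not part of this project; vocab_size_src plays the role of VOCAB_SIZE_SRC in the snippet above.

import torch

def freeze_original_rows(embed: torch.nn.Embedding, vocab_size_src: int):
    # Sketch only: assumes rows 0..vocab_size_src-1 hold the pretrained vectors
    # and the newly added tokens occupy the remaining rows.
    weight = embed.weight

    # 1) Zero the gradient of the pretrained rows on every backward pass.
    def mask_grad(grad):
        grad = grad.clone()
        grad[:vocab_size_src] = 0
        return grad
    weight.register_hook(mask_grad)

    # 2) Keep a copy of the pretrained rows and restore them after each optimizer
    #    step, so weight decay / momentum cannot move them between checkpoints.
    frozen_rows = weight.data[:vocab_size_src].clone()

    def restore_rows():
        with torch.no_grad():
            weight.data[:vocab_size_src] = frozen_rows
    return restore_rows

# Hypothetical usage in a training loop:
# restore_rows = freeze_original_rows(model.in_adaptor.text_embed, VOCAB_SIZE_SRC)
# loss.backward()
# optimizer.step()
# restore_rows()  # snap the pretrained rows back after every step

With this approach the gradient hook keeps the new rows training normally, while the explicit restore guarantees the pretrained rows are bit-identical across checkpoints regardless of what the optimizer does.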
