Why is trained embedding orthogonal? #53

Open
Peacer68 opened this issue Jun 28, 2023 · 1 comment

Comments

@Peacer68

I loaded the model ema_0.9999_050000.pt that you shared (thanks for sharing), and found that word_embedding.weight is orthogonal, which is weird! This means the trained embedding fails to learn semantic relevance between words; it just pushes words far apart from each other to tolerate generation error.
Here is my code:

import torch

pt_path = '/home/workarea/Diffusion/DiffuSeq-Fork/diffusion_models/diffuseq_qqp_h128_lr0.0001_t2000_sqrt_lossaware_seed102_test_ori20221113-20_27_29/ema_0.9999_050000.pt'
s = torch.load(pt_path, map_location=torch.device('cpu'))
weight = s['word_embedding.weight']
# row-wise softmax over the Gram matrix; trace/N == 1 means every row's
# probability mass sits on the diagonal (self-similarity dominates)
mm = torch.softmax(torch.mm(weight, weight.transpose(0, 1)), dim=-1)
print(mm.trace() / mm.size(0))
# the result is 1!

Could you please explain this phenomenon? Thanks a lot!
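As a side note, a more direct probe of orthogonality (my own sketch, not from the repo) is to drop the softmax and look at the off-diagonal cosine similarities: the softmax trace saturates at 1 whenever self-similarity merely dominates, whereas cosines near zero indicate genuine near-orthogonality. Subsampling the vocabulary keeps the similarity matrix small:

import torch

pt_path = '...'  # same checkpoint path as above
w = torch.load(pt_path, map_location='cpu')['word_embedding.weight']
w = torch.nn.functional.normalize(w, dim=-1)            # unit-norm rows
idx = torch.randperm(w.size(0))[:2000]                  # random vocabulary subsample
cos = w[idx] @ w[idx].T                                 # cosine similarity matrix
cos.fill_diagonal_(0.0)                                 # ignore self-similarity
print(cos.abs().mean().item(), cos.abs().max().item())  # both near 0 for a near-orthogonal table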

@xiaotingxuan

Hi, I am also confused about that. When I visualize the trained embedding and the BERT embedding, I find they are very different.
Here is the trained embedding; it looks like a Gaussian distribution (when I visualize Gaussian noise, it looks very similar to the following picture):
[image: visualization of the trained DiffuSeq embedding]

Here is the BERT embedding (loaded from pretrained bert-base-uncased):

[image: visualization of the pretrained BERT embedding]
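If anyone wants to reproduce the comparison, here is a minimal sketch (mine, not the code behind the screenshots; it assumes matplotlib and transformers are installed) that overlays histograms of the raw values of both embedding tables:

import torch
import matplotlib.pyplot as plt
from transformers import BertModel

pt_path = '...'  # DiffuSeq checkpoint as above
trained = torch.load(pt_path, map_location='cpu')['word_embedding.weight'].flatten()

bert = BertModel.from_pretrained('bert-base-uncased')
pretrained = bert.embeddings.word_embeddings.weight.detach().flatten()

# overlay the two value distributions
plt.hist(trained.numpy(), bins=200, density=True, alpha=0.5, label='trained DiffuSeq embedding')
plt.hist(pretrained.numpy(), bins=200, density=True, alpha=0.5, label='bert-base-uncased embedding')
plt.legend()
plt.show()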
I notice that some papers say using the BERT embedding is useless for diffusion and that a learnable embedding is better. Why is the pretrained BERT embedding useless? Is it because its distribution is different? And why is a learnable embedding better, when even after training it still fails to learn semantic relevance between words? I hope someone can give some advice.
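One way to make the "fails to learn semantic relevance" claim concrete (again a sketch; it assumes the checkpoint uses the bert-base-uncased vocabulary, which I believe the DiffuSeq QQP config does, so treat the trained-embedding lookup as an assumption) is to compare nearest neighbors under cosine similarity in both tables:

import torch
from transformers import BertModel, BertTokenizer

tok = BertTokenizer.from_pretrained('bert-base-uncased')
bert_emb = BertModel.from_pretrained('bert-base-uncased').embeddings.word_embeddings.weight.detach()

pt_path = '...'  # DiffuSeq checkpoint as above
trained_emb = torch.load(pt_path, map_location='cpu')['word_embedding.weight']

def neighbors(emb, word, k=5):
    # k nearest tokens to `word` by cosine similarity in embedding table `emb`
    e = torch.nn.functional.normalize(emb, dim=-1)
    i = tok.convert_tokens_to_ids(word)
    sims = e @ e[i]
    top = sims.topk(k + 1).indices.tolist()   # +1 because the word itself ranks first
    return [tok.convert_ids_to_tokens(j) for j in top if j != i][:k]

print(neighbors(bert_emb, 'good'))     # BERT: expect semantically related tokens
print(neighbors(trained_emb, 'good'))  # trained: expect essentially arbitrary tokens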
