get_sequence_output is not contextualized #264

Open
maziyarpanahi opened this issue Apr 27, 2020 · 1 comment

Comments

@maziyarpanahi
Hi,

I finally managed to use `get_sequence_output` to get word embeddings, after dealing with random embeddings caused by dropout, random seeds, etc.

However, `get_sequence_output()` doesn't seem to be contextualized. If you take a string that says `Bank river.` and get the embeddings for `Bank`, and then try another one with `Bank robber.`, the embeddings for `Bank` are identical in both tests. In BERT and other contextualized transformers, `Bank` gets a different vector in each case since the context is not the same.
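To make the test concrete, this is roughly the check I run (`embed()` is a stand-in for my own helper that tokenizes a sentence, feeds it through the model, and returns one vector per token from `get_sequence_output()`; it is not a function from this repo):

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two token vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# embed() is my hypothetical tokenize-and-run helper, not part of the repo.
# Assuming "Bank" maps to the first token in both sentences:
vecs_river = embed("Bank river.")
vecs_robber = embed("Bank robber.")

# A contextual model should give a similarity noticeably below 1.0 here;
# instead I get exactly 1.0, i.e. identical vectors.
print(cosine(vecs_river[0], vecs_robber[0]))
```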

I tried to play around with the mask, segments, etc., but it's always the same embedding for a given word in different contexts. I followed the advice and some examples, and these are my configs:

```python
xlnet_config = XLNetConfig(FLAGS=None, json_path=json_path)
run_config = RunConfig(
    is_training=False,
    use_tpu=False,
    use_bfloat16=False,
    dropout=0.0,
    dropatt=0.0
)
```
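For completeness, here is a sketch of how I wire these configs into the model, following the usage shown in the upstream README (the placeholder shapes are my assumption; the repo expects time-major `[seq_len, batch_size]` inputs, and the checkpoint is restored into the session separately, e.g. with `tf.train.Saver`, before evaluating anything):

```python
import tensorflow as tf
import xlnet

seq_len, batch_size = 128, 1

# Tokenized inputs; 0 in input_mask marks real tokens, 1 marks padding.
input_ids = tf.placeholder(tf.int32, [seq_len, batch_size])
seg_ids = tf.placeholder(tf.int32, [seq_len, batch_size])
input_mask = tf.placeholder(tf.float32, [seq_len, batch_size])

# Construct the model from the config objects above.
xlnet_model = xlnet.XLNetModel(
    xlnet_config=xlnet_config,
    run_config=run_config,
    input_ids=input_ids,
    seg_ids=seg_ids,
    input_mask=input_mask)

# Per-token hidden states, shape [seq_len, batch_size, d_model].
seq_out = xlnet_model.get_sequence_output()
```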

I've seen some examples that use 0.1 for dropout, such as https://github.com/amansrivastava17/embedding-as-service/tree/master/server/embedding_as_service/text/xlnet, but those suffer from the random-embeddings issue.

Are my XLNet config and run config correct to use the pre-trained weights/checkpoints?

@maziyarpanahi
Author

Unfortunately, I couldn't find any solution. It seems that for some reason (it could be entirely my mistake) the XLNet pre-trained models are not aware of their surrounding tokens, so unlike BERT, no matter what you put before or after a word, it will always generate the same vectors.
