What is bert/encoder and bert_1/encoder for retriever model? #152

Open
xhluca opened this issue Feb 10, 2022 · 5 comments

xhluca commented Feb 10, 2022

I needed to convert the models to torch, but I can't figure out whether bert/encoder corresponds to the question encoder and bert_1/encoder to the table encoder, or the other way around.

The variable names can be found when you directly read the ckpt file:

from tensorflow.python.training import py_checkpoint_reader

# Need to say "model.ckpt" instead of "model.ckpt.index" for tf v2
file_name = "./tapas_nq_hn_retriever_medium/model.ckpt"
reader = py_checkpoint_reader.NewCheckpointReader(file_name)

# Load dictionaries var -> shape and var -> dtype
var_to_shape_map = reader.get_variable_to_shape_map()
var_to_dtype_map = reader.get_variable_to_dtype_map()


for k in var_to_dtype_map.keys():
    print(k)

Which gives you the following:

text_projection
table_projection/adam_v
bert_1/pooler/dense/kernel/adam_v
bert_1/pooler/dense/kernel/adam_m
bert_1/pooler/dense/kernel
...
bert/encoder/layer_6/attention/self/key/bias/adam_m
bert/encoder/layer_6/attention/output/dense/bias
bert_1/encoder/layer_2/attention/self/value/kernel
...

It is not clear from reading the code which one is the question encoder, which makes it difficult to use the model on a different dataset.
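One way to make the listing easier to compare is to drop the entries ending in `adam_m` / `adam_v` (which look like optimizer state rather than model weights) and group the rest by top-level scope. The sketch below is a minimal illustration, not part of the TAPAS code: `group_by_scope` and `OPTIMIZER_SUFFIXES` are hypothetical names, and the input is just the names printed above. It does not settle which scope is the question encoder; it only separates the `bert` and `bert_1` weights so each tower can be inspected or converted on its own.

```python
from collections import defaultdict

# Assumption: names ending in these suffixes are Adam optimizer slots,
# not weights needed for inference.
OPTIMIZER_SUFFIXES = ("/adam_m", "/adam_v")


def group_by_scope(variable_names):
    """Group variable names by top-level scope, skipping optimizer slots."""
    groups = defaultdict(list)
    for name in variable_names:
        if name.endswith(OPTIMIZER_SUFFIXES):
            continue
        # "bert_1/pooler/dense/kernel" -> "bert_1"; "text_projection" stays as is.
        scope = name.split("/", 1)[0]
        groups[scope].append(name)
    return {scope: sorted(names) for scope, names in groups.items()}


# Names copied from the checkpoint listing above (the elided entries are omitted).
names = [
    "text_projection",
    "table_projection/adam_v",
    "bert_1/pooler/dense/kernel/adam_v",
    "bert_1/pooler/dense/kernel/adam_m",
    "bert_1/pooler/dense/kernel",
    "bert/encoder/layer_6/attention/self/key/bias/adam_m",
    "bert/encoder/layer_6/attention/output/dense/bias",
    "bert_1/encoder/layer_2/attention/self/value/kernel",
]

groups = group_by_scope(names)
for scope, members in sorted(groups.items()):
    print(scope, len(members))
```

With the real checkpoint, `var_to_shape_map.keys()` could be passed in place of `names`.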

xhluca commented Feb 10, 2022

@eisenjulian

@SomewhereInTheVastUniverse

@xhluca
Hi, I'd like to convert the TensorFlow retriever model to a PyTorch model for testing. Has this work been completed? If so, could you share the model with me? Thank you.

xhluca commented Apr 5, 2024

Sorry, I'm not sure. I'd recommend taking a look at the Hugging Face repo.

@SomewhereInTheVastUniverse

@xhluca
I found it on Hugging Face. Thank you. Do you know how to run inference with the model and how to train it? There is no description of either.

xhluca commented Apr 9, 2024

I am not aware of any. I just fine-tune it using in-batch negative sampling, following papers like DPR (Hugging Face has a lot of examples of how to do that). There might be ways to fine-tune TAPAS specifically, but I am not aware of them.
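For readers unfamiliar with the term, in-batch negative sampling treats the table paired with each question as its positive and every other table in the same batch as a negative, then applies a softmax cross-entropy over the question-table similarity scores. The sketch below illustrates the loss in plain Python so it runs without any dependencies; `in_batch_negative_loss` is a hypothetical name, the vectors are toy values, and dot-product similarity is an assumption. A real fine-tuning run would compute the same quantity on encoder outputs with a tensor library.

```python
import math


def in_batch_negative_loss(question_vecs, table_vecs):
    """Mean cross-entropy where table i is the positive for question i.

    All other tables in the batch act as negatives for that question.
    """
    total = 0.0
    for i, question in enumerate(question_vecs):
        # Dot-product similarity between this question and every table.
        scores = [sum(a * b for a, b in zip(question, table)) for table in table_vecs]
        # Log-sum-exp with the max subtracted for numerical stability.
        top = max(scores)
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        # Negative log-probability of the matching table.
        total += log_norm - scores[i]
    return total / len(question_vecs)


# Toy embeddings: two questions and two tables, 2 dimensions each.
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned_tables = [[5.0, 0.0], [0.0, 5.0]]   # table i matches question i
swapped_tables = [[0.0, 5.0], [5.0, 0.0]]   # positives point at the wrong table

loss_aligned = in_batch_negative_loss(questions, aligned_tables)
loss_swapped = in_batch_negative_loss(questions, swapped_tables)
print(loss_aligned < loss_swapped)
```

The loss is small when each question scores highest against its own table and large when it scores highest against another table in the batch, which is the signal the fine-tuning step optimizes.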
