
Is the embedding initialized with a pre-trained one? #15

Open
acadTags opened this issue Mar 21, 2018 · 2 comments

Comments

acadTags commented Mar 21, 2018

From the code it seems the embedding is not initialized with a pre-trained embedding (e.g. word2vec), although the paper says it is. Am I right, or did I miss something? Many thanks!

Relevant code in _init_embedding:

# assumes: import tensorflow as tf; from tensorflow.contrib import layers
def _init_embedding(self, scope):
    # note: the matrix is Xavier-initialized here, not loaded from pretrained word vectors
    with tf.variable_scope(scope):
        with tf.variable_scope("embedding") as scope:
            self.embedding_matrix = tf.get_variable(
                name="embedding_matrix",
                shape=[self.vocab_size, self.embedding_size],
                initializer=layers.xavier_initializer(),
                dtype=tf.float32)
            self.inputs_embedded = tf.nn.embedding_lookup(
                self.embedding_matrix, self.inputs)

ghazi-f commented Apr 19, 2018

The relevant passage about word embeddings, from paragraph 2.2 of the paper:

Note that we directly use word embeddings. For a more complete model we could use a GRU to get word vectors directly from characters, similarly to (Ling et al., 2015). We omitted this for simplicity.

So it seems this implementation trains word embeddings that are specific to this task, except that the code learns them from a one-hot representation of the words instead of the GRU character-level representation mentioned above.
Since performance is lower than in the original paper, it seems that word2vec embeddings work better than the embeddings learned from scratch.
I'm currently changing the code so that it supports plugging in word2vec embeddings, and I'll make a pull request soon.
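For reference, here is a minimal sketch (not this repository's code) of what plugging in pretrained vectors could look like in the same TF1 style. The vocab list and word2vec_vectors lookup are assumed inputs for this sketch, e.g. a dict or a gensim KeyedVectors object loaded with load_word2vec_format:

import numpy as np
import tensorflow as tf

def build_pretrained_matrix(vocab, word2vec_vectors, embedding_size):
    # vocab: list of words, where the index is the row in the embedding matrix.
    # word2vec_vectors: word -> numpy vector lookup (assumed, not defined in the repo).
    # Rows for out-of-vocabulary words keep a small random initialization.
    matrix = np.random.uniform(
        -0.25, 0.25, (len(vocab), embedding_size)).astype(np.float32)
    for i, word in enumerate(vocab):
        if word in word2vec_vectors:
            matrix[i] = word2vec_vectors[word]
    return matrix

def init_embedding_from_pretrained(pretrained_matrix):
    # TF1-style: pass the pretrained matrix as the initializer instead of Xavier;
    # trainable=True still allows the vectors to be fine-tuned on the task.
    return tf.get_variable(
        name="embedding_matrix",
        initializer=tf.constant(pretrained_matrix),
        trainable=True)

Whether to keep the matrix trainable or freeze it is a separate design choice; fine-tuning is common when the task corpus differs from the corpus the word2vec vectors were trained on.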

ematvey (Owner) commented Apr 24, 2018

In my experience with other language-related tasks, using pretrained embeddings doesn't make a lot of difference when the dataset is sufficiently large, although I suspect it is very task- and corpus-dependent.

@Sora77 would appreciate the PR!
