
Not able to train HAN because of the following error. #32

Open

kk54709 opened this issue Oct 14, 2018 · 8 comments

kk54709 commented Oct 14, 2018

# Imports added for completeness; MAX_SENT_LENGTH, MAX_SENTS, embedding_layer
# and AttLayer are defined elsewhere in the script.
from keras.layers import Input, Dense, GRU, Bidirectional, TimeDistributed
from keras.models import Model

# Sentence encoder: word indices -> sentence vector
sentence_input = Input(shape=(MAX_SENT_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sentence_input)
l_lstm = Bidirectional(GRU(100, return_sequences=True))(embedded_sequences)
l_att = AttLayer(100)(l_lstm)
sentEncoder = Model(sentence_input, l_att)

# Document encoder: apply the sentence encoder to each sentence, then attend over sentences
review_input = Input(shape=(MAX_SENTS, MAX_SENT_LENGTH), dtype='int32')
review_encoder = TimeDistributed(sentEncoder)(review_input)
l_lstm_sent = Bidirectional(GRU(100, return_sequences=True))(review_encoder)
l_att_sent = AttLayer(100)(l_lstm_sent)
preds = Dense(2, activation='softmax')(l_att_sent)
model = Model(review_input, preds)

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])

print("model fitting - Hierarchical attention network")

Error
ValueError: Dimensions must be equal, but are 15 and 100 for 'att_layer_10/mul' (op: 'Mul') with input shapes: [?,15], [?,15,100].
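
For reference, the shapes in the error can be reproduced outside Keras: the failing 'Mul' is an element-wise multiply between a 2-D tensor (presumably the attention weights, shape [?, 15]) and a 3-D mask of shape [?, 15, 100]. A minimal NumPy sketch (not the repo's AttLayer code; the shapes are taken straight from the error message) showing the same incompatibility:

import numpy as np

attention_weights = np.ones((2, 15))       # stands in for the [?, 15] operand
propagated_mask = np.ones((2, 15, 100))    # stands in for the [?, 15, 100] operand

try:
    attention_weights * propagated_mask    # same kind of element-wise multiply as 'att_layer_10/mul'
except ValueError as e:
    print(e)  # operands could not be broadcast together with shapes (2,15) (2,15,100)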

@cjopengler

I got the same error

@cjopengler

I found that changing the code in AttLayer from

def compute_mask(self, inputs, mask=None):
    return mask

to

def compute_mask(self, inputs, mask=None):
    return None

fixes it. In other words, the layer should not propagate the mask.
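
A minimal sketch of where that change lives (assuming Keras 2.x with the TensorFlow backend; this is a simplified attention layer, not the repo's exact AttLayer). The layer can still use the incoming mask to ignore padded timesteps, but by returning None from compute_mask it stops handing a mask to the next layer:

from keras import backend as K
from keras.layers import Layer

class SimpleAttention(Layer):
    def __init__(self, attention_dim, **kwargs):
        super(SimpleAttention, self).__init__(**kwargs)
        self.attention_dim = attention_dim
        self.supports_masking = True  # accept a mask from the previous layer

    def build(self, input_shape):
        # input_shape: (batch, timesteps, features)
        self.W = self.add_weight(name='W',
                                 shape=(input_shape[-1], self.attention_dim),
                                 initializer='glorot_uniform')
        self.b = self.add_weight(name='b', shape=(self.attention_dim,),
                                 initializer='zeros')
        self.u = self.add_weight(name='u', shape=(self.attention_dim, 1),
                                 initializer='glorot_uniform')
        super(SimpleAttention, self).build(input_shape)

    def call(self, x, mask=None):
        uit = K.tanh(K.bias_add(K.dot(x, self.W), self.b))    # (batch, timesteps, attention_dim)
        ait = K.squeeze(K.dot(uit, self.u), axis=-1)          # (batch, timesteps)
        weights = K.exp(ait)
        if mask is not None:
            weights *= K.cast(mask, K.floatx())               # zero out padded timesteps
        weights /= K.sum(weights, axis=1, keepdims=True) + K.epsilon()
        return K.sum(x * K.expand_dims(weights), axis=1)      # (batch, features)

    def compute_mask(self, inputs, mask=None):
        # The change suggested above: return None instead of `return mask`.
        return None

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])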

@kk54709

kk54709 commented Oct 15, 2018

Solved the issue.
Thanks @cjopengler .

@cjopengler

Solved the issue.
Thanks @cjopengler .

But have you noticed that the first AttLayer, l_att = AttLayer(100)(l_lstm), raises no error, while the second one, l_att_sent = AttLayer(100)(l_lstm_sent), fails when computing the mask?

@kk54709

kk54709 commented Oct 15, 2018

Well, I'm still experimenting with it, but earlier I was facing the same issue and now it is working fine.

@MingleiLI

I had the same problem, and changing the code as @cjopengler suggests works.

@mingkin

mingkin commented Feb 13, 2019

Solved the issue.
Thanks @cjopengler

@980202006

It is because of the TimeDistributed wrapper. The mask it propagates (previous_mask) is

<tf.Tensor 'time_distributed_14/Reshape_2:0' shape=(?, 15, 100) dtype=bool>

i.e. a 3-D boolean tensor rather than the 2-D (batch, timesteps) mask the downstream layers expect. Possible ways to resolve this are described at https://blog.csdn.net/songbinxu/article/details/80242211
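
An alternative workaround (a sketch, not code from this repo or from that blog post): keep AttLayer unchanged and instead drop the mask right after TimeDistributed, so the mis-shaped (?, 15, 100) mask never reaches the sentence-level layers.

from keras import backend as K
from keras.layers import Layer

class DropMask(Layer):
    """Identity layer that discards any incoming Keras mask."""
    def __init__(self, **kwargs):
        super(DropMask, self).__init__(**kwargs)
        self.supports_masking = True  # accept a mask so Keras does not complain

    def call(self, x, mask=None):
        return x  # pass the data through unchanged

    def compute_mask(self, inputs, mask=None):
        return None  # stop the mask from propagating further

# Hypothetical usage in the model from this issue:
# review_encoder = TimeDistributed(sentEncoder)(review_input)
# review_encoder = DropMask()(review_encoder)
# l_lstm_sent = Bidirectional(GRU(100, return_sequences=True))(review_encoder)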
