[Refactor]Add a switch for attention to return an unnormalized weight matrix. Move _get_attention_cell function position #1007

fierceX · 2019-11-14T09:35:15Z

Description

Added an option for the attention cells to return the unnormalized weight matrix.
This will be used for TinyBERT training.
Moved the _get_attention_cell function from seq2seq_encoder_decoder to attention_cell, which I think makes more sense.
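
For readers unfamiliar with the distinction, here is a minimal sketch of what such a switch could look like. It is plain Python, not the MXNet code in this PR, and the function name `dot_product_attention` and the flag name `normalized` are hypothetical. It assumes the unnormalized weights are the scaled scores before the softmax.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(s - peak) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(queries, keys, values, normalized=True):
    """Scaled dot-product attention over plain nested lists.

    `normalized` is a hypothetical flag name, not the PR's actual argument.
    True returns the softmax weights, False returns the raw scaled scores.
    The context vectors are computed from the softmax weights in both cases.
    """
    scale = math.sqrt(len(keys[0]))
    # One row of scores per query, one column per key.
    scores = [[sum(a * b for a, b in zip(q, k)) / scale for k in keys]
              for q in queries]
    weights = [softmax(row) for row in scores]
    dim = len(values[0])
    context = [[sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
               for row in weights]
    return context, (weights if normalized else scores)


queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]

ctx_norm, att_norm = dot_product_attention(queries, keys, values)
ctx_raw, att_raw = dot_product_attention(queries, keys, values, normalized=False)
```

In this sketch the flag only changes which matrix is returned alongside the context vectors, so existing callers that rely on the default keep their behaviour.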

Checklist

Essentials

PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage
Code is well-documented

Changes

Feature1, tests, (and when applicable, API doc)
Feature2, tests, (and when applicable, API doc)

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

cc @dmlc/gluon-nlp-team

Change _ get_attention_cell location

codecov · 2019-11-14T09:35:18Z

Codecov Report

Merging #1007 into master will decrease coverage by 14.54%.
The diff coverage is 65%.

@@             Coverage Diff             @@
##           master    #1007       +/-   ##
===========================================
- Coverage   89.93%   75.39%   -14.55%     
===========================================
  Files          67       67               
  Lines        6340     6344        +4     
===========================================
- Hits         5702     4783      -919     
- Misses        638     1561      +923

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/gluonnlp/model/seq2seq_encoder_decoder.py | `81.81% <ø> (+6.43%)` | ⬆️ |
| src/gluonnlp/model/transformer.py | `91.34% <100%> (+0.02%)` | ⬆️ |
| src/gluonnlp/model/attention_cell.py | `68.85% <64.1%> (-27.38%)` | ⬇️ |
| src/gluonnlp/data/batchify/embedding.py | `17.69% <0%> (-80%)` | ⬇️ |
| src/gluonnlp/data/candidate_sampler.py | `16.32% <0%> (-75.52%)` | ⬇️ |
| src/gluonnlp/optimizer/lamb.py | `23.8% <0%> (-66.67%)` | ⬇️ |
| src/gluonnlp/utils/parallel.py | `34.04% <0%> (-63.83%)` | ⬇️ |
| src/gluonnlp/model/sequence_sampler.py | `11.49% <0%> (-63.07%)` | ⬇️ |
| src/gluonnlp/optimizer/bert_adam.py | `34.09% <0%> (-59.1%)` | ⬇️ |
| src/gluonnlp/utils/parameter.py | `23.72% <0%> (-57.63%)` | ⬇️ |
... and 28 more

mli · 2019-11-14T10:09:53Z

Job PR-1007/1 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1007/1/index.html

leezu · 2019-11-14T10:10:33Z

Thanks @fierceX. Are you aware of #910? Do you have any thoughts on integrating it?

mli · 2019-11-24T09:34:37Z

Job PR-1007/2 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1007/2/index.html

mli · 2019-11-24T12:20:06Z

Job PR-1007/3 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1007/3/index.html

leezu · 2019-11-24T12:40:12Z

@fierceX ping regarding integration with #910

fierceX · 2019-11-25T13:47:41Z

@leezu Okay, I'll take a look, but the TinyBERT follow-up may be slow, because I'm a little busy right now.

fierceX added 3 commits November 14, 2019 17:15

Add unnormalized weight matrix.

65eac37

Change _ get_attention_cell location

fix bug

6ade25c

fix bug

c4cd09e

fierceX requested a review from a team as a code owner November 14, 2019 09:35

fix pylint

77c3fe9

Merge remote-tracking branch 'nlp/master' into tinybert

88b3905

szha changed the base branch from master to v0.x August 13, 2020 02:17
