This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[Refactor]Add a switch for attention to return an unnormalized weight matrix. Move _get_attention_cell function position #1007

Open
wants to merge 5 commits into
base: v0.x

fierceX
Member

@fierceX fierceX commented Nov 14, 2019

Description

Added an option for attention to return the unnormalized weight matrix.
This will be used for TinyBERT training.
Moved the _get_attention_cell function from seq2seq_encoder_decoder to attention_cell, which I think makes more sense.
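
To illustrate the first change, here is a minimal sketch of what such a switch could look like. It uses NumPy rather than the MXNet code in this PR, and the function name `dot_product_attention` and the flag `return_unnormalized` are hypothetical names chosen for the example, not the names used in the diff. It assumes "unnormalized weight matrix" means the scaled attention scores before softmax.

```python
import numpy as np


def dot_product_attention(query, key, value, return_unnormalized=False):
    """Scaled dot-product attention with an optional raw-score output.

    Hypothetical illustration only; not the GluonNLP implementation.

    query: (batch, query_len, dim)
    key:   (batch, key_len, dim)
    value: (batch, key_len, value_dim)
    """
    # Raw (unnormalized) scores, shape (batch, query_len, key_len).
    scores = query @ key.transpose(0, 2, 1) / np.sqrt(query.shape[-1])

    # Numerically stable softmax over the key axis.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # The context vector is always computed from the normalized weights;
    # the switch only changes which matrix is returned alongside it.
    context = weights @ value
    return context, (scores if return_unnormalized else weights)
```

With this shape of API the context output is identical in both modes, so existing callers are unaffected and only code that needs the raw scores has to opt in.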

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

cc @dmlc/gluon-nlp-team

@fierceX fierceX requested a review from a team as a code owner November 14, 2019 09:35
@codecov

codecov bot commented Nov 14, 2019

Codecov Report

Merging #1007 into master will decrease coverage by 14.54%.
The diff coverage is 65%.

@@             Coverage Diff             @@
##           master    #1007       +/-   ##
===========================================
- Coverage   89.93%   75.39%   -14.55%     
===========================================
  Files          67       67               
  Lines        6340     6344        +4     
===========================================
- Hits         5702     4783      -919     
- Misses        638     1561      +923

Impacted Files                                   Coverage Δ
src/gluonnlp/model/seq2seq_encoder_decoder.py    81.81% <ø> (+6.43%) ⬆️
src/gluonnlp/model/transformer.py                91.34% <100%> (+0.02%) ⬆️
src/gluonnlp/model/attention_cell.py             68.85% <64.1%> (-27.38%) ⬇️
src/gluonnlp/data/batchify/embedding.py          17.69% <0%> (-80%) ⬇️
src/gluonnlp/data/candidate_sampler.py           16.32% <0%> (-75.52%) ⬇️
src/gluonnlp/optimizer/lamb.py                   23.8% <0%> (-66.67%) ⬇️
src/gluonnlp/utils/parallel.py                   34.04% <0%> (-63.83%) ⬇️
src/gluonnlp/model/sequence_sampler.py           11.49% <0%> (-63.07%) ⬇️
src/gluonnlp/optimizer/bert_adam.py              34.09% <0%> (-59.1%) ⬇️
src/gluonnlp/utils/parameter.py                  23.72% <0%> (-57.63%) ⬇️
... and 28 more

@mli
Member

mli commented Nov 14, 2019

Job PR-1007/1 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1007/1/index.html

@leezu
Contributor

leezu commented Nov 14, 2019

Thanks @fierceX. Are you aware of #910? Do you have any thoughts on integrating it?

@mli
Member

mli commented Nov 24, 2019

Job PR-1007/2 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1007/2/index.html

@mli
Member

mli commented Nov 24, 2019

Job PR-1007/3 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1007/3/index.html

@leezu
Contributor

leezu commented Nov 24, 2019

@fierceX ping regarding integration with #910

@fierceX
Member Author

fierceX commented Nov 25, 2019

@leezu Okay, I'll take a look, but the TinyBERT follow-up may take a while because I'm a little busy right now.

@szha szha changed the base branch from master to v0.x August 13, 2020 02:17