
Attentive interface #802

Open · wants to merge 18 commits into base: autoregressive_refactor
Conversation

@varisd (Member) commented Mar 15, 2019

Moved the attention-related attributes and methods into a separate class, Attentive. Every decoder that needs to compute attention over the encoders should inherit from this class.
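For illustration, a minimal sketch of the interface described above. This is not the actual code from the PR; apart from the class name Attentive, the method and attribute names and the callback signature are assumptions made for the example.

```python
# Minimal sketch only; names other than "Attentive" are hypothetical.
from typing import Callable, List


class Attentive:
    """Mixin holding the attention-related state a decoder needs."""

    def __init__(self, attentions: List[Callable]) -> None:
        # One attention callback per attended encoder (hypothetical layout).
        self._attentions = attentions

    def attend(self, query, encoder_states, encoder_mask) -> List:
        """Run each encoder attention with the given decoder query."""
        return [attention(query, encoder_states, encoder_mask)
                for attention in self._attentions]


class SomeDecoder(Attentive):
    """A decoder gains attention support by inheriting from Attentive."""

    def __init__(self, attentions: List[Callable]) -> None:
        Attentive.__init__(self, attentions)
```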

@jindrahelcl (Member) left a comment

How does encoders/attentive.py relate to this? I would rename one of them (rather the encoder), and I would also make that encoder inherit the things that are factored out here, if it makes sense.

@jlibovicky (Contributor)

And the recurrent decoder is not attentive?

@varisd (Member, Author) commented Mar 15, 2019

The recurrent decoder is attentive, but I am currently reworking it in another PR so that it uses the same encoder-attention implementation as the Transformer.
In the future we will need to sort out attention_histories, or retrofit them onto the attention objects.

@varisd (Member, Author) commented Mar 15, 2019

> How does encoders/attentive.py relate to this? I would rename one of them (rather the encoder), and I would also make that encoder inherit the things that are factored out here, if it makes sense.

Completely unrelated. It is tricky: Ondra's attention_heads are something different from the heads in multi_head_attention. The closest description would be: Ondra does classic self-attention (where the queries are empty; he does not do scaled_dot_product but a projection, so it works), then sums the resulting vectors over time and finally reshapes from shape (t) to shape (t/h, h).

The hard part is that we would have to change his terminology so it does not cause confusion, but then it would no longer match the terminology in his paper so well.

I would leave it as a TODO for the attention / attention-object refactor.
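For reference, a rough NumPy sketch of the mechanism described above: per-head scores computed by a plain projection of the states (no queries, no scaled dot product), a softmax over time, and a weighted sum over time per head. It is only an illustration of the description, not Ondra's encoder code, and the final reshape mentioned above is left out.

```python
import numpy as np


def headwise_self_pooling(states: np.ndarray, score_proj: np.ndarray) -> np.ndarray:
    """Pool encoder states into one vector per head.

    states:     (t, d) encoder states over time
    score_proj: (d, h) projection giving one score per head and time step
    returns:    (h, d) one weighted sum of the states per head
    """
    scores = states @ score_proj                   # (t, h); no queries involved
    scores -= scores.max(axis=0, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)  # softmax over the time axis
    return weights.T @ states                      # (h, d): sum over time per head
```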

@jlibovicky (Contributor)

IMHO it is enough to rename Ondra's file or class. Something like a self-attentive temporal encoder or the like.

@varisd (Member, Author) commented Mar 15, 2019

Yeah, I will come up with something.

Edit: Renamed to Structured

@jindrahelcl (Member) left a comment

First, please resolve the relation to #786 plus the notes I left there, rebase, and then I will take another look.

@jlibovicky (Contributor)

First we need to sort out #796, which this one is grafted onto.

@jindrahelcl (Member)

So #796 is on top of #786 and this one is on top of both? :-)

@jlibovicky
Copy link
Contributor

#796 je nezávsilý. #786 z hodo jde vyrebasovat pryč jednoduše.

@jlibovicky force-pushed the autoregressive_refactor branch 2 times, most recently from 3e5ab5a to 89eb9c2 on March 26, 2019 09:36
@varisd force-pushed the autoregressive_refactor branch 2 times, most recently from 6ed29a5 to c446c9c on April 1, 2019 15:13
@varisd (Member, Author) commented May 2, 2019

Ping. I fixed tests/classifier, so now it should hopefully all be done, right?

@@ -95,6 +95,84 @@ def __init__(self,
# pylint: enable=too-few-public-methods


def _bucket_boundaries(max_length, min_length=8, length_bucket_step=1.1):
(Member) Type annotations, please.
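For illustration, the requested annotations could look roughly like this; the body below is only a sketch assuming the usual geometric bucket progression, not necessarily what the PR implements.

```python
from typing import List


def _bucket_boundaries(max_length: int,
                       min_length: int = 8,
                       length_bucket_step: float = 1.1) -> List[int]:
    """Return bucket boundaries growing geometrically up to max_length."""
    assert length_bucket_step > 1.0
    boundaries = []
    boundary = min_length
    while boundary < max_length:
        boundaries.append(boundary)
        boundary = max(boundary + 1, int(boundary * length_bucket_step))
    return boundaries
```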

@@ -0,0 +1,167 @@
"""TODO."""
(Member) ?

the attention.

TODO:
Generalize the attention.
(Member) What is this TODO?

queries, enc_states, enc_masks, self.n_heads_enc,
self.n_heads_hier, attn_dropout_cbs, dropout_cb)

# TODO: remove this - this is already checked in the constructor
(Member) Remove it? Or delete the TODO?

@@ -132,7 +132,7 @@ def __init__(self,
postprocess: The postprocessor to apply to the output data.
"""
check_argument_types()
BaseRunner[BeamSearchDecoder].__init__(self, output_series, decoder)
super().__init__(output_series, decoder)
(Member) I already made these changes and they are merged; why is this here? Is it rebased correctly?

@@ -9,5 +9,5 @@ python_speech_features
pygments
rouge==0.2.1
typeguard
sacrebleu
sacrebleu==1.3.1
(Member) This should not be here. The new version of sacrebleu is compatible again; it was a bug on their side.

@@ -9,5 +9,5 @@ python_speech_features
pygments
rouge==0.2.1
typeguard
sacrebleu
sacrebleu==1.3.1
(Member) Ditto.

@@ -1 +1 @@
mypy
mypy==0.660
(Member) Remove this.

@jindrahelcl (Member)

It looks like it is rebased incorrectly, because your PR (#818), which addresses what I did in #819, shows up as part of it.

@jindrahelcl mentioned this pull request on May 9, 2019
@varisd (Member, Author) commented May 9, 2019

Rebase fixed.

@jindrahelcl (Member)

The pull request is set to merge into the autoregressive_refactor branch. Don't you want to merge into master?
