"Competitive Example" (2.5): Sequence-to-sequence #1391

Open. Wants to merge 14 commits into base branch master.
Conversation

@gpengzhi (Contributor) commented Jun 5, 2018

Here is a comparison between DyNet and PyTorch on the seq2seq translator example used in the PyTorch tutorial (https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html).

@gpengzhi (Contributor, Author) commented Jun 6, 2018

I think I found a bug in my code. I will fix it and submit a commit very soon.

@neubig (Contributor) commented Jun 15, 2018

Thanks! Is this fixed and ready for review?

@gpengzhi (Contributor, Author):

@neubig Yes. The bug is fixed.

@neubig (Contributor) left a comment

First, thanks for contributing! I added a few comments.

Also, there are ".DS_Store" files that need to be removed.


## Usage (DyNet)

The architecture of the DyNet model `seq2seq_dynet.py` is the same as that in the [PyTorch example](https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html). Here we also implement the attention mechanism in the model.
Review comment (Contributor):
dynet -> Dynet

Reply (Author):
Modified.


The architecture of the DyNet model is shown as follows.

```python
# (model code elided in this excerpt)
```
Review comment (Contributor):
I don't think this should be included in README.md, because if we make changes to the actual code they might get out of sync. Let's keep the code in .py files.

Reply (Author):
Deleted the architecture part in README.md.


Then, run the training:

`<pre>`
Review comment (Contributor):
Instead of using `<pre>` tags, can you just use a 4-space indent?

Reply (Author):
Modified.


| Time (DyNet) | Iteration (DyNet) | Loss (DyNet) | Time (PyTorch) | Iteration (PyTorch) | Loss (PyTorch) |
| --- | --- | --- | --- | --- | --- |
| 0m 28s | 5000 5% | 3.2687 | 1m 30s | 5000 5% | 2.8794 |
Review comment (Contributor):
The losses of DyNet and PyTorch are quite different, even at the start. Maybe this is due to different initialization of the parameters? Could you try to match the parameter initialization strategies between the two toolkits and see if they come more in line? Since PyTorch's numbers are better, you could try matching PyTorch's initialization strategy in DyNet.

Reply (Author):

3.2687 and 2.8794 are actually the losses after 5000 iterations for DyNet and PyTorch respectively, not the initial losses. The initial loss of DyNet is 7.9808, and the initial loss of PyTorch is 7.9615.
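One way to act on the initialization suggestion above would be to sample DyNet's weights from the same distribution PyTorch uses. This is only a sketch under an assumption stated in the code: PyTorch initializes GRU and Linear weights from U(-k, k) with k = 1/sqrt(hidden_size), and the helper below (the name `pytorch_style_uniform` is hypothetical) generates values with that distribution.

```python
import math
import random

def pytorch_style_uniform(n_rows, n_cols, hidden_size, seed=0):
    """Sample an (n_rows x n_cols) weight matrix the way PyTorch
    initializes GRU/Linear layers: U(-k, k) with k = 1/sqrt(hidden_size).
    (Assumption about PyTorch's default; not taken from this PR.)"""
    k = 1.0 / math.sqrt(hidden_size)
    rng = random.Random(seed)
    return [[rng.uniform(-k, k) for _ in range(n_cols)] for _ in range(n_rows)]

# For hidden_size=256, every entry lies in [-1/16, 1/16].
w = pytorch_style_uniform(4, 4, hidden_size=256)
```

In the DyNet example these values could then be loaded into a parameter (e.g. with `set_value`, depending on the DyNet version), or, equivalently, DyNet's built-in uniform initializer could be constructed with the same bound `k`.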

@pmichel31415 (Collaborator) commented:

Hi @gpengzhi, thanks for the PR! Do you think you'll have time to fix the lint errors?

@gpengzhi (Contributor, Author) commented Sep 7, 2018

Hi @pmichel31415, I have already fixed the lint errors.

@pmichel31415 (Collaborator) left a comment

Thanks for the great contribution @gpengzhi! I have a few comments here and there; can you take a look? The main points of contention are:

  • Big discrepancy in loss between DyNet/PyTorch: can you check that the hyperparameters are as close as possible in both cases (e.g. the learning rate is different)?
  • Add docstrings so that people can navigate the code more easily if they want to reuse it.
  • Change function names to snake_case.

| 8m 12s | 85000 85% | 0.7621 | 21m 44s | 85000 85% | 0.5210 |
| 8m 41s | 90000 90% | 0.7453 | 22m 55s | 90000 90% | 0.5054 |
| 9m 10s | 95000 95% | 0.6795 | 24m 9s | 95000 95% | 0.4417 |
| 9m 39s | 100000 100% | 0.6442 | 25m 24s | 100000 100% | 0.4297 |
Review comment (Collaborator):
This big difference is still weird to me. Can we make these numbers similar so that DyNet is competitive with PyTorch?

For instance, I've noticed that there are some differences in the optimization procedure (DyNet uses learning rate 0.2 while PyTorch uses 0.01, if I'm not mistaken).

Also do you have any intuition on why there is such a big speed difference? Do you think it is due to your setup or the differences in implementation?
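To make the learning-rate point concrete: both trainers ultimately perform the same plain SGD update, so with identical gradients a rate of 0.2 takes steps 20 times larger than 0.01, which alone can explain diverging loss curves. A minimal, framework-free sketch (the constants 0.2 and 0.01 come from the comment above; the parameter and gradient values are made up for illustration):

```python
def sgd_step(param, grad, lr):
    # the vanilla SGD update that both a simple DyNet SGD trainer
    # and torch.optim.SGD perform: param <- param - lr * grad
    return param - lr * grad

# same parameter and gradient, different learning rates
after_dynet = sgd_step(1.0, 0.5, lr=0.2)     # 1.0 - 0.2 * 0.5  = 0.9
after_pytorch = sgd_step(1.0, 0.5, lr=0.01)  # 1.0 - 0.01 * 0.5 = 0.995
```

Aligning the two examples would mean constructing the DyNet trainer with the same learning rate as the PyTorch optimizer (the exact keyword argument depends on the DyNet version).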


Here is the comparison between Dynet and PyTorch on the [seq2seq translator example](https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html).

The data we use is a set of many thousands of English-to-French translation pairs. Download the data from [here](https://download.pytorch.org/tutorial/data.zip) and extract it to the current directory.
Review comment (Collaborator):
Can you credit the original source of the data, as in the PyTorch tutorial? https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html#loading-data-files


Format:

`<pre>`
Review comment (Collaborator):
`<pre>` -> fenced code block (```)


> vous etes tellement belle dans cette robe !
= you re so beautiful in that dress .
< you re so beautiful in that dress . <EOS>
Review comment (Collaborator):
Why not use the same examples for PyTorch and DyNet, for comparison?

EOS_token = 1


class Lang(object):
Review comment (Collaborator):
Can you add a small docstring for each class/method/function, so that it is easier for people who want to reuse the code?
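As an illustration of this request, the `Lang` class from the context above could be documented like this. The method bodies are a sketch following the PyTorch tutorial's vocabulary class, not the exact code in the PR:

```python
class Lang(object):
    """Vocabulary for one language: maps words to integer indices and back."""

    def __init__(self, name):
        """Create an empty vocabulary for `name`, reserving 0 for SOS and 1 for EOS."""
        self.name = name
        self.word2index = {}
        self.word2count = {}
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2

    def addSentence(self, sentence):
        """Add every whitespace-separated word of `sentence` to the vocabulary."""
        for word in sentence.split(" "):
            self.addWord(word)

    def addWord(self, word):
        """Register `word`, assigning it a fresh index on first occurrence."""
        if word not in self.word2index:
            self.word2index[word] = self.n_words
            self.word2count[word] = 1
            self.index2word[self.n_words] = word
            self.n_words += 1
        else:
            self.word2count[word] += 1
```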

        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2

    def addSentence(self, sentence):
Review comment (Collaborator):
Can you use snake_case for function/method/variable names (but not class names), to be consistent with the DyNet API? You should probably do it for the PyTorch example as well.
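The rename being asked for is mechanical. For the method shown in the context above it would look like this (a deliberately minimal class, just to show the convention; the `was:` comments mark the old camelCase names):

```python
class Lang(object):
    def __init__(self):
        self.words = []

    # was: def addSentence(self, sentence):
    def add_sentence(self, sentence):
        for word in sentence.split(" "):
            self.add_word(word)

    # was: def addWord(self, word):
    def add_word(self, word):
        self.words.append(word)
```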

EOS_token = 1


class Lang:
Review comment (Collaborator):
Can you put this class and all the other utilities that are not dependent on the framework in a separate file like utils.py, to avoid code duplication?
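A sketch of the suggested split: framework-independent pieces (the token constants, `Lang`, data loading) move into one shared module that both scripts import. The file layout and the trimmed class body below are a suggestion, not taken from the PR:

```python
# utils.py -- framework-agnostic helpers shared by seq2seq_dynet.py and
# seq2seq_pytorch.py (hypothetical layout, not from this PR)

SOS_token = 0
EOS_token = 1


class Lang(object):
    """Vocabulary shared by both the DyNet and the PyTorch example."""

    def __init__(self, name):
        self.name = name
        self.word2index = {}
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2

# each example script would then start with:
#   from utils import Lang, SOS_token, EOS_token
```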

import time
import math
import random

import dynet as dy

r = random.SystemRandom()
Review comment (Collaborator):
Can you fix the random seed (for reproducibility)? Please also fix the DyNet/PyTorch random seeds as well.
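A sketch of what fixing the seeds could look like. The seed value 42 is an arbitrary choice, and the framework-level mechanisms in the comments are assumptions that depend on the DyNet/PyTorch versions in use:

```python
import random

SEED = 42  # arbitrary fixed value; the point is reproducibility

# Python-level randomness: random.SystemRandom() cannot be seeded,
# so swap it for a seeded generator with the same interface.
r = random.Random(SEED)

# Framework-level seeds (sketch; exact mechanism is version-dependent):
#   DyNet:   pass --dynet-seed 42 on the command line
#   PyTorch: torch.manual_seed(SEED)
```

With this change, `r.choice(pairs)` and the sampled training examples become reproducible across runs.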


for _ in range(n):
    pair = r.choice(pairs)
    print('>', pair[0])
Review comment (Collaborator):
This is a very nitpicky comment, but can you use " or ' consistently everywhere for strings? A simple search-and-replace should fix it. I'm partial to ".

3 participants