Interested in your paper #832

Open
hero007feng opened this issue Nov 18, 2019 · 1 comment
Comments

@hero007feng

I'm interested in your paper, 'Input Combination Strategies for Multi-Source Transformer Decoder'. Would you mind telling me how I can reproduce this work? I would like to cite this paper. Thanks.

@toshohirasawa

+1. I'm also working on reproducing the multimodal model with the parallel attention strategy, but I have run into some difficulties. I hope to hear from the authors.

  • Handling WordPiece: how to extract and save the subtokens (using t2t?) and how to compose the INI files.
  • Feature extraction: I'm not sure how to write an INI file that uses the features from the resnet_v2_50/block4/unit_3/bottleneck_v2/conv3 sublayer of ResNet_v2_50. An INI file modeled on the INI files in the test examples gives an OOM error.

Here is my INI file:

;; Multimodal Transformer with a parallel attention

[main]
name="transformer"
tf_manager=<tf_manager>
output="examples/output/parallel"
overwrite_output_dir=True
batch_size=32
epochs=1000
train_dataset=<train_data>
val_dataset=<val_data>
trainer=<trainer>
runners=[<runner>]
evaluation=[("target", evaluators.BLEU), ("target_greedy", "target", evaluators.BLEU)]
logging_period=100
validation_period=1000
random_seed=1234

[tf_manager]
class=tf_manager.TensorFlowManager
num_sessions=1
num_threads=4

[image_reader]
class=readers.image_reader.imagenet_reader
prefix="/flickr30k-images"
target_width=224
target_height=224
zero_one_normalization=True

[train_data]
class=dataset.load
series=["source", "target", "images", "source_bpe", "target_bpe"]
data=["examples/data/translation/train.en", "examples/data/translation/train.de", ("examples/data/translation/train_images.txt", <image_reader>), (<wp_preprocess>, "source"), (<wp_preprocess>, "target")]

[val_data]
class=dataset.load
series=["source", "target", "images", "source_bpe", "target_bpe"]
data=["examples/data/translation/val.en", "examples/data/translation/val.de", ("examples/data/translation/val_images.txt", <image_reader>), (<wp_preprocess>, "source"), (<wp_preprocess>, "target")]

[wp_preprocess]
class=processors.wordpiece.WordpiecePreprocessor
vocabulary=<vocabulary>

[vocabulary]
class=vocabulary.from_wordlist
path="examples/data/translation/wordpieces.clean"
contains_header=False
contains_frequencies=False

[inpseq]
class=model.sequence.EmbeddedSequence
name="input"
embedding_size=256
max_length=50
data_id="source_bpe"
vocabulary=<vocabulary>

[encoder]
class=encoders.transformer.TransformerEncoder
name="text_encoder"
input_sequence=<inpseq>
ff_hidden_size=2048
depth=6
n_heads=8
dropout_keep_prob=0.7

[imagenet]
class=encoders.imagenet_encoder.ImageNet
name="imagenet_resnet"
data_id="images"
network_type="resnet_v2_50"
spatial_layer="resnet_v2_50/block4/unit_3/bottleneck_v2/conv3"
slim_models_path="lib/models/research/slim"

[decoder]
class=decoders.transformer.TransformerDecoder
name="decoder"
encoders=[<encoder>,<imagenet>]
dropout_keep_prob=0.5
data_id="target_bpe"
max_output_len=50
vocabulary=<vocabulary>
embedding_size=256
ff_hidden_size=2048
depth=6
n_heads_self=8
n_heads_enc=8
attention_combination_strategy="parallel"

[trainer]
class=trainers.delayed_update_trainer.DelayedUpdateTrainer
batches_per_update=5
l2_weight=1.0e-8
clip_norm=1.0
objectives=[<obj>]
optimizer=<lazyadam_g>

[obj]
class=trainers.cross_entropy_trainer.CostObjective
decoder=<decoder>

[lazyadam_g]
class=tf.contrib.opt.LazyAdamOptimizer
beta1=0.9
beta2=0.98
epsilon=1.0e-9
learning_rate=<decayed_lr>

[decayed_lr]
class=functions.noam_decay
learning_rate=0.2
model_dimension=6
warmup_steps=111

[runner]
class=runners.GreedyRunner
decoder=<decoder>
postprocess=processors.wordpiece.WordpiecePostprocessor
output_series="target_greedy"
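
When composing an INI file like the one above, one cheap sanity check is that every `<name>` reference points at a `[name]` section that actually exists. The following is a minimal sketch of such a check, not part of Neural Monkey; the function name `undefined_references` is hypothetical, and it assumes that references always have the form `<name>` and that no string value contains angle brackets.

```python
import re

# Section headers such as "[decoder]" on a line of their own.
SECTION_RE = re.compile(r"^\[([A-Za-z_]\w*)\]\s*$", re.MULTILINE)
# Object references such as "<decoder>" inside a value.
REFERENCE_RE = re.compile(r"<([A-Za-z_]\w*)>")


def undefined_references(ini_text):
    """Return the sorted names referenced as <name> without a [name] section."""
    sections = set(SECTION_RE.findall(ini_text))
    missing = set()
    for line in ini_text.splitlines():
        stripped = line.strip()
        # Skip ";" comments and section headers; only values hold references.
        if stripped.startswith(";") or stripped.startswith("["):
            continue
        for name in REFERENCE_RE.findall(stripped):
            if name not in sections:
                missing.add(name)
    return sorted(missing)


if __name__ == "__main__":
    sample = "[main]\ntrainer=<trainer>\nrunners=[<runner>]\n\n[trainer]\nclass=x\n"
    print(undefined_references(sample))
```

Running it on the full text of a config prints the names that still need a section, which narrows down configuration errors before any training starts.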

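To see what learning rates the `[decayed_lr]` values above would produce, here is a minimal sketch of the standard Noam schedule (linear warmup followed by inverse-square-root decay). It assumes that `functions.noam_decay` follows this standard formula, which is not confirmed in this thread; `noam_learning_rate` is a hypothetical helper, not a Neural Monkey function.

```python
def noam_learning_rate(step, learning_rate, model_dimension, warmup_steps):
    """Standard Noam schedule: linear warmup, then inverse-square-root decay.

    Assumption: this mirrors functions.noam_decay; check the library source.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    scale = model_dimension ** -0.5
    return learning_rate * scale * min(step ** -0.5, step * warmup_steps ** -1.5)


if __name__ == "__main__":
    # Compare model_dimension=6 (from the INI) with 256 (its embedding_size).
    for dim in (6, 256):
        for step in (1, 50, 111, 1000, 10000):
            lr = noam_learning_rate(step, 0.2, dim, 111)
            print(f"model_dimension={dim:3d} step={step:5d} lr={lr:.6f}")
```

Under this formula the rate peaks at `step == warmup_steps` and scales with `model_dimension ** -0.5`, so the two dimensions compared above give noticeably different peak rates.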
The wordlist file of subtokens looks like this:

<pad>
<s>
</s>
<unk>
.
a
in
ein
einem
,
...
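
For reference, a file in this layout (one subtoken per line, no header, no frequencies, matching `contains_header=False` and `contains_frequencies=False` in the INI) can be written with a few lines of Python. This is only a sketch: `write_wordlist` and `read_wordlist` are hypothetical helpers, and placing the four special tokens first simply copies the sample above; whether `vocabulary.from_wordlist` requires them in the file, or in that order, is an assumption.

```python
import tempfile
from pathlib import Path

# Special tokens in the order they appear in the sample wordlist.
SPECIAL_TOKENS = ["<pad>", "<s>", "</s>", "<unk>"]


def write_wordlist(subtokens, path):
    """Write special tokens, then unique non-empty subtokens, one per line."""
    seen = set(SPECIAL_TOKENS)
    lines = list(SPECIAL_TOKENS)
    for token in subtokens:
        if token and token not in seen:
            seen.add(token)
            lines.append(token)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def read_wordlist(path):
    """Read the file back as a list of subtokens, skipping empty lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.split("\n") if line]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "wordpieces.clean"
        write_wordlist([".", "a", "in", "ein", "einem", ","], target)
        print(read_wordlist(target))
```

The subtokens themselves would still have to come from a WordPiece or subword learner; this sketch only covers saving them in the format shown.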
