Audio Visual Scene-aware Dialog (AVSD) example for DSTC7

This is an implementation of naive fusion of audio and video features, targeted at the AVSD Challenge at DSTC7. Details of our scheme are in the following paper. While the results in the paper were obtained with a Chainer implementation of the system, the code provided here is in PyTorch (translated from Chainer).

Please cite the paper for the baseline system without multimodal attention:

    @inproceedings{alamri@DSTC7,
            title={Audio Visual Scene-aware dialog (AVSD) Track for Natural Language Generation in DSTC7},
            author={Huda Alamri and Chiori Hori and Tim K. Marks and Dhruv Batra and Devi Parikh},
            booktitle={DSTC7 at AAAI2019 Workshop},
            year={2018}
    }

Please cite the paper for attentional multimodal fusion: https://arxiv.org/abs/1806.08409

    @article{hori2018end,
            title={End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features},
            author={Hori, Chiori and Alamri, Huda and Wang, Jue and Wichern, Gordon and Hori, Takaaki and Cherian, Anoop and Marks, Tim K and Cartillier, Vincent and Lopes, Raphael Gontijo and Das, Abhishek and others},
            journal={arXiv preprint arXiv:1806.08409},
            year={2018}
    }

Required packages

- Python 2.7
- PyTorch 0.4.1
- numpy
- six
- Java 1.8.0 (for coco-evaluation tools)
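
Before launching the scripts, it can help to confirm that the Python-side dependencies are importable. The following is a minimal sketch, not part of this repository: the helper name `missing_packages` is hypothetical, and `torch`, `numpy`, and `six` are the import names of the packages listed above. It checks only that the modules can be imported; it does not verify versions and does not cover Java.

```python
import importlib


def missing_packages(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Raised when the module is not installed in this environment.
            missing.append(name)
    return missing
```

For example, `missing_packages(["torch", "numpy", "six"])` should return an empty list when all three packages are installed.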

DSTC7 AVSD track data links: Publicly available from April 1st, 2019

https://github.com/hudaAlamri/DSTC7-Audio-Visual-Scene-Aware-Dialog-AVSD-Challenge

How to run the code:

Obtain the dialog data files (.json) and store them in data/.
- train_set4DSTC7-AVSD.json (official training set)
- valid_set4DSTC7-AVSD.json (official validation set)
- test_set4DSTC7-AVSD.json (official test set; this file does not currently include the ground truth)
- test_set.json (prototype test set used for tentative evaluation)

Make the directory data/charades_features and extract the downloaded feature packages into it:
- i3d_rgb.tgz, i3d_flow.tgz, vggish.tgz (training and validation sets)
- i3d_rgb_testset.tgz, i3d_flow_testset.tgz, vggish_testset.tgz (official test set)

Run run_i3d.sh to train and test the network using the i3d_rgb and i3d_flow features. (run_i3d+vggish.sh is another example that additionally uses the vggish features.)

Model files and generated sentences are stored in the experiment folder under exp/, where result_test_set_b5_p1.0.json contains the generated sentences for the test set.

The evaluation result is written to result_test_set_b5_p1.0.eval in the same folder. Note that this is the prototype test set, not the official one.
- To generate sentences for the official test set, run the script as follows:
```
run_i3d.sh --stage 3 --test-set data/test_set4DSTC7-AVSD.json --fea-file "<FeaType>_testset/<ImageID>.npy"
```
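
The file layout and the `--fea-file` template described above can be sanity-checked with a short sketch like the one below. This is an illustration only: the helper names `missing_dialog_files` and `expand_feature_path` are hypothetical, and the literal substitution of `<FeaType>` and `<ImageID>` is an assumption about how the template is used, so check the scripts under code/ for the actual behaviour.

```python
import os

# Dialog file names taken from the steps above.
DIALOG_FILES = [
    "train_set4DSTC7-AVSD.json",
    "valid_set4DSTC7-AVSD.json",
    "test_set4DSTC7-AVSD.json",
    "test_set.json",
]


def missing_dialog_files(data_dir="data"):
    """Return the dialog files listed above that are absent from data_dir."""
    return [name for name in DIALOG_FILES
            if not os.path.isfile(os.path.join(data_dir, name))]


def expand_feature_path(template, fea_type, image_id):
    """Fill the <FeaType> and <ImageID> placeholders of a --fea-file template.

    Assumption: both placeholders are replaced literally by the given strings.
    """
    return template.replace("<FeaType>", fea_type).replace("<ImageID>", image_id)
```

With a made-up video ID such as `ABCDE`, `expand_feature_path("<FeaType>_testset/<ImageID>.npy", "i3d_rgb", "ABCDE")` yields `i3d_rgb_testset/ABCDE.npy`, and `missing_dialog_files()` lists any dialog files still to be placed in data/.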

Evaluation:

AVSD@DSTC7

https://drive.google.com/open?id=1nz9Pu9YIfuZHzowhASXERajRXqE6DBQx

Result:

The following results were obtained for the "prototype" test set by using run_i3d.sh.

| METRIC | RESULT |
|---|---|
| Bleu_1 | 0.273 |
| Bleu_2 | 0.173 |
| Bleu_3 | 0.118 |
| Bleu_4 | 0.084 |
| METEOR | 0.117 |
| ROUGE_L | 0.291 |
| CIDEr | 0.766 |
