What is Learned in Visually Grounded Neural Syntax Acquisition

This is the code repository for the paper: "What is Learned in Visually Grounded Neural Syntax Acquisition", Noriyuki Kojima, Hadar Averbuch-Elor, Alexander Rush and Yoav Artzi (ACL 2020, Short Paper).

About

paper | talk

Visual features are a promising signal for learning bootstrap textual models. However, blackbox learning models make it difficult to isolate the specific contribution of visual components. In this analysis, we consider the case study of the Visually Grounded Neural Syntax Learner (Shi et al., 2019), a recent approach for learning syntax from a visual training signal. By constructing simplified versions of the model, we isolate the core factors that yield the model’s strong performance. Contrary to what the model might be capable of learning, we find significantly less expressive versions produce similar predictions and perform just as well, or even better. We also find that a simple lexical signal of noun concreteness plays the main role in the model’s predictions as opposed to more complex syntactic reasoning.

Codebase

Contents

  1. Requirement: software
  2. Requirement: data
  3. Test trained models
  4. Train your own models

Requirement: software

Python Virtual Env Setup: All code is implemented in Python. We recommend using a virtual environment for installing the required Python packages.

VERT_ENV=vgnsl_analysis

# With virtualenv
pip install virtualenv
virtualenv $VERT_ENV
source $VERT_ENV/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

# With Anaconda virtual environment
conda update --all
conda create --name $VERT_ENV python=3.5
conda activate $VERT_ENV
pip install --upgrade pip
pip install -r requirements.txt
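
If the install succeeds, a quick sanity check (assuming PyTorch and NumPy are among the pinned packages in requirements.txt) is:

# optional sanity check: confirm the environment is active and core packages import
python --version
python -c "import numpy, torch; print(torch.__version__)"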

Requirement: data

Follow the instructions in https://github.com/ExplorerFreda/VGNSL (Data Preparation section) to download all of the MSCOCO data into the data/mscoco directory.
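
For orientation, a minimal sketch of preparing the target directory (the exact files produced by the download are documented in the upstream instructions, not here):

# run from the repository root; the upstream instructions populate this directory
mkdir -p data/mscoco
# after the download, the preprocessed caption and image-feature files should sit directly under data/mscoco
ls data/mscoco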

Test trained models

Please refer to outputs/README.md to download the trained models.

cd src
# calculate F1 score
python test.py --candidate path_to_checkpoint --splits test

# calculate F1 score and output prediction to a text file
python test.py --candidate path_to_checkpoint --splits test --record_trees
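
For example, with one of the downloaded checkpoints (the path and filename below are illustrative and may not match the released files exactly):

# hypothetical checkpoint path; substitute the file you actually downloaded
python test.py --candidate ../outputs/2-ws-mean/model_best.pth.tar --splits test --record_trees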

Evaluation of category-wise recalls

Please download the category annotations from the link and put them under data/mscoco.

# calculate F1 score and category-wise recalls
python test.py --candidate path_to_checkpoint --splits test --ctg_eval

Train your own models

#  train 1D embeddings with WS score function and Mean combine function
python train.py --log_step 20 --bottleneck_dim 1 --logger_name ../outputs/1-ws-mean --score_fn ws --combine_fn mean

#  train 2D embeddings with WS score function and Mean combine function (+HI)
python train.py --log_step 20 --bottleneck_dim 2 --logger_name ../outputs/2-ws-mean --score_fn ws --combine_fn mean --lambda_hi 20

#  train 2D embeddings with WS score function and Mean combine function (+HI+FastText)
python train.py --log_step 20 --bottleneck_dim 2 --logger_name ../outputs/hi-fasttext-2-ws-mean --score_fn ws --combine_fn mean --lambda_hi 20 --init_embeddings_key fasttext --init_embeddings_type partial-fixed

#  train 1D embeddings with Mean Hi score function and Max combine function (+HI+FastText-IN)
python train.py --log_step 20 --bottleneck_dim 1 --logger_name ../outputs/hi-fasttext-noimgnorm-1-meanhi-max --score_fn mean_hi --combine_fn max --lambda_hi 20 --init_embeddings_key fasttext --no_imgnorm
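
After training, the resulting checkpoint can be scored with test.py as shown above; the checkpoint filename below is an assumption (checkpoints are presumably written under the directory given by --logger_name), so adjust it to whatever file the trainer actually produces:

# evaluate a freshly trained model (checkpoint filename is illustrative)
python test.py --candidate ../outputs/1-ws-mean/model_best.pth.tar --splits test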

License

MIT

Citing

If you find this codebase and the trained models useful in your research, please consider citing the following paper:

@InProceedings{Kojima2020:vgnsl,
    title = "What is Learned in Visually Grounded Neural Syntax Acquisition",
    author = "Noriyuki Kojima and Hadar Averbuch-Elor and Alexander Rush and Yoav Artzi",
    booktitle = "Proceedings of the Annual Meeting of the Association for Computational Linguistics",
    month = "July",
    year = "2020",
    publisher = "Association for Computational Linguistics",
}

Acknowledgement

We would like to thank Freda for making the original VGNSL implementation public (the code in this repository is largely borrowed from it) and for responding promptly to our inquiries about Visually Grounded Neural Syntax Acquisition (Shi et al., ACL 2019).
