StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts

[News - 4 March 2024] Please use the latest version of the dataset from HuggingFace. You can download the dataset from here.
[News - 4 Feb 2024] We corrected some typos in Code/template.py, which is used for dataset generation, then regenerated the dataset and updated the Dataset folder. Please download the latest version of the dataset.

Description

This repository contains the dataset and code for the AAAI 2022 paper "StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts".

Our Contributions

A new benchmark for multi-hop spatial reasoning in texts.
A memory augmented neural network for multi-hop spatial reasoning.

Usage from HuggingFace Dataset

The StepGame dataset is available on HuggingFace. The following example loads it with the HuggingFace datasets library:

from datasets import load_dataset
dataset = load_dataset("michaelszx/StepGame")
print(dataset)

The output looks like this:

DatasetDict({
    train: Dataset({
        features: ['story', 'question', 'label', 'k_hop'],
        num_rows: 50000
    })
    validation: Dataset({
        features: ['story', 'question', 'label', 'k_hop'],
        num_rows: 5000
    })
    test: Dataset({
        features: ['story', 'question', 'label', 'k_hop'],
        num_rows: 100000
    })
})
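
Each split can be iterated as a sequence of dictionaries keyed by the feature names listed above. As a minimal sketch, the helper below counts how many examples fall under each `k_hop` value. The function name `count_by_hop` and the toy records are hypothetical and used only for illustration. The records are placeholders, not real StepGame samples, and the actual type of `k_hop` in the dataset (integer or string) should be checked before relying on it.

```python
from collections import Counter


def count_by_hop(records):
    """Count examples per k_hop value in an iterable of dict-like records."""
    return Counter(record["k_hop"] for record in records)


# Hypothetical placeholder records that only mimic the four feature names.
toy_records = [
    {"story": "placeholder story", "question": "placeholder question",
     "label": "placeholder label", "k_hop": 1},
    {"story": "placeholder story", "question": "placeholder question",
     "label": "placeholder label", "k_hop": 1},
    {"story": "placeholder story", "question": "placeholder question",
     "label": "placeholder label", "k_hop": 3},
]

hop_counts = count_by_hop(toy_records)
print(sorted(hop_counts.items()))
```

With the real dataset, the same helper could be called as `count_by_hop(dataset["test"])`, since iterating a split yields one dictionary per example.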

StepGame Dataset

Version

We offer two versions of the StepGame dataset. They have the same kind of content and differ only in the number of samples.

TrainVersion: The "TrainVersion" is used for the baseline models presented in our paper. The models are trained on clean samples with varying levels of noise (k = 1, 2, 3, 4, 5), where each value of k has 10,000 training samples and 1,000 validation samples, so the total number of training samples is 50,000. During testing, noise samples with k = 1 to 10 are used. All of these samples are in the "TrainVersion" folder.
CompleteVersion: The "CompleteVersion" is an extension of the "TrainVersion" that we have provided to enable future research on spatial reasoning over texts. This extended version includes more samples and contains both clean and noise data. In this version, each part (clean and noise) consists of 30,000, 1,000, and 30,000 samples for training, validation, and testing, respectively, for values of k ranging from 1 to 10. All of these samples are consolidated in the "CompleteVersion" folder.
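
The TrainVersion figures can be cross-checked against the split sizes reported by HuggingFace with a little arithmetic. The sketch below assumes that the HuggingFace splits correspond to the TrainVersion and that the test samples are spread evenly across k = 1 to 10. Neither assumption is stated in this README, and the variable names are illustrative only.

```python
# Per-k sample counts stated for the TrainVersion.
TRAIN_PER_K = 10_000
VALID_PER_K = 1_000

train_hops = range(1, 6)   # k = 1..5, used for training and validation
test_hops = range(1, 11)   # k = 1..10, used for testing

train_total = TRAIN_PER_K * len(train_hops)
valid_total = VALID_PER_K * len(train_hops)

# Assumption: the 100,000 test rows in the HuggingFace listing are split
# evenly across the ten values of k.
TEST_TOTAL = 100_000
test_per_k = TEST_TOTAL // len(test_hops)

print(train_total, valid_total, test_per_k)  # 50000 5000 10000
```

The training and validation totals match the `num_rows` values in the HuggingFace listing (50,000 and 5,000). The per-k test figure of 10,000 is only an inference under the even-split assumption.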

Generate more samples

You can produce additional samples by executing the following command:

cd Code
python parameterized_step_game_8relation.py --seed 123

Bugs or questions?

If you have any questions about the code or the paper, please contact Zhengxiang (zhengxiang.shi.19@ucl.ac.uk). If you run into problems using the code or want to report a bug, please open an issue. Describe the problem in as much detail as you can so that we can help quickly.

Citation

@inproceedings{stepGame2022shi,
title={StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts},
author={Shi, Zhengxiang and Zhang, Qiang and Lipani, Aldo},
volume={36},
url={https://ojs.aaai.org/index.php/AAAI/article/view/21383},
DOI={10.1609/aaai.v36i10.21383}, 
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2022},
month={Jun.},
pages={11321-11329}
}
