Chinese Extractive Question Answering

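This project performs extractive question answering in Chinese: given a question and a set of candidate paragraphs, it selects the relevant paragraph and extracts the answer span from it.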

Prerequisites

Python 3.10.12
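
As a quick sanity check before setup, the interpreter version can be compared against the one listed above. This is a minimal sketch: the helper name `python_version_ok` is hypothetical and not part of this repository, and it assumes that matching the major and minor version (3.10) is sufficient.

```python
import sys

# major.minor taken from the prerequisite above
# (assumption: the patch level, .12, is not critical)
REQUIRED = (3, 10)


def python_version_ok(version_info=None, required=REQUIRED):
    """Return True if the interpreter's major.minor matches `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    status = "OK" if python_version_ok() else "differs from the listed prerequisite"
    print(f"Python {sys.version.split()[0]}: {status}")
```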

Repository Structure

download.sh: Script to install Python dependencies and download necessary files.
requirements.txt: List of Python packages required for this project.
download.py: Python script to download necessary files from Google Drive.
run.sh: Bash script to run the inference code.
main.py: Python script containing the inference code.
train_src: Folder containing resources for training your own paragraph-selection and span-selection models.
Report.pdf: Explanation of the data processing and model selection.

Setup

Step 1: Clone the Repository

git clone https://github.com/your_username/Chinese-Extractive-QA.git
cd Chinese-Extractive-QA

Step 2: Install Dependencies

Run the download.sh script to install the Python packages listed in requirements.txt and download the necessary files.

./download.sh

Run Inference

To run the inference code, execute the run.sh script with the following arguments:

${1}: Path to context.json
${2}: Path to test.json
${3}: Path to the output prediction file named prediction.csv

./run.sh /path/to/context.json /path/to/test.json /path/to/pred/prediction.csv

Note: Replace /path/to/context.json, /path/to/test.json, and /path/to/pred/prediction.csv with the actual paths to your files. To run with the example paths used in this repo, use:

./run.sh ./ADL_HW1/datasets/context.json ./ADL_HW1/datasets/test.json ./prediction.csv
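
The invocation above can also be driven from Python, for example when running inference over several test files. The following is a sketch only: the function names `build_run_command` and `run_inference` are hypothetical, and it assumes run.sh sits in the current directory and takes exactly the three positional arguments listed above.

```python
import subprocess
from pathlib import Path


def build_run_command(context_path, test_path, output_path, script="./run.sh"):
    """Assemble the argument list for run.sh in the order ${1}, ${2}, ${3}."""
    return [script, str(context_path), str(test_path), str(output_path)]


def run_inference(context_path, test_path, output_path, script="./run.sh"):
    """Check that both inputs exist, create the output folder, then call run.sh."""
    for path in (context_path, test_path):
        if not Path(path).is_file():
            raise FileNotFoundError(f"Input file not found: {path}")
    # Create the folder for the prediction file if it does not exist yet.
    Path(output_path).parent.mkdir(parents=True, exist_ok=True)
    cmd = build_run_command(context_path, test_path, output_path, script)
    # check=True raises CalledProcessError if run.sh exits with a non-zero status.
    subprocess.run(cmd, check=True)
```

Calling `run_inference("./ADL_HW1/datasets/context.json", "./ADL_HW1/datasets/test.json", "./prediction.csv")` would then mirror the example command, failing early with a clear error if either input file is missing.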

Train Your Own Model

In the notebooks paragraph_selection.ipynb and span_selection.ipynb, you can fine-tune existing models or train from scratch. The code is adapted from the following sources:

Paragraph Selection: Transformers model on a multiple-choice dataset, like SWAG
Span Selection: Transformers model on a question-answering dataset, like SQuAD
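
Conceptually, the two notebooks correspond to two stages: choose the most relevant paragraph, then extract an answer span from it. The sketch below illustrates only that control flow on precomputed scores. It is not the repository's code: all function names are hypothetical, the scores stand in for model outputs, and it assumes one start score and one end score per character of each paragraph.

```python
def select_paragraph(paragraph_scores):
    """Stage 1: return the index of the highest-scoring candidate paragraph."""
    return max(range(len(paragraph_scores)), key=paragraph_scores.__getitem__)


def best_span(start_scores, end_scores, max_answer_len=30):
    """Stage 2: return (start, end) maximizing start_scores[start] + end_scores[end],
    subject to start <= end and a span length of at most max_answer_len."""
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


def extract_answer(paragraphs, paragraph_scores, start_scores, end_scores,
                   max_answer_len=30):
    """Run both stages and return the answer text (end index is inclusive)."""
    idx = select_paragraph(paragraph_scores)
    start, end = best_span(start_scores[idx], end_scores[idx], max_answer_len)
    return paragraphs[idx][start:end + 1]


# Toy example with hand-written scores (not real model outputs).
paragraphs = ["台北是台灣的首都", "東京是日本的首都"]
paragraph_scores = [0.2, 0.9]
start_scores = [[0] * 8, [5, 0, 0, 0, 0, 0, 0, 0]]
end_scores = [[0] * 8, [0, 4, 0, 0, 0, 0, 0, 0]]
answer = extract_answer(paragraphs, paragraph_scores, start_scores, end_scores)
print(answer)  # prints 東京
```

Constraining the span search to start <= end and a maximum length is a common way to rule out invalid or implausibly long answers. A real model typically scores tokens rather than characters, so token offsets would need to be mapped back to the text.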

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Hannibal0420/Chinese-Extractive-Question-Answering
