This project generates titles for Chinese news articles using the mT5-small model.
- Python 3.9
- requirements.txt: List of Python packages required for this project.
- download.sh: Script that downloads the training data and a pre-trained model.
- download.py: Script called by download.sh to handle the downloading process.
- run.sh: Bash script to run the inference code.
- inference.py: Python script that performs inference using the pre-trained model.
- train_src: Folder containing additional resources for training models on your own.
- report.pdf: Explanations of the hyperparameter sweep and generation strategies.
git clone https://github.com/Hannibal0420/Chinese-News-Summarization.git
cd Chinese-News-Summarization
Install the required Python packages and download the inference model using the following commands:
pip install -r requirements.txt
bash ./download.sh
To run the inference code, execute the run.sh script with the following arguments:
- ${1}: Path to input.jsonl
- ${2}: Path to output.jsonl
bash ./run.sh /path/to/input.jsonl /path/to/output.jsonl
Note: Make sure to replace /path/to/input.jsonl and /path/to/output.jsonl with the actual paths to your files. To try the example data in this repo, run the command below.
./run.sh ./data/public.jsonl ./output.jsonl
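For readers unfamiliar with the JSONL input/output convention, the following is a minimal sketch of how a script could take the two paths and write one title per input record. It is not the repo's inference.py: the field names `id`, `maintext`, and `title` are assumptions about the data layout, and `placeholder_title` is a hypothetical stand-in for the actual mT5-small generation step.

```python
import argparse
import json
from pathlib import Path
from typing import Callable, Iterable, Iterator, List, Optional


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one JSON object per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def write_titles(
    records: Iterable[dict],
    out_path: Path,
    generate_title: Callable[[str], str],
) -> int:
    """Write one {"id", "title"} object per record and return the record count.

    The "id" and "maintext" keys are assumed field names, not confirmed
    by the README.
    """
    count = 0
    with out_path.open("w", encoding="utf-8") as f:
        for rec in records:
            row = {"id": rec["id"], "title": generate_title(rec["maintext"])}
            # ensure_ascii=False keeps Chinese characters readable in the file.
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count


def placeholder_title(text: str) -> str:
    """Hypothetical stand-in for the model: returns the first 10 characters."""
    return text[:10]


def main(argv: Optional[List[str]] = None) -> int:
    """Parse the input and output paths, mirroring ${1} and ${2} of run.sh."""
    parser = argparse.ArgumentParser()
    parser.add_argument("input_path", type=Path)
    parser.add_argument("output_path", type=Path)
    args = parser.parse_args(argv)
    return write_titles(
        read_jsonl(args.input_path), args.output_path, placeholder_title
    )
```

Calling `main(["/path/to/input.jsonl", "/path/to/output.jsonl"])` processes a file under these assumptions. In a real pipeline, `placeholder_title` would be swapped for a function that tokenizes the text and decodes the model's output.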
In the train_src folder, you can fine-tune existing models or train from scratch, and track the process with the Weights & Biases toolkit. The code is modified from this source:
- Summarization: Transformers model on an extractive and abstractive summarization dataset, like CNN/DailyMail
This project is licensed under the MIT License - see the LICENSE.md file for details.