Apostrophe/Quote Prediction using Transformers

This repository implements Transformer and LSTM models for quotation-mark prediction. A trained model predicts the positions where a single or double quote should be inserted. The LSTM model is character-based and trained from scratch. In addition to the LSTM, BERT and T5 models are used. In the experiments, T5 appears to outperform the other models. See the report (report.pdf) for implementation details and results.
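
To make the task concrete, the sketch below shows one way a character-level example could be built: quotes are stripped from a sentence, and each position is labeled with the quote (none, single, or double) that should be restored in front of it. This framing, the label encoding, and the helper names `make_example` and `restore` are assumptions for illustration and are not taken from the repository; the report describes the actual formulation.

```python
# Hypothetical label encoding (not from the repository):
# 0 = no quote, 1 = single quote, 2 = double quote.
QUOTE_TO_LABEL = {"'": 1, '"': 2}
LABEL_TO_QUOTE = {1: "'", 2: '"'}


def make_example(text):
    """Strip quotes from text and return (stripped, labels).

    labels has len(stripped) + 1 entries: labels[i] is the quote to insert
    before stripped[i], and the final entry covers a quote at the very end.
    If several quotes are adjacent, only the last one is kept (a simplification).
    """
    chars, labels, pending = [], [], 0
    for ch in text:
        if ch in QUOTE_TO_LABEL:
            pending = QUOTE_TO_LABEL[ch]  # remember the quote for the next position
        else:
            chars.append(ch)
            labels.append(pending)
            pending = 0
    labels.append(pending)  # slot for a quote after the last character
    return "".join(chars), labels


def restore(stripped, labels):
    """Reinsert quotes into stripped text according to labels."""
    out = []
    for ch, label in zip(stripped, labels):
        if label:
            out.append(LABEL_TO_QUOTE[label])
        out.append(ch)
    if labels[-1]:
        out.append(LABEL_TO_QUOTE[labels[-1]])
    return "".join(out)


sentence = 'He said "don\'t"'
stripped, labels = make_example(sentence)
print(stripped)
print(restore(stripped, labels) == sentence)
```

Under this framing, a model receives the stripped text as input and is trained to predict the label sequence; `restore` then turns predictions back into quoted text.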

How to Run

Data Fetching/Preprocessing

python data.py --min-len 50 --max-len 500 --silicone --wiki --output-path ./dataset/

optional arguments:
  -h, --help            show this help message and exit
  --min-len MIN_LEN     Discard sentences with length less than this value
  --max-len MAX_LEN     Discard sentences with length greater than this value
  --silicone            Include informal datasets
  --wiki                Include formal dataset
  --output-path OUTPUT_PATH
                        Output path for gathered data
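
The length filter described by --min-len and --max-len can be sketched as follows. This is a minimal illustration, assuming length is measured in characters; the function name `filter_by_length` is hypothetical, and data.py may measure length differently (for example, in tokens).

```python
def filter_by_length(sentences, min_len, max_len):
    """Keep sentences whose length lies in [min_len, max_len].

    Sentences shorter than min_len or longer than max_len are discarded.
    Assumption: length is counted in characters; data.py may count differently.
    """
    return [s for s in sentences if min_len <= len(s) <= max_len]


samples = ["Too short.", "x" * 50, "y" * 500, "z" * 501]
kept = filter_by_length(samples, min_len=50, max_len=500)
print([len(s) for s in kept])  # [50, 500]
```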

Training/Evaluation

See 'config.py' for all configuration options; they should be self-explanatory. Example configurations for all three models are provided in the configs/ folder.

For example, to train a BERT model:

python main.py --config configs/bert.yml

tugrulhkarabulut/apostrophe-quote-prediction

Folders and files

Latest commit

History

Repository files navigation

Apostrophe/Quote Prediction using Transformers

How to Run

Data Fetching/Preprocessing

Training/Evaluation

About

Topics

Resources

License

Stars

Watchers

Forks

Languages