To use the OpenSubtitles dataset, first download the archive and unpack it in this folder. The program automatically looks in every subfolder here. Train with this dataset using `./main.py --corpus opensubs`.

Download the English corpus directly here: http://opus.lingfil.uu.se/download.php?f=OpenSubtitles/en.tar.gz

Full details on the corpus are available here: http://opus.lingfil.uu.se/OpenSubtitles.php
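
If you prefer to script the unpacking step, the following is a minimal sketch, assuming the archive has already been downloaded as `en.tar.gz` (the file name taken from the download link). The helper names `unpack_archive` and `list_corpus_files` are hypothetical and are not part of the program; the recursive walk only illustrates what "looking in every subfolder" means and is not the program's actual loading code.

```python
import os
import tarfile


def unpack_archive(archive_path, target_dir):
    """Extract a .tar.gz archive into target_dir (hypothetical helper)."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Only unpack archives from a source you trust.
        archive.extractall(target_dir)


def list_corpus_files(root_dir):
    """Return the sorted paths of every file under root_dir, at any depth."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

For example, calling `unpack_archive("en.tar.gz", ".")` from this folder and then `list_corpus_files(".")` shows the files that ended up in the subfolders. Note that the listing also includes any other file in the folder, such as this README.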

Although this has not been tested, the program should work with other languages as well. Just download the subtitles for the language you want from the OpenSubtitles database website.