opensubs

To use the OpenSubtitles dataset, you must first download the archive and unpack it in this folder. The program automatically searches every subfolder here. Train with this dataset using ./main.py --corpus opensubs.

Download the English corpus directly here: http://opus.lingfil.uu.se/download.php?f=OpenSubtitles/en.tar.gz
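
The download and unpack step can also be scripted. The following is a minimal sketch, not part of the program itself: the helper names (download_archive, unpack_archive, list_corpus_files) and the local file name en.tar.gz are assumptions made for illustration, and the URL is simply the one quoted above, which may have moved since this was written. The last helper only illustrates what "searching every subfolder" looks like; it is not how the program's own loader is known to work.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from this README; it may no longer be valid.
CORPUS_URL = "http://opus.lingfil.uu.se/download.php?f=OpenSubtitles/en.tar.gz"


def download_archive(url: str, archive_path: Path) -> Path:
    """Download the archive unless a local copy already exists (hypothetical helper)."""
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def unpack_archive(archive_path: Path, dest_dir: Path) -> None:
    """Extract a .tar.gz archive into dest_dir, creating the folder if needed."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)


def list_corpus_files(root: Path) -> list:
    """Return every regular file below root, walking all subfolders, in sorted order."""
    return sorted(p for p in root.rglob("*") if p.is_file())
```

Run from inside this folder, a call such as unpack_archive(download_archive(CORPUS_URL, Path("en.tar.gz")), Path(".")) would fetch and extract the corpus, after which list_corpus_files(Path(".")) shows which files ended up in the subfolders.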

Full details on the corpus are available here: http://opus.lingfil.uu.se/OpenSubtitles.php

Note that, although this has not been tested, the program should work with other languages as well. Just download the subtitles for the language you want from the OpenSubtitles database website.