knoxa edited this page Aug 3, 2017 · 8 revisions

Below is a list of freely available text corpora that may be useful for developing or testing Baleen. The list is not exhaustive, and Baleen has not been developed specifically to work with any of these corpora, so performance may vary.

A larger list of corpora, along with a list of other NLP-related tools, is available on Stanford University's website: http://www-nlp.stanford.edu/links/statnlp.html#Corpora