dainis-boumber/MLP-400-datasets


MLP-400-datasets

Contains various NLP datasets built from 400 papers by the 20 most-cited authors in machine learning.

MLPA-400 is a multiclass, multilabel authorship attribution problem. See its own README.md and Weka for details.

MLP-400AV is a new, flexible dataset that uses the same data for authorship verification. Because of its extensive API, it can be adapted to almost any NLP task.

The total size of the datasets is over 18 million characters for MLPA-400 and almost double that for MLPA-400AV.
