A parser for annotated MuseScore 3 files.
Updated May 23, 2024 · Python
The canonical resources to build the backend for a corpus/repository management framework for Crow, the Corpus and Repository of Writing
AutoCorpus is a tool backed by a large language model (LLM) for automatically generating corpus files for fuzzing.
Create a corpus for fine-tuning an OpenAI model
Katya, or The Liberated Corpus: a text corpus that allows you to request and scrape any web resource!
A full-text article retrieval pipeline for biomedical literature.
Information Retrieval Lab
Bitextor generates translation memories from multilingual websites
Natively log WeeChat channel and private messages, CTCP, and notices, in the driftwood standard. Written in Python.
Scrimshaw parses IRC logs stored in the driftwood format for quotes attributable to a given user. Written in Rust.
Generate pseudo-English sentences for research in semantic composition
A set of corpus-based sampling & analysis M4L devices
A prototype for generating language in a grounded simulation of a simple hunter-gatherer world
A clean Fusha Arabic tagged corpus.
A corpus of Ukrainian Twitter texts, plus instructions for downloading and filtering them.
Augmentation scripts for the bAbI Dialog Tasks dataset
A golden Arabic corpus built for testing Assem's arabicstemmer and other Arabic stemmers
A corpus builder for evaluation of plagiarism detection tools