Tool for translating a corpus file from one language to another.
Exploring and visualizing CoNLL-U files in Python
Analysing different text representations for genre identification. I parse CoNLL-U files and extract various representations of a text (running text, lemmas, part-of-speech tags), then train a fastText model on each to see which representation is most beneficial for the genre identification task.
Count bigram frequencies in a CoNLL-U format corpus
A tool for validating English CoNLL-U data files.
GitHub repository for Arc-Eager Transition-Based Parser
Small bilar packages
Repository for the paper "Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities"
A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic processing with the Stanza pipeline, machine translation, and word alignment with the Eflomal tool.
Toolkit that simplifies corpus processing
ACoLi CoNLL libraries: Several tools for processing, manipulating and transforming TSV formats (CoNLL-RDF, CoNLL-Merge, CQP4RDF)
A package for manipulating Universal Dependencies trees
Simple script to parse text with spaCy and print the output in CoNLL-U format.
A number of command-line tools for working with FoLiA (Format for Linguistic Annotation). Includes validators, converters, visualisers, and more.
End-to-end integration of HuggingFace's models for sequence labeling.
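For orientation, several of the projects above read CoNLL-U files and count or extract token-level features. The following is a minimal sketch of that idea using only the Python standard library; it is not taken from any of the listed repositories. The names `SAMPLE`, `read_sentences`, and `count_bigrams` are hypothetical, and the sample sentences are made up for illustration. It assumes the standard 10-column, tab-separated CoNLL-U layout with the word form in the second column.

```python
from collections import Counter

# Hypothetical two-sentence sample in 10-column CoNLL-U layout (tab-separated).
SAMPLE = """\
# text = The dog barks.
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_

# text = The dog sleeps.
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_
"""


def read_sentences(text):
    """Yield each sentence as a list of word forms (column 2)."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment, e.g. "# text = ..."
        cols = line.split("\t")
        token_id = cols[0]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        sentence.append(cols[1])
    if sentence:
        yield sentence


def count_bigrams(text):
    """Count adjacent word-form pairs, never crossing sentence boundaries."""
    counts = Counter()
    for forms in read_sentences(text):
        counts.update(zip(forms, forms[1:]))
    return counts


if __name__ == "__main__":
    for bigram, freq in count_bigrams(SAMPLE).most_common(3):
        print(bigram, freq)
```

Counting within sentences only is a design choice made here for simplicity; a real tool might also lowercase forms, count lemmas instead, or handle malformed lines.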