conllu

Here are 21 public repositories matching this topic...

MinionAttack / corpus-translator

Tool for translating a corpus file from one language to another.

nlp translation conllu huggingface

Updated Dec 8, 2022
Python

ArbelTepper / NLP-IAHLT_project

Exploring and visualizing CoNLL-U files in Python

conllu hebrew-nlp

Updated Dec 18, 2023
Jupyter Notebook

TajaKuzman / Text-Representations-in-FastText

Analysing different text representations for genre identification. I parse CoNLL-U files and extract various representations of a text (running text, lemmas, part-of-speech tags), then train a fastText model on each to see which representation is most beneficial for the genre identification task.

text-classification fasttext language-processing conllu genre-identification feature-analysis

Updated Aug 18, 2022
Jupyter Notebook

stefanrer / CountBigramFreqInConlluCorpus

Count bigram frequencies in a CoNLL-U format corpus

python frequency json dictionary python3 bigrams conllu unigrams tscore bigram-frequency unigram-frequency

Updated Dec 23, 2023
Python

rhdunn / conllu-en-validator

A tool for validating English CoNLL-U data files.

universal-dependencies conllu

Updated Dec 4, 2023
Python

Nahid01752 / Arc-eager-parser

GitHub repository for an arc-eager transition-based parser

python parsing numpy perceptron transition-based-parser conllu arc-eager

Updated Mar 23, 2023
Python

arthurdjn / udpos

Preprocessing and automatic downloading of Universal Dependencies datasets.

converter pytorch dataset txt conllu udpos

Updated Mar 15, 2020
Python

fergusq / bils

Small bilar packages

nlp json utilities telegram-bot irc-bot http-server bilar conllu

Updated Jun 17, 2018

MuhammadYaseenKhan / CoNLL-U_Parser

A small Python script that converts a .conllu file into a tab-separated values (TSV) file.

nlp tsv parser csv pandas python3 pos-tagging conll-u anaconda3 conllu

Updated Dec 4, 2019
Jupyter Notebook

SapienzaNLP / exploring-srl

Repository for the paper "Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities"

nlp data natural-language-processing acl dataset srl semantic-role-labeling conllu acl2023

Updated Sep 20, 2023

TajaKuzman / Parlamint-translation

A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic processing with the Stanza pipeline, machine translation, and word alignment with the Eflomal tool.

machine-translation word-alignment conllu dataset-preparation parlamint