Tools for converting Label Studio annotations into common dataset formats
Updated May 23, 2024 - Python
A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested Python dictionary.
Text Tonsorium - a toolbox that automatically arranges NLP tools into workflows and enacts them on the user's input
A simple iterator that reads CoNLL and CoNLL-U files (https://universaldependencies.org/format.html) without keeping them in memory. It can iterate over words, sentences, or documents.
Extended CoNLL Utilities for Shallow Parsing
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
Tool to download CoNLL Shared Task classification tables and compute metrics on them. It can also compute metrics on local results and display outliers for a parser.
The official tool for transforming doccano format into common dataset formats.
R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Contrastive learning for multilingual complex named entity recognition. BERT + CRF model.
Named entity recognition for clinical records.
Demo for the Named Entity Recognition task with transformers
A collection of Python scripts for generating random dependency trees.
SLI mappings for OMSTI dataset
SLI mappings for the Princeton Annotated Gloss Corpus dataset
SLI mappings for SemCor dataset
SLI mappings for Senseval and SemEval datasets
ACoLi CoNLL libraries: Several tools for processing, manipulating and transforming TSV formats (CoNLL-RDF, CoNLL-Merge, CQP4RDF)
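Several of the projects listed above read CoNLL-U data sentence by sentence. As a rough illustration of that idea, and not the code or API of any listed project, the sketch below streams sentences from any iterable of lines without loading the whole file. The names `FIELDS` and `iter_sentences` are hypothetical, and the sketch assumes tab-separated token lines with the ten standard CoNLL-U columns, comment lines starting with `#`, and blank lines between sentences.

```python
import io
from typing import Dict, Iterable, Iterator, List

# The ten column names of a CoNLL-U token line (lowercased here for dict keys).
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def iter_sentences(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dictionaries.

    Hypothetical helper: only the current sentence is held in memory.
    """
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped here.
            continue
        else:
            sentence.append(dict(zip(FIELDS, line.split("\t"))))
    # Emit a trailing sentence if the input lacks a final blank line.
    if sentence:
        yield sentence


# A tiny in-memory sample; an open file object would work the same way.
sample = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)
sentences = list(iter_sentences(io.StringIO(sample)))
```

All values are kept as strings, so fields such as `head` or `feats` would need further parsing before use.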