Utilities for working with CoNLL-U

danieldk/conllu-utils

CoNLL-U Utilities

Introduction

This is a set of utilities to process files in the CoNLL-U format. The conllu command provides the following subcommands:

  • accuracy: compute the accuracy of a system based on two treebanks
  • cleanup: normalize Unicode and replace Unicode punctuation
  • compare: compare two treebanks on one or more layers
  • from-text: convert tokenized text files to CoNLL-U
  • merge: merge CoNLL-U files
  • partition: partition a CoNLL-U file into N files
  • shuffle: shuffle the sentences in a CoNLL-U file
  • to-text: convert CoNLL-U to tokenized plain text
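
For readers unfamiliar with the format, the following Python sketch illustrates roughly what a conversion such as to-text involves: reading the FORM column of each token line and emitting one space-separated line per sentence. It is not the implementation used by these utilities. The function name conllu_to_text is hypothetical, and skipping multiword-token ranges and empty nodes is an assumption about the desired output, not documented behavior of the conllu command.

```python
def conllu_to_text(conllu: str) -> list[str]:
    """Return one space-joined line of word forms per sentence."""
    sentences, forms = [], []
    for line in conllu.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if forms:
                sentences.append(" ".join(forms))
                forms = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment or metadata
        columns = line.split("\t")
        token_id, form = columns[0], columns[1]
        # Assumption: skip multiword-token ranges (e.g. "1-2") and
        # empty nodes (e.g. "1.1"), keeping only syntactic words.
        if "-" in token_id or "." in token_id:
            continue
        forms.append(form)
    if forms:
        # Handle input that does not end with a blank line.
        sentences.append(" ".join(forms))
    return sentences


# A single hand-written sentence in CoNLL-U (ten tab-separated columns).
sample = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)
print(conllu_to_text(sample))
```

The sketch keeps tokens separated by single spaces regardless of SpaceAfter=No, since the target is tokenized text rather than the original surface string.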

Usage

Run any subcommand with --help as an argument to print its usage information.
