@LanguageMachines

Language Machines

NLP Research group at Centre for Language Studies, Radboud University Nijmegen

Popular repositories

  1. frog Public

    Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on TiMBL, the Tilburg memory-based learning software package.

    C++, 73 stars, 11 forks

  2. ucto Public

    Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use…

    C++, 60 stars, 13 forks

  3. PICCL Public

    A set of workflows for corpus building through OCR, post-correction and normalisation

    Python, 46 stars, 6 forks

  4. timbl Public

    TiMBL implements several memory-based learning algorithms.

    C++, 45 stars, 9 forks

  5. LuigiNLP Public

    A workflow system for Natural Language Processing.

    Python, 21 stars, 4 forks

  6. libfolia Public

    FoLiA library for C++

    C++, 14 stars, 7 forks

Repositories

Showing 10 of 53 repositories
  • ucto Public

    Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suitable for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules …

    C++, 60 stars, GPL-3.0, 13 forks, 13 open issues, 0 pull requests, updated Apr 27, 2024
  • libfolia Public

    FoLiA library for C++

    C++, 14 stars, GPL-3.0, 7 forks, 5 open issues, 0 pull requests, updated Apr 27, 2024
  • ticcutils Public

    Ticcutils, a generic utility library shared by our software.

    C++, 6 stars, GPL-3.0, 8 forks, 1 open issue, 0 pull requests, updated Apr 27, 2024
  • foliautils Public

    Command-line utilities for working with the Format for Linguistic Annotation (FoLiA), powered by libfolia (C++), written by Ko van der Sloot (CLST, Radboud University)

    C++, 4 stars, GPL-3.0, 3 forks, 12 open issues, 0 pull requests, updated Apr 26, 2024
  • frog Public

    Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on TiMBL, the Tilburg memory-based learning software package.

    C++, 73 stars, GPL-3.0, 11 forks, 13 open issues (1 issue needs help), 0 pull requests, updated Apr 26, 2024
  • uctodata Public

    Datafiles for the tokenizer ucto.

    Shell, 9 stars, GPL-3.0, 5 forks, 3 open issues, 0 pull requests, updated Apr 26, 2024
  • foliatest Public

    Test suite for libfolia

    C++, 0 stars, GPL-3.0, 1 fork, 0 open issues, 0 pull requests, updated Apr 26, 2024
  • frogtests Public

    Unit tests for Frog

    Lex, 0 stars, 0 forks, 1 open issue, 0 pull requests, updated Apr 25, 2024
  • ticcltools Public

    Tools for TICCL

    C++, 13 stars, GPL-3.0, 3 forks, 17 open issues, 0 pull requests, updated Apr 23, 2024
  • timblserver Public

    TiMBL implements several memory-based learning algorithms. This is the server part.

    C++, 3 stars, GPL-3.0, 0 forks, 0 open issues, 0 pull requests, updated Apr 18, 2024
