The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
Updated May 27, 2024 - Rust
Natural language detection library for .NET, suitable for long and short text alike
Click below to check out the website
GlotLID: Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
ISO 639 and IETF Language Code Lookup Tool
Python Binding for Rust WhatLang, a language detection library
🕷️ The pipeline for the OSCAR corpus
This curated collection brings together a dataset of common Swahili stopwords gathered from various sources on the internet. Stopwords are words that are frequently used in a language but typically don't contribute significant meaning to a text.
Code for the live site for LangSonic, a multilingual speech classification CNN.
Plain Swahili dataset, sourced from public repositories
Language classifier based on BERT to classify Aviation Safety Reporting System (ASRS) narratives into categories that were clustered by using language embedding similarity.
Language Identification Techniques and Analysis (5 encodings × 17 models)
Classifier that identifies Greek text as Cypriot Greek or Standard Modern Greek
Language identifier model: takes a sentence as input and predicts its language.
Performing various NLP tasks with different Python libraries
Streamlit web app