Assignment 2 for CS 11-731 Machine Translation course.
[ACL 2021, Findings] Cognate Prediction Per Machine Translation
A natural language processing and machine learning project for a low-resource language in Zambia.
A 16M LLM for POS tagging in African languages
Embedding Evaluation Data for South African Languages
Investigating transfer learning in low-resourced languages, specifically in a named entity recognition (NER) task (IJCNLP-AACL 2023). http://arxiv.org/abs/2309.05311
A Text-To-Speech android app to read out Gondi articles for the Gond tribal community
Italian hate speech detection using a transformer model.
A web application to test sentence-similarity models of the top 10 Indian Languages
IsiZulu News (articles and headlines) and Siswati News (headlines) Corpora - za-isizulu-siswati-news-2022
Scripts and files I used throughout my M.Sc. Voice Technology Thesis Project at Rijksuniversiteit Groningen - Campus Fryslân.
Finetuning BERT models on a powerset of different linguistic domains
Jopara (Guarani-dominant mixed with Spanish) sentiment analysis corpus
Fine-tune LLM for early Middle English lemmatization with data from LAEME.
GlotSparse: Building Corpora in Under-Resourced Languages
Repo associated with the forthcoming paper 'Instruct-global: aligning language models to follow instructions in low-resource languages'. Instruct-global automates the process of generating instruction datasets in low-resource languages (LRLs).
Automating healthcare QA in a noisy multilingual low-resource setting
QuantHaLL: Quantifying Hallucination in machine translation for Low-resource Languages
Example dataset and prompt design of Korean Offensive language Machine Generation (K-OMG), published at IJCNLP-AACL 2023.
A repository containing the details of a natural language inference dataset in Hindi