[EMNLP 2022] Discovering Language-neutral Sub-networks in Multilingual Language Models.
Align parallel sentences across 104 languages with the help of mBERT and LaBSE
Bengali Misogyny Identification with Deep Learning and LIME.
An observatory of anglicism usage in the Spanish press
Fine-tuned BERT, mBERT and XLM-RoBERTa for Abusive Comments Detection in Telugu, Code-Mixed Telugu and Telugu-English.
A Large-scale Multilingual Benchmark Dataset for Automated Translation of Bangla Regional Dialects to Bangla Language
Collection of scripts used to create SRL datasets for Galician and Spanish using a verbal indexing method, as well as fine-tuned BERT and XLM-R models for SRL on each language
GPT-3.5 Fine-Tuning
Multilingual hate speech detection (machine-learning classifier) for German, Italian and Spanish social media posts
This is a project proposal to implement Yan et al.'s (2020) mBERT-Unaligned for cross-lingual RDs with Japanese, German and Italian untranslatable terms
Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Slovenian Definition Extraction
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chap…
mBERT and XLM-R for encoding Scandinavian languages
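Sentence-encoding projects like the one above typically collapse mBERT's or XLM-R's token-level outputs into a single sentence vector by mean-pooling over real (non-padding) tokens. A minimal NumPy sketch of just that pooling step, assuming the contextual token embeddings have already been computed elsewhere (`mean_pool` is an illustrative name, not part of any library):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions.

    token_embeddings: (seq_len, hidden) array of contextual vectors
    attention_mask:   (seq_len,) sequence with 1 for real tokens, 0 for padding
    """
    mask = np.asarray(attention_mask, dtype=float)[:, None]   # (seq_len, 1)
    summed = (np.asarray(token_embeddings) * mask).sum(axis=0)
    return summed / mask.sum()                                # divide by token count

# Toy example: 3 positions with hidden size 2; the last position is padding.
emb = np.array([[1.0, 2.0],
                [3.0, 4.0],
                [9.0, 9.0]])   # padding row is excluded by the mask
mask = [1, 1, 0]
print(mean_pool(emb, mask))    # → [2. 3.]
```

Dividing by the mask sum rather than the sequence length is the important detail: it keeps padded batches from diluting the average.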
Zero-shot and Translation Experiments on XQuAD, MLQA and TyDiQA
Using hypotheses from historical linguistics, we found a way to improve the performance of multilingual transformers with a limited amount of data
HASOC2021, Subtask 2a (Code-Mix Challenge): contains baselines and a hierarchical approach that extracts the relevant context for classifying hostile tweets in English-Hindi code-mixed data collected from Twitter.
ICEBERT: Interlingual-Clusters Enhanced BERT. A BERT-like model trained on clusters of monolingual subwords.