Awesome datasets for Bangla language computing.
Nirmol is an open-source dataset and API for detecting Bangla slang words. Detect offensive/bad/slang words in Bangla/Bengali/Banglish sentences. A helpful API and dataset for developers and researchers.
Bangla news classification and generation
Zilla-64: A Bangla Handwritten Word Dataset Of 64 Districts Name of Bangladesh and Recognition Using Holistic Approach
Various Bangla datasets for sentiment analysis on Bangla text
The default autocorrect dictionary shipped with the Avro Bangla keyboard doesn't contain enough words, so this is my approach to enriching it. This file contains the correct spellings of commonly used Bangla words.
A collection of Bangla newspaper and blog crawlers. Can be used to mine Bangla text data for Natural Language Processing tasks.
Bangla dataset for Opinion Mining
A synthetic Bangla license plate dataset, generated with a mixture of deep learning and image processing. The labels are in Darknet YOLO format. [.txt, .data, .names]
Scrape 4000+ Bangla Song Lyrics
"WBSUBNdb_text: Bangla handwritten text document dataset" is a Bangla text dataset containing 1383 offline handwritten text documents contributed by 190 writers. The dataset is composed of both simple and compound characters.
Bangla Q&A dataset that contains questions, answers, and paragraphs to train your model
Implementation of the paper 'Towards Full page Offline Bangla Handwritten Text Recognition using Image-to-Sequence Architecture'. For details, please read the README section.
Handwritten Bangla Character Classification using ResNet-34 trained using BanglaLekha Dataset. System has been implemented in PyTorch. For details, see the README file.
In this project, we have built a database of Bangla Handwritten Letters which contains handwritten images of 84 Bangla letters (10 numerals, 11 vowels, 39 consonants, 24 compound letters). We also investigated some of the existing Bangla character recognition models and found that these models have lower accuracy when the database contains some …
Noise Identification, Noise reduction, and Sentiment Analysis on Bangla Noisy Texts
Bangla word-level and continuous sign language datasets
Bengali/Bangla Fake Review Detection Dataset
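One entry above, the synthetic license plate dataset, states that its labels are in Darknet YOLO format. As a minimal sketch of how such a `.txt` label line is commonly read, assuming the usual layout of `class_id x_center y_center width height` with coordinates normalized to the image size, the snippet below parses one line and converts it to pixel corners. The names `YoloBox`, `parse_yolo_line`, and `to_pixel_corners` are hypothetical helpers for illustration and are not taken from that repository.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class YoloBox:
    """One bounding box from a Darknet YOLO label file (normalized coordinates)."""
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_line(line: str) -> YoloBox:
    """Parse a single 'class_id x_center y_center width height' label line."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    return YoloBox(int(parts[0]), *(float(p) for p in parts[1:]))


def to_pixel_corners(box: YoloBox, img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert a normalized center/size box to (x_min, y_min, x_max, y_max) in pixels."""
    x_min = (box.x_center - box.width / 2) * img_w
    y_min = (box.y_center - box.height / 2) * img_h
    x_max = (box.x_center + box.width / 2) * img_w
    y_max = (box.y_center + box.height / 2) * img_h
    return (x_min, y_min, x_max, y_max)


if __name__ == "__main__":
    # Hypothetical label line, not taken from the dataset.
    box = parse_yolo_line("0 0.5 0.5 0.25 0.5")
    print(to_pixel_corners(box, img_w=100, img_h=200))
```

The class index in the first field refers to a line in the accompanying `.names` file, so a loader would typically read that file once and map `class_id` to a label string.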