Desktop app for automatically translating comics (BDs, manga, manhwa, fumetti and more) in a variety of formats (image, PDF, EPUB, CBR, CBZ, etc.) and in multiple languages.
Updated May 18, 2024 - Python
Text Processing & Segmentation Framework
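Text mining and preprocessing tools (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods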
Split text into chars, words, or sentences from the command line.
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Accelerated deep learning R&D
Automatic Manga Translator
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, the latter comprising 330 million English tweets).
Text segmentation into separate words using a simple unigram model and the Viterbi algorithm
Java implementation of UAX#29 text segmentation algorithm
Repo for the paper "Grounded Complex Task Segmentation for Conversational Assistants" presented at SIGDIAL 2023
"WBSUBNdb_text: Bangla handwritten text document dataset" is a Bangla text dataset containing 1383 offline handwritten text documents contributed by 190 writers. The dataset is composed of both simple and compound characters.
The work that was part of my Master's Thesis Project spring 2023
Tajik text segmentation algorithms
Fast SymSpell written in C++ and exposed to Python via pybind11
A collection of OCR (Optical Character Recognition) resources, including papers and datasets.
Perl wrapper for CppJieba (Chinese text segmentation)
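One of the entries above describes splitting text into words with a simple unigram model and the Viterbi algorithm. As a rough illustration of that general technique, and not the code of any listed repository, here is a minimal Python sketch. The function name `segment`, the toy `COUNTS` table, and the unknown-word penalty are all assumptions made for this example.

```python
import math


def segment(text, counts):
    """Return the most probable split of `text` into words under a unigram model.

    `counts` maps known words to their frequencies (a hypothetical toy table here).
    """
    total = sum(counts.values())
    max_len = max(len(w) for w in counts)

    def log_prob(word):
        if word in counts:
            return math.log(counts[word] / total)
        # Assumed penalty for unknown strings: gets harsher as they grow longer.
        return math.log(1.0 / (total * 10 ** len(word)))

    n = len(text)
    # best[i] is the best log-probability of any segmentation of text[:i];
    # back[i] is the start index of the last word in that segmentation.
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + log_prob(text[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Toy frequency table, invented for illustration only.
COUNTS = {"the": 50, "cat": 10, "sat": 8, "on": 30, "mat": 5, "them": 6, "at": 20}

print(segment("thecatsatonthemat", COUNTS))
```

With this toy table the sketch prefers "the mat" over "them at" because the product of the unigram probabilities is higher. A real segmenter would use counts from a large corpus and a better-motivated unknown-word model.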