Taiwanese Hokkien Transliterator and Tokeniser
-
Updated
May 24, 2024 - JavaScript
Taiwanese Hokkien Transliterator and Tokeniser
Taiwanese Hokkien Transliterator and Tokeniser
Frames iOS: making native card payments simple
This project predicts MBTI personality types from users' recent 50 posts using NLP and ML techniques.
Open morphology for Finnish
Frames Android: making native card payments simple
It is an end-to-end text summarizer application, which uses Meta's BART model and is fine-tuned on the Samsung dataset.
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript
TokenScript schema, specs and paper
A tiny utility that takes a string and decomoposes it to the letters of the Hungarian alphabet.
A Python 3 module that provides functions for splitting identifiers found in source code files.
A search engine is constructed to return customised recipes according to three sorting algorithms. Speed is improved by performing pre-processing and inverted index.
Built a complete search engine by creating an Inverted Index on the Wikipedia corpus ( of 2018 with size 72 GB). That gives you top search result related to given query words.
Add a description, image, and links to the tokenisation topic page so that developers can more easily learn about it.
To associate your repository with the tokenisation topic, visit your repo's landing page and select "manage topics."