Add Additional Filter #64

Open
pnewell opened this issue Mar 14, 2018 · 3 comments
Comments

@pnewell

pnewell commented Mar 14, 2018

I was wondering whether there is a way to add additional filters, specifically for non-visible ASCII characters (backspace, etc.).

@dzcpy
Owner

dzcpy commented Jul 15, 2018

I'm going to work on it. My plan is to provide an option that lets you supply your own Unicode -> Latin mappings. Something like:

transliterate('こんにちわ、世界!\u0008', { charmap: { 8 /* \u0008 backspace */: '⌫' } });

This is much better than using the replace option. The replace option uses a regex to match and replace across the entire source string: it splits the source string into multiple segments and then calls transliterate recursively on each segment. A mapping from Unicode code points to Latin strings is much more efficient.
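To illustrate why a code point lookup is cheap, here is a minimal sketch of a single-pass replacement. It is not the library's implementation: `applyCharmap` is a hypothetical helper, and the shape of the `charmap` object (numeric code points as keys) is assumed from the proposal above.

```javascript
// Hypothetical helper, not part of the transliteration library.
// Replaces every code point that has an entry in `charmap` in one pass.
function applyCharmap(source, charmap) {
  let out = '';
  // for...of iterates by code point, so surrogate pairs stay intact.
  for (const ch of source) {
    const code = ch.codePointAt(0);
    // Own-property check avoids picking up inherited keys by accident.
    const mapped = Object.prototype.hasOwnProperty.call(charmap, code);
    out += mapped ? charmap[code] : ch;
  }
  return out;
}

const charmap = { 8: '⌫' }; // 8 is U+0008 (backspace)
console.log(applyCharmap('abc\u0008', charmap)); // prints "abc⌫"
```

Each character costs one property lookup, with no regex matching, string splitting, or recursion.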

As a first step, it may only support a simple Unicode -> Latin character map. Later, in v2, I'd like to implement a plugin system so that languages that cannot be mapped one-to-one can implement their own logic, which would run only when the source string contains characters from that language. For example, Chinese, Japanese and Thai cannot be transliterated properly character by character; each word has to be analyzed as a whole. And since I'm not familiar with the transliteration rules of many languages, other users would be able to contribute their own.
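One way such a plugin system could look is sketched below. The thread does not specify the plugin API, so everything here is an assumption for illustration: `registerPlugin`, `runPlugins`, and the `name`/`test`/`transliterate` fields are hypothetical names, and the single word rule is a toy.

```javascript
// Hypothetical plugin registry; the real v2 API may look quite different.
const plugins = [];

function registerPlugin(plugin) {
  plugins.push(plugin);
}

function runPlugins(source) {
  let result = source;
  for (const plugin of plugins) {
    // Only invoke a plugin when the text contains characters it handles.
    if (plugin.test.test(result)) {
      result = plugin.transliterate(result);
    }
  }
  return result;
}

// Toy plugin: a word-level rule for a single Japanese greeting.
registerPlugin({
  name: 'toy-japanese',
  test: /[\u3040-\u309F]/, // Hiragana block
  transliterate: (text) => text.replace(/こんにちわ/g, 'konnichiwa'),
});

console.log(runPlugins('こんにちわ, world')); // prints "konnichiwa, world"
```

The `test` pattern keeps language-specific logic from running on strings that contain none of that language's characters.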

Ultimately, if I have time, I'd like to make it possible to transliterate from one language to another, e.g. from English to Russian or from Spanish to Japanese. That is a ton of work, though, and requires a lot of specialist knowledge.

What do you all think?

@dzcpy
Owner

dzcpy commented Jan 16, 2019

With v2 you can do it like this:

transliterate.setData({
  '\u0008': '⌫'
});
transliterate('こんにちわ、世界!\u0008');

@dzcpy
Owner

dzcpy commented Jan 16, 2019

I'll close it for now. If you have more questions, please comment.
