Would it be possible to add an example of stripping accents to the documentation? (This is commonly needed for search applications.)
As I understand it, the right way to do this is to determine if each character IS_LETTER and in one of these Unicode blocks: LATIN_1_SUPPLEMENT, LATIN_EXTENDED_ADDITIONAL, LATIN_EXTENDED_A, LATIN_EXTENDED_B. If it is, then decompose it, remove any NON_SPACING_MARKs, and recompose.
I haven't been able to figure out how to test whether a character is a non-spacing mark.
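For illustration, here is a rough sketch of the steps described above, written in Python with the standard `unicodedata` module rather than this library's API (which I could not work out). The helper names `is_non_spacing_mark`, `in_latin_block`, and `strip_accents` are hypothetical. The block ranges are hard-coded as an assumption, since Python has no built-in block lookup. A non-spacing mark is taken to be any character whose general category is `Mn`.

```python
import unicodedata

# Code point ranges for the four Unicode blocks named above
# (hard-coded here as an assumption; Python has no block lookup API).
LATIN_BLOCKS = (
    (0x0080, 0x00FF),  # Latin-1 Supplement
    (0x0100, 0x017F),  # Latin Extended-A
    (0x0180, 0x024F),  # Latin Extended-B
    (0x1E00, 0x1EFF),  # Latin Extended Additional
)


def is_non_spacing_mark(ch):
    """True if ch has Unicode general category Mn (Mark, nonspacing)."""
    return unicodedata.category(ch) == "Mn"


def in_latin_block(ch):
    """True if ch falls in one of the four Latin blocks listed above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in LATIN_BLOCKS)


def strip_accents(text):
    """Decompose accented Latin letters, drop Mn marks, then recompose."""
    out = []
    for ch in text:
        if ch.isalpha() and in_latin_block(ch):
            decomposed = unicodedata.normalize("NFD", ch)
            kept = "".join(c for c in decomposed if not is_non_spacing_mark(c))
            out.append(unicodedata.normalize("NFC", kept))
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, a precomposed "é" (U+00E9) becomes "e". Letters with no canonical decomposition, such as "ø" (U+00F8), pass through unchanged. Input that is already decomposed (a base letter followed by a separate combining mark) is also left alone, because neither code point is a letter in the listed blocks. Normalizing the whole string first would be one way to handle that case.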