I am trying to split Burmese Unicode text with stringr::str_split(), but it does not return the values I expect.
str_split("စမ်းသပ်မှု", "")[[1]]
it returns:
[1] "စ" "မ်" "း" "သ" "ပ်" "မှု"
If I use the built-in strsplit: strsplit("စမ်းသပ်မှု", "")[[1]], it returns a character-level split:
[1] "စ" "မ" "်" "း" "သ" "ပ" "်" "မ" "ှ" "ု"
I found that str_split() treats the empty string "" as a regex, but what stringr::str_split() returns is neither a character-level nor a syllable-level split. A syllable-level split would be:
[1] "စမ်း" "သပ်" "မှု"
So I don't think this is actually a feature like the one in Issue:88.
For further study, could someone point me to where this splitting behavior comes from, if possible? I found that other services such as Google also use this incorrect splitting. TIA.
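One way to narrow this down, as a minimal sketch rather than a confirmed diagnosis: the stringr documentation states that an empty pattern "" is equivalent to boundary("character"), so the split may be following character (grapheme-cluster) boundaries rather than code points or syllables. The variable names below are only illustrative.

```r
library(stringr)

x <- "စမ်းသပ်မှု"

# Empty pattern: documented by stringr as equivalent to boundary("character").
by_empty <- str_split(x, "")[[1]]

# Explicit character-boundary split, for comparison with the empty pattern.
by_boundary <- str_split(x, boundary("character"))[[1]]

# Code-point level view of the same string, independent of any splitter.
code_points <- utf8ToInt(x)

print(by_empty)
print(by_boundary)
print(identical(by_empty, by_boundary))
print(length(code_points))
```

If the two str_split() results are identical, the grouping comes from the character-boundary rules rather than from regex matching on "".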