_.size to return symbol count of strings. #2302

Open
jdalton opened this issue Sep 16, 2015 · 4 comments

@jdalton
Contributor

jdalton commented Sep 16, 2015

Continuing the discussion, at the moment:

_.size('💕')
// => 2

Since str.length is trivial I think it --may-- be handy if _.size returned the number of symbols in a string instead.

_.size('💕')
// => 1
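
To make the difference between the two results concrete, here is a minimal sketch in plain JavaScript (no lodash involved). It only shows why `length` reports 2 for this emoji while counting code points reports 1; it is not the proposed `_.size` implementation, and the variable names are illustrative.

```javascript
// U+1F495 (the two-hearts emoji used in the example) lies outside the
// Basic Multilingual Plane, so UTF-16 stores it as a surrogate pair.
const heart = '\u{1F495}';

// String.prototype.length counts UTF-16 code units: one surrogate pair is 2.
const codeUnits = heart.length;

// The built-in string iterator walks code points, so spreading yields 1 element.
const codePoints = [...heart].length;

console.log(codeUnits, codePoints); // 2 1
```
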
@michaelficarra
Collaborator

I'm not sure what you're looking to count with this function. I recently read about combining characters and character width, and a few examples are pretty convincing. It's impossible to count the number of symbols, summing character widths is pointless, number of code points depends on the unicode normalisation form, byte count is derived from length, other properties depend on the version of Unicode the browser is using, ...
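
The point about normalisation forms can be illustrated with a small sketch. It uses only the standard `String.prototype.normalize` method; `countCodePoints` is a hypothetical helper written for this example, not a lodash function.

```javascript
// The same visible character "é" in two encodings:
const composed = '\u00E9';    // single precomposed code point (NFC form)
const decomposed = 'e\u0301'; // "e" + U+0301 COMBINING ACUTE ACCENT (NFD form)

// Hypothetical helper: count code points via the string iterator.
const countCodePoints = (s) => [...s].length;

console.log(countCodePoints(composed));   // 1
console.log(countCodePoints(decomposed)); // 2

// Normalising changes the count without changing what the reader sees.
console.log(countCodePoints(decomposed.normalize('NFC'))); // 1
console.log(countCodePoints(composed.normalize('NFD')));   // 2
```
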

@jdalton
Contributor Author

jdalton commented Sep 18, 2015

> I'm not sure what you're looking to count with this function.

Based on my example, I imagine:

_.size(string) === _.toArray(string).length;

> It's impossible to count the number of symbols, summing character widths is pointless, number of code points depends on the unicode normalisation form, byte count is derived from length, other properties depend on the version of Unicode the browser is using, ...

Unless you want to truncate, or produce some slice/substring.
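
As a rough illustration of the truncation case, the sketch below compares slicing by code unit with slicing by code point. `truncateSymbols` is a hypothetical helper made up for this example; it handles surrogate pairs only, not combining marks or joined sequences.

```javascript
const text = 'I \u{1F495} JS'; // "I 💕 JS"

// slice() indexes by UTF-16 code unit, so cutting at 3 splits the
// surrogate pair and leaves a lone high surrogate at the end.
const byCodeUnit = text.slice(0, 3);
console.log(byCodeUnit.charCodeAt(2).toString(16)); // d83d

// Hypothetical helper: truncate by code point instead of by code unit.
function truncateSymbols(str, count) {
  return [...str].slice(0, count).join('');
}

const byCodePoint = truncateSymbols(text, 3);
console.log(byCodePoint === 'I \u{1F495}'); // true
```
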

@jdalton
Contributor Author

jdalton commented Sep 19, 2015

@michaelficarra On a related note, following up on your link, I added support for regional indicator symbols:

[..."🇺🇸"] // => ["🇺", "🇸"]
_.toArray("🇺🇸") // => ["🇺🇸"]
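
One way to get that pairing behaviour is sketched below. This is an assumption about the general approach, not lodash's actual implementation: `symbolPattern` and `splitSymbols` are hypothetical names, and the regex only handles regional indicator pairs, not joiners, variation selectors, or modifiers.

```javascript
// U+1F1FA and U+1F1F8 are the regional indicators for "U" and "S".
const flag = '\u{1F1FA}\u{1F1F8}';

// The string iterator splits the flag into its two code points.
const codePoints = [...flag];
console.log(codePoints.length); // 2

// Hypothetical pattern: match a pair of regional indicators first,
// otherwise fall back to any single code point (the `u` flag makes
// the character class operate on code points, not code units).
const symbolPattern = /[\u{1F1E6}-\u{1F1FF}]{2}|[\s\S]/gu;

function splitSymbols(str) {
  return str.match(symbolPattern) || [];
}

console.log(splitSymbols(flag).length);             // 1
console.log(splitSymbols('a' + flag + 'b').length); // 3
```
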

added support for zero-width-joiners

and added support for variation selector characters:

and added support for unicode modifiers to lodash methods:

@mkxml

mkxml commented Sep 23, 2015

👍
