Search by unicode #50

Open
grimen opened this issue Jan 11, 2016 · 15 comments

Comments

@grimen
Contributor

grimen commented Jan 11, 2016

I have found myself a few times entering a Unicode (hex) value into the search field and getting no results. It would be great if icons could be reverse-looked-up like that, especially when debugging.

I am not sure how Algolia Search handles field indexing, and I could not see anything in the code that made me believe I could create a pull request. Is it probably an Algolia Dashboard thing?

Related: This cross-browser, client-side full-text search engine is pretty impressive performance-wise, in case it would be an option: http://reyesr.github.io/fullproof/

@redox
Contributor

redox commented Jan 11, 2016

Algolia handles emojis out of the box, actually :) so if you search for 🔍, Algolia automatically expands it to "magnifying glass": http://glyphsearch.com/?query=%F0%9F%94%8D

We could definitely add an extra attribute with the hex value to every record, so that icons can be retrieved with a query like 1F50D (http://apps.timwhitlock.info/unicode/inspect?s=%F0%9F%94%8D).

@grimen would you like to try a PR?
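As a rough illustration of the extra attribute suggested above, the hex value can be derived from the glyph itself. This is only a sketch: `toHexCodePoints` and `withUnicodeAttribute` are hypothetical helper names, and the record shape is an assumption (only the `unicode` field name comes from this thread).

```javascript
// Hypothetical helper: return the hex code point(s) of a glyph, uppercased.
function toHexCodePoints(glyph) {
  // Array.from iterates by code point, so surrogate pairs (emoji) stay whole.
  return Array.from(glyph).map(function (ch) {
    return ch.codePointAt(0).toString(16).toUpperCase();
  });
}

// Hypothetical record shape: copy the record and add a "unicode" attribute.
function withUnicodeAttribute(record, glyph) {
  return Object.assign({}, record, {
    unicode: toHexCodePoints(glyph).join(" ")
  });
}

console.log(toHexCodePoints("🔍")); // [ '1F50D' ]
```

Joining multiple code points with a space is an arbitrary choice for the sketch; single-glyph icon fonts would only ever produce one value.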

@grimen
Contributor Author

grimen commented Jan 11, 2016

@redox The hex Unicode value is already in the uploaded search index batch.json records, so I would assume only an index on that field is missing?

@redox
Contributor

redox commented Jan 11, 2016

> @redox The hex Unicode value is already in the uploaded search index batch.json records, so I would assume only an index on that field is missing?

Oh yeah, so it's only about configuration :)

We just need to include unicode in the array of attributes to index (probably last in the array, so that it is considered less important than name and tags): https://github.com/thomaspark/glyphsearch/blob/gh-pages/Gruntfile.js#L49

index.setSettings({ 'attributesToIndex' : ["name", "tags", "unicode"], 'customRanking' : ["asc(name)"], 'queryType' : 'prefixAll' });

@thomaspark
Owner

I hadn't considered the use case of searching by unicode hex for debugging, but it seems easy enough to add. I would be happy to accept a PR on this.

@grimen
Contributor Author

grimen commented Jan 11, 2016

@thomaspark I could fix it, but how do I do it? From the Algolia docs it sounds like a dashboard thing. I cannot find anything in the code that says what is indexed and what is not, and the unicode hex is definitely already in the content, unescaped (which it should be).

@redox
Contributor

redox commented Jan 11, 2016

@grimen you just need to update the "attributesToIndex" array as in the snippet I pasted above. The settings are applied through the API (no need for the dashboard).

@grimen
Contributor Author

grimen commented Jan 11, 2016

@redox Oh sorry, I must have been blind today, because I noticed neither your comment nor the attributesToIndex line in the code. A simple fix, in other words.

@thomaspark
Owner

Copying my question from PR #65 to continue the discussion here:

@redox, is there a way to set fuzziness per attribute? I think it's preferable to have exact matching on hex searches.

@grimen
Contributor Author

grimen commented Jan 25, 2016

@thomaspark Good point, that would be useful.

@grimen
Contributor Author

grimen commented Feb 1, 2016

@redox Any input?

@redox
Contributor

redox commented Feb 16, 2016

Super late on this; replied here

@thomaspark
Owner

Thanks @redox! This is exactly what I was hoping for.

@thomaspark
Owner

@redox, I ran into a couple of issues:

1. When I change the index settings programmatically, the change isn't reflected in the web dashboard. In fact, it seems to reset whatever I had manually set in the dashboard. Any idea why this might be the case? I'm using v2 of the algoliasearch library.

2. I set disableTypoToleranceOnAttributes for unicode, and also tried setting allowTyposOnNumericTokens to false. While the matched set is different, it still includes fuzzy matches (e.g., a search for "f030" also matches "f00a", "f001", "f03a", etc.).

I was able to temporarily bypass these issues by setting minWordSizefor1Typo to 5 in the dashboard.
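Separately from the index settings, the fuzzy hex matches could also be trimmed on the client after results come back. The following is only a sketch under stated assumptions, not something the project does: `looksLikeHex` and `filterHexHits` are hypothetical names, each hit is assumed to carry a `unicode` string, and treating any 3 to 6 hex digits as a code point lookup is a heuristic (words such as "face" or "bed" would also qualify and lose their name matches).

```javascript
// Heuristic (assumption): a query of 3 to 6 hex digits is a code point lookup.
function looksLikeHex(query) {
  return /^[0-9a-f]{3,6}$/i.test(query.trim());
}

// For hex-looking queries, keep only hits whose "unicode" value starts with
// the query, ignoring case. Other queries pass through unchanged, so name and
// tag searches keep their typo tolerance.
function filterHexHits(query, hits) {
  if (!looksLikeHex(query)) {
    return hits;
  }
  var q = query.trim().toLowerCase();
  return hits.filter(function (hit) {
    return String(hit.unicode || "").toLowerCase().indexOf(q) === 0;
  });
}

// The fuzzy result set described above, reduced to the relevant field.
var sampleHits = [
  { unicode: "f030" },
  { unicode: "f00a" },
  { unicode: "f001" },
  { unicode: "f03a" }
];

console.log(filterHexHits("f030", sampleHits)); // [ { unicode: 'f030' } ]
```

Prefix matching is used rather than strict equality so that a partially typed value such as "f03" still narrows the list as the user types.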

Any help would be appreciated!

@redox
Contributor

redox commented Feb 25, 2016

That's weird @thomaspark :/ And right now the index is totally unconfigured. Which branch are you working in? Can I see the associated code? The one here looks fine.

@thomaspark
Owner

It's pretty much unchanged from that, except for one line:

'disableTypoToleranceOnAttributes': ["unicode"],
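For reference, combining that line with the setSettings call quoted earlier in the thread would give roughly the following. This is a sketch assembled from the snippets in this discussion, not a copy of the actual Gruntfile; `applySettings` is a hypothetical wrapper, and `index` is assumed to be an Algolia index object exposing `setSettings`.

```javascript
// Settings assembled from the snippets quoted in this thread.
var settings = {
  // "unicode" comes last so it ranks below name and tags.
  attributesToIndex: ["name", "tags", "unicode"],
  customRanking: ["asc(name)"],
  queryType: "prefixAll",
  // The one added line: no typo tolerance on the hex attribute.
  disableTypoToleranceOnAttributes: ["unicode"]
};

// Hypothetical wrapper: `index` is assumed to expose setSettings(settings).
function applySettings(index) {
  return index.setSettings(settings);
}
```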
