ElasticSearch needs configuration settings for analyzers #621
Comments
+1 |
You are correct that the Elasticsearch settings are hard-coded into the backend. I was able to modify these settings by subclassing the Elasticsearch backend and modifying |
I modify haystack/backends/elasticsearch_backend.py directly to achieve the same thing. It's not only DEFAULT_SETTINGS that needs to be modified: the backend is hardcoded to use the snowball analyzer for anything that's not an NgramField or EdgeNgramField, and there's no setting for the default analyzer either. I guess an ideal solution would be: 1 - Allow configuration for the settings |
Right, I ran into that issue too. In your subclass, you need to override build_schema:

    def build_schema(self, fields):
        # Convert all "snowball" analyzers into the DEFAULT_ANALYZER analyzer set in __init__
        content_field_name, mapping = \
            super(ElasticsearchSearchBackendSubclass, self).build_schema(fields)
        for field_name, field_mapping in mapping.items():
            if "analyzer" in field_mapping and field_mapping["analyzer"] == "snowball":
                field_mapping["analyzer"] = self.DEFAULT_ANALYZER
        return content_field_name, mapping

Of course, a real solution with getters and setters would be preferable. |
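To check the analyzer swap without a running Haystack install, the loop in build_schema can be pulled out into a plain function and run against a hand-written mapping. This is only a sketch: `swap_analyzer` and `sample_mapping` are hypothetical names, and the sample field definitions are an assumption about what a generated mapping roughly looks like, not output copied from Haystack.

```python
import copy


def swap_analyzer(mapping, old="snowball", new="standard"):
    """Return a copy of `mapping` with every `old` analyzer replaced by `new`."""
    result = copy.deepcopy(mapping)  # leave the caller's mapping untouched
    for field_mapping in result.values():
        # Fields without an "analyzer" key (dates, numbers) are skipped.
        if field_mapping.get("analyzer") == old:
            field_mapping["analyzer"] = new
    return result


# Hypothetical mapping, shaped like a field-name -> field-options dict.
sample_mapping = {
    "text": {"type": "string", "analyzer": "snowball"},
    "name_auto": {"type": "string", "analyzer": "edgengram_analyzer"},
    "pub_date": {"type": "date"},
}

updated = swap_analyzer(sample_mapping, new="english")
```

Only fields that currently use the snowball analyzer are changed; the ngram and edge-ngram fields keep their own analyzers, which matches the behaviour of the override above.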
👍 on ability to override backend settings for analyzers without monkey-patching |
The default EdgeNgram analyzer tokenizer is "lowercase", which seems to throw out digits, so I'm screwed if I want to search zip codes or employee numbers with partial strings.
Setting the tokenizer to "edgeNGram" and filter to ["edgeNGram", "lowercase"] allows me to partially search both letters and numbers, but these settings are in haystack.backends.elasticsearch_backend. Shouldn't this be a haystack configuration setting, or am I doing something wrong?
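As a rough illustration of the change described here, the sketch below applies the tokenizer and filter values from this report to a settings dictionary. The shape of `ASSUMED_DEFAULT_SETTINGS` and the `edgengram_analyzer` / `haystack_edgengram` names are assumptions about how the backend lays out its hard-coded settings, so check them against your installed haystack/backends/elasticsearch_backend.py. `with_digit_friendly_edgengram` is a hypothetical helper. The "lowercase" tokenizer in Elasticsearch splits on non-letter characters, which would explain why digits disappear.

```python
import copy

# Assumed (not verified) layout of the backend's hard-coded settings.
ASSUMED_DEFAULT_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",  # drops digits
                    "filter": ["haystack_edgengram"],
                },
            },
        },
    },
}


def with_digit_friendly_edgengram(settings):
    """Return a copy of `settings` using the tokenizer/filter from this report."""
    new_settings = copy.deepcopy(settings)
    analyzer = new_settings["settings"]["analysis"]["analyzer"]["edgengram_analyzer"]
    analyzer["tokenizer"] = "edgeNGram"
    analyzer["filter"] = ["edgeNGram", "lowercase"]
    return new_settings


custom_settings = with_digit_friendly_edgengram(ASSUMED_DEFAULT_SETTINGS)
```

A backend subclass could then assign a dictionary like `custom_settings` in place of the module-level default, but the existing index has to be rebuilt before new analysis settings take effect.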