ElasticSearch needs configuration settings for analyzers #621

SacNaturalFoods · 2012-08-01T16:13:13Z

The default EdgeNgram analyzer's tokenizer is "lowercase", which seems to throw out digits, so I'm screwed if I want to search zip codes or employee numbers with partial strings.

Setting the tokenizer to "edgeNGram" and filter to ["edgeNGram", "lowercase"] allows me to partially search both letters and numbers, but these settings are in haystack.backends.elasticsearch_backend. Shouldn't this be a haystack configuration setting, or am I doing something wrong?
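For illustration, the change described above might look roughly like the sketch below. `HARDCODED_SETTINGS` is an abbreviated stand-in for the dict in haystack.backends.elasticsearch_backend (its exact shape is an assumption and may differ between versions), and `with_edgengram_override` is a hypothetical helper, not part of haystack.

```python
import copy

# Abbreviated stand-in for the backend's hard-coded analysis settings.
# Assumed shape; check the dict in your haystack version.
HARDCODED_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_edgengram"],
                },
            },
        },
    },
}


def with_edgengram_override(settings, tokenizer, filters):
    """Return a copy of `settings` with the EdgeNgram analyzer's
    tokenizer and filter list replaced (hypothetical helper)."""
    patched = copy.deepcopy(settings)
    analyzer = patched["settings"]["analysis"]["analyzer"]["edgengram_analyzer"]
    analyzer["tokenizer"] = tokenizer
    analyzer["filter"] = list(filters)
    return patched


# The values reported in this comment as working for letters and digits.
patched = with_edgengram_override(
    HARDCODED_SETTINGS,
    tokenizer="edgeNGram",
    filters=["edgeNGram", "lowercase"],
)
```

Working on a deep copy leaves the original dict untouched, which makes it easy to compare the two when debugging an index mapping.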

racedo · 2012-11-26T17:15:49Z

+1

speedplane · 2012-11-28T16:36:34Z

You are correct that the Elasticsearch settings are hard-coded into the backend. I was able to modify these settings by subclassing the Elasticsearch backend and modifying DEFAULT_SETTINGS in __init__. It would be nice if there were a getter/setter for the settings, but the subclassing solution works just fine. Note that you'll also have to modify settings.py to point to your new backend.
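A minimal sketch of that subclassing approach, written so it runs without haystack installed: `StandInBackend` is a placeholder for haystack's ElasticsearchSearchBackend, and the `SETTINGS` connection key, the class names, and the `myapp.search_backends` path are all assumptions made for illustration. With real haystack, ENGINE has to name an engine class whose `backend` attribute is the subclass.

```python
import copy


class StandInBackend:
    """Placeholder for haystack's ElasticsearchSearchBackend so the
    sketch runs on its own; subclass the real backend in practice."""

    DEFAULT_SETTINGS = {"settings": {"analysis": {"analyzer": {}}}}

    def __init__(self, connection_alias, **connection_options):
        self.connection_alias = connection_alias


class ConfigurableBackend(StandInBackend):
    """Swaps the hard-coded settings for ones from the connection options."""

    def __init__(self, connection_alias, **connection_options):
        super().__init__(connection_alias, **connection_options)
        # "SETTINGS" is a hypothetical key, not one haystack defines.
        custom = connection_options.get("SETTINGS")
        if custom is not None:
            # Assign on the instance so the class-level default is untouched.
            self.DEFAULT_SETTINGS = copy.deepcopy(custom)


# What settings.py might contain (path and SETTINGS key are hypothetical).
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "myapp.search_backends.ConfigurableEngine",
        "URL": "http://127.0.0.1:9200/",
        "INDEX_NAME": "haystack",
        "SETTINGS": {
            "settings": {
                "analysis": {
                    "analyzer": {
                        "edgengram_analyzer": {
                            "type": "custom",
                            "tokenizer": "edgeNGram",
                            "filter": ["edgeNGram", "lowercase"],
                        },
                    },
                },
            },
        },
    },
}

# Simulate the backend being built from the connection entry.
options = {k: v for k, v in HAYSTACK_CONNECTIONS["default"].items() if k != "ENGINE"}
backend = ConfigurableBackend("default", **options)
```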

racedo · 2012-11-29T00:36:23Z

I modify haystack/backends/elasticsearch_backend.py directly to achieve the same. It's not only DEFAULT_SETTINGS that needs to be modified: the backend is hardcoded to use the snowball analyzer for anything that's not an NgramField or EdgeNgramField, and there's no setting for the default analyzer either. I guess an ideal solution would be:

1 - Allow configuration for the settings
2 - Allow configuration for the mapping
3 - Allow configuration for the default analyzer

speedplane · 2012-11-29T02:53:43Z

Right, I ran into that issue too. In your subclass, you need to override the build_schema method:

    def build_schema(self, fields):
        # Convert all "snowball" analyzers into the DEFAULT_ANALYZER analyzer set in __init__
        content_field_name, mapping = \
            super(ElasticsearchSearchBackendSubclass, self).build_schema(fields)
        for field_name, field_mapping in mapping.items():
            if "analyzer" in field_mapping and field_mapping["analyzer"] == "snowball":
                field_mapping["analyzer"] = self.DEFAULT_ANALYZER
        return content_field_name, mapping

Of course, a real solution with getters and setters would be preferable.
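To see what that override does to a schema, here is a standalone sketch of the same rewrite applied to a sample mapping. `replace_analyzer` and the sample field names are hypothetical, and the mapping shape is only an assumption about what build_schema returns; "standard" is Elasticsearch's built-in standard analyzer, used here as an example replacement.

```python
def replace_analyzer(mapping, old, new):
    """Return a copy of `mapping` in which every field that uses
    analyzer `old` uses `new` instead (hypothetical helper)."""
    updated = {}
    for field_name, field_mapping in mapping.items():
        field_mapping = dict(field_mapping)  # copy so the input is not mutated
        if field_mapping.get("analyzer") == old:
            field_mapping["analyzer"] = new
        updated[field_name] = field_mapping
    return updated


# Sample mapping (assumed shape, hypothetical field names).
sample_mapping = {
    "text": {"type": "string", "analyzer": "snowball"},
    "zip_code": {"type": "string", "analyzer": "edgengram_analyzer"},
    "pub_date": {"type": "date"},
}

result = replace_analyzer(sample_mapping, "snowball", "standard")
```

Only fields that currently use the snowball analyzer change; NgramField and EdgeNgramField mappings, and fields with no analyzer at all, pass through as they were.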


bitcity · 2015-01-19T07:26:39Z

👍 on ability to override backend settings for analyzers without monkey-patching *_backend.py files.

speedplane mentioned this issue Dec 12, 2012

how to set analyzer to haystack with elasticsearch #639

Closed

saippuakauppias pushed a commit to ForkLab/django-haystack that referenced this issue Jul 12, 2013

add support for configurate field mapping for ElasticSearch backend (…

0a44460

…refs django-haystack#621, django-haystack#639, django-haystack#822)

bitcity mentioned this issue Oct 7, 2014

filter on EdgeNgramField with numbers - unexpected results #1082

Open
