ElasticSearch needs configuration settings for analyzers #621

Open
SacNaturalFoods opened this issue Aug 1, 2012 · 5 comments

Comments

@SacNaturalFoods

The default tokenizer for the EdgeNgram analyzer is "lowercase", which splits on non-letter characters and so throws out digits. That leaves me stuck if I want to search zip codes or employee numbers by partial string.
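
To illustrate the behaviour described above, here is a rough pure-Python approximation (not Elasticsearch code) of a tokenizer that splits at non-letters and lowercases, next to a simple edge n-gram generator. The function names are made up for this sketch and the gram sizes are only illustrative.

```python
import re


def lowercase_tokenize(text):
    """Approximate a letters-only tokenizer: split at non-letters, then lowercase."""
    # [^\W\d_]+ matches runs of letters only, so digits act as separators.
    return [tok.lower() for tok in re.findall(r"[^\W\d_]+", text)]


def edge_ngrams(term, min_gram=2, max_gram=15):
    """Return the prefixes of `term` with length between min_gram and max_gram."""
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


print(lowercase_tokenize("Zip 95814"))  # the digits never become a token
print(edge_ngrams("95814"))             # prefixes a partial search could match
```

With a letters-only tokenizer in front, a value such as a zip code is gone before any edge n-gram filter sees it, which would explain why partial numeric searches return nothing.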

Setting the tokenizer to "edgeNGram" and the filter to ["edgeNGram", "lowercase"] lets me partially match both letters and digits, but these settings are hard-coded in haystack.backends.elasticsearch_backend. Shouldn't this be a Haystack configuration setting, or am I doing something wrong?
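
For reference, the change described above might look roughly like this as a settings dictionary. This is only a sketch: the tokenizer and filter values are copied from the comment, while the analyzer key `edgengram_analyzer` and the constant name are assumptions, and the comment does not say whether the names refer to built-ins or to custom definitions with their own `min_gram`/`max_gram`.

```python
# Hypothetical analysis settings mirroring the values quoted above.
# The analyzer key ("edgengram_analyzer") is an assumption; use whatever
# name the backend's field mapping actually references.
EDGENGRAM_ANALYSIS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "edgeNGram",
                    "filter": ["edgeNGram", "lowercase"],
                }
            }
        }
    }
}
```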

@racedo

racedo commented Nov 26, 2012

+1

@speedplane

You are correct that the Elasticsearch settings are hard-coded into the backend. I was able to change them by subclassing the Elasticsearch backend and modifying DEFAULT_SETTINGS in __init__. It would be nice if there were a getter/setter for the settings, but the subclassing solution works just fine. Note that you'll also have to modify settings.py to point to your new backend.
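
A minimal sketch of that subclassing approach, written as a mixin so it can be read without Haystack installed. The class name and the `ANALYZER_OVERRIDES` attribute are hypothetical, and it assumes the backend reads `self.DEFAULT_SETTINGS` when it creates the index and nests analyzers under `settings` / `analysis` / `analyzer`.

```python
import copy


class AnalyzerSettingsMixin(object):
    """Hypothetical mixin; list it before the Elasticsearch backend class."""

    # Analyzer name -> definition; subclasses fill this in.
    ANALYZER_OVERRIDES = {}

    def __init__(self, connection_alias, **connection_options):
        super(AnalyzerSettingsMixin, self).__init__(
            connection_alias, **connection_options)
        # Copy first so the class-level defaults shared by other
        # connections are left untouched.
        settings = copy.deepcopy(self.DEFAULT_SETTINGS)
        analyzers = (settings.setdefault("settings", {})
                             .setdefault("analysis", {})
                             .setdefault("analyzer", {}))
        analyzers.update(copy.deepcopy(self.ANALYZER_OVERRIDES))
        # The instance attribute shadows the class attribute from here on.
        self.DEFAULT_SETTINGS = settings
```

With Haystack this would presumably be combined as `class MyBackend(AnalyzerSettingsMixin, ElasticsearchSearchBackend)`, paired with an `ElasticsearchSearchEngine` subclass whose `backend` attribute points at it, and with `ENGINE` in `HAYSTACK_CONNECTIONS` set to that engine's dotted path. `MyBackend` is a placeholder name.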

@racedo

racedo commented Nov 29, 2012

I edit haystack/backends/elasticsearch_backend.py directly to achieve the same thing. It's not only DEFAULT_SETTINGS that needs to change: the backend is hard-coded to use the snowball analyzer for anything that's not an NgramField or EdgeNgramField, and there's no setting for the default analyzer either. I guess an ideal solution would be:

1 - Allow configuration for the settings
2 - Allow configuration for the mapping
3 - Allow configuration for the default analyzer
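
As a rough illustration of items 1 and 3, the sketch below merges user overrides into the backend defaults and picks up a default analyzer from the connection options. The `SETTINGS` and `DEFAULT_ANALYZER` keys and both function names are hypothetical, not existing Haystack options. Mapping overrides (item 2) are left out.

```python
import copy


def deep_merge(defaults, overrides):
    """Return a copy of `defaults` with `overrides` merged in recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_backend_config(connection_options, default_settings,
                           fallback_analyzer="snowball"):
    """Read the hypothetical SETTINGS / DEFAULT_ANALYZER keys from a connection dict."""
    settings = deep_merge(default_settings,
                          connection_options.get("SETTINGS", {}))
    analyzer = connection_options.get("DEFAULT_ANALYZER", fallback_analyzer)
    return settings, analyzer
```

A recursive merge lets a project override a single analyzer without restating the rest of the defaults.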

@speedplane

Right, I ran into that issue too. In your subclass, you need to override the build_schema method:

    def build_schema(self, fields):
        # Convert all "snowball" analyzers into the DEFAULT_ANALYZER analyzer set in __init__
        content_field_name, mapping = \
            super(ElasticsearchSearchBackendSubclass, self).build_schema(fields)
        for field_name, field_mapping in mapping.items():
            if "analyzer" in field_mapping and field_mapping["analyzer"] == "snowball":
                field_mapping["analyzer"] = self.DEFAULT_ANALYZER
        return content_field_name, mapping

Of course, a real solution with getters and setters would be preferable.
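
To make the effect of that override concrete, here is the same replacement applied to a hand-written mapping as a standalone function. The function name, field names, and field types are made up for illustration and are not actual `build_schema` output.

```python
def replace_analyzer(mapping, old="snowball", new="standard"):
    """Return a copy of `mapping` with analyzer `old` swapped for `new`."""
    result = {}
    for field_name, field_mapping in mapping.items():
        updated = dict(field_mapping)  # shallow copy; the input stays untouched
        if updated.get("analyzer") == old:
            updated["analyzer"] = new
        result[field_name] = updated
    return result


# Hypothetical mapping: one snowball field, one n-gram field, one without analyzer.
sample_mapping = {
    "text": {"type": "string", "analyzer": "snowball"},
    "zip_code": {"type": "string", "analyzer": "edgengram_analyzer"},
    "pub_date": {"type": "date"},
}

for name, field in sorted(replace_analyzer(sample_mapping).items()):
    print(name, field.get("analyzer"))
```

Only fields whose analyzer equals the old value are touched, so n-gram fields and fields without an analyzer pass through unchanged.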

@bitcity

bitcity commented Jan 19, 2015

👍 on ability to override backend settings for analyzers without monkey-patching *_backend.py files.
