How to index postgres' JSONField? #1389

Open
kristoff-it opened this issue Jun 28, 2016 · 3 comments

Comments

@kristoff-it

kristoff-it commented Jun 28, 2016

Hi, I'm not sure whether this is something I failed to find or understand in the docs, or whether it would be an entirely new feature.

Basically, I have a project in which models have a 'custom property' system powered by Postgres' JSONField. This lets users attach a JSON document to the set of properties of each object.
I'd like to be able to search those fields (via ES) as if they were any other property.

As an example:

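# models.py
from django.contrib.postgres.fields import JSONField
from django.db import models

class MyModel(models.Model):
   name = models.CharField(max_length=255)  # CharField requires max_length; 255 is a placeholder
   attrs = JSONField(default=dict)

# how I'd expect the ES documents to look: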
{
   ...,
   "name": "banana",
   "attrs" : { "fruitColor": "yellow", "growsOnTrees": true}
},

{
   ...,
   "name": "samuel",
   "attrs" : { "friends": ["mark", "francesca", "luke"], "birthday": "10/10/1990"}
}

What type of SearchField should I use?
Thanks!

@acdha
Contributor

acdha commented Jun 28, 2016

That's a relatively new feature which we have no special support for yet. If you were just using, say, tags or other simple field values, you could simply put the data in a multi-valued field. Haystack, however, tends not to support things which aren't portable across search backends, such as Elasticsearch's nested objects.

You'll definitely need a custom prepare_ method (http://django-haystack.readthedocs.io/en/v2.4.1/searchindex_api.html#advanced-data-preparation) to package the data but the nesting ability of JSON is going to make this complicated.

Something like this should work for getting the data into either ElasticSearch or a Solr dynamic field:

def prepare(self, obj):
    prepared_data = super(…)
    # Add fields from your JSON data
    return prepared_data

(For something like your friends list, that'll work as long as you define the field as a multi-valued field)
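As a minimal sketch of the "add fields from your JSON data" step, one option is to flatten the JSON document into flat, prefixed keys before merging it into the prepared data. The helper name `flatten_attrs`, the `attrs_` prefix, and the `MyModelIndex` class in the comment are assumptions made for illustration, not part of Haystack. A single underscore is used as the separator because double underscores are reserved for Haystack's lookup syntax in queries.

```python
def flatten_attrs(attrs, prefix="attrs"):
    """Flatten a (possibly nested) dict into flat, prefixed keys.

    {"a": {"b": 1}} becomes {"attrs_a_b": 1}. Lists and scalar values
    are copied through unchanged; list values would need a multi-valued
    field on the search backend.
    """
    flat = {}
    for key, value in attrs.items():
        name = "%s_%s" % (prefix, key)
        if isinstance(value, dict):
            # Recurse into nested objects, extending the prefix.
            flat.update(flatten_attrs(value, prefix=name))
        else:
            flat[name] = value
    return flat


# Hypothetical use inside a SearchIndex subclass (MyModelIndex is assumed):
#
# def prepare(self, obj):
#     prepared_data = super(MyModelIndex, self).prepare(obj)
#     prepared_data.update(flatten_attrs(obj.attrs))
#     return prepared_data
```

Flattening loses the nesting structure, which is the trade-off for staying within what a dynamic field can hold.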

Haystack doesn't attempt to validate field names in queries, so you should be able to filter as long as you know the key. What I think will be tricky is more complex nested queries. Those are probably best solved by getting the ES connection object (with Solr, for example, that's searchqueryset.query.backend.conn), which is something that needs to become easier.
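A rough sketch of filtering on a known key, assuming the JSON keys were indexed under flat names of the form `attrs_<key>` (that naming convention and the helper `attrs_lookup` are assumptions for illustration). The helper only builds the keyword arguments; whether the backend resolves the field name depends on how the index and schema are set up.

```python
def attrs_lookup(prefix="attrs", **conditions):
    """Build filter kwargs such as {"attrs_fruitColor": "yellow"}."""
    return {"%s_%s" % (prefix, key): value for key, value in conditions.items()}


# Hypothetical usage, assuming django-haystack is installed and configured:
#
# from haystack.query import SearchQuerySet
# results = SearchQuerySet().filter(**attrs_lookup(fruitColor="yellow"))
```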

@caiocarrara

Hi, is there any update on indexing JSONField? Is there a good new approach for this?

@keshavgoel21

@kristoff-it @cacarrara Regarding support for Elasticsearch: this can't happen, because ES doesn't support a mapping that varies between documents. In Django you can change the JSON structure freely from one record to the next, but the ES mapping needs to be consistent across documents (records). ES does support the nested field datatype, though, which could be useful when the number of fields of the same type grows dynamically.
If you are using Elasticsearch, you can customize the Haystack backend. For reference, see:
http://www.stamkracht.com/blog/extending-haystacks-elasticsearch-backend/
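To illustrate the consistency point, here is a small sketch that scans several JSON documents and reports the keys whose value types differ from one record to another, since those are the keys likely to clash under a single mapping. The function name `find_type_conflicts` is hypothetical, and comparing Python type names is only a rough stand-in for how Elasticsearch actually infers field types.

```python
def find_type_conflicts(documents):
    """Return {key: [type names]} for keys seen with more than one type."""
    seen = {}
    for doc in documents:
        for key, value in doc.items():
            # Record the Python type name observed for this key.
            seen.setdefault(key, set()).add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items() if len(types) > 1}


docs = [
    {"birthday": "10/10/1990", "friends": ["mark", "francesca", "luke"]},
    {"birthday": 1990, "fruitColor": "yellow"},
]
conflicts = find_type_conflicts(docs)
```

Running a check like this over existing records before indexing gives a quick idea of which keys would need normalizing or a custom mapping.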
