Clarification on non-indexed attributes #4404

Open
shohamazon opened this issue Jan 29, 2024 · 10 comments

@shohamazon

Hello:)
I have been exploring the FT.CREATE command and I would like some clarifications regarding the non-indexed fields.

As mentioned in the Redis documentation: if an attribute has NOINDEX and doesn't have SORTABLE, it will just be ignored by the index.

What does this mean? Because when I created a non-indexed field, I could find it when I ran FT.INFO.

127.0.0.1:6379> ft.create my_indx on hash schema title TEXT NOINDEX 
 OK
127.0.0.1:6379> ft.info my_indx
 1) index_name
 2) my_indx
 3) index_options
 4) (empty array)
 5) index_definition
 6) 1) key_type
    2) HASH
    3) prefixes
    4) 1) 
    5) default_score
    6) "1"
 7) attributes
 8) 1) 1) identifier
       2) title
       3) attribute
       4) title
       5) type
       6) TEXT
       7) WEIGHT
       8) "1"
       9) NOINDEX
...

Is the attribute really ignored as mentioned?

And can I create an attribute that has SORTABLE, UNF and NOINDEX?
Thank you:)

@meiravgri
Collaborator

Hi @shohamazon.
NOINDEX means that an inverted index won't be built for that specific field.
When you search for documents based on this field, you'll get zero results.
For example:

127.0.0.1:6379> ft.create idx schema t text NOINDEX
OK
127.0.0.1:6379> hset doc1 t 1 n 2
(integer) 1
127.0.0.1:6379> ft.search idx "@t:1"
1) (integer) 0
(2.47s)

However, to perform operations like sorting, the fields need to be part of the schema:

127.0.0.1:6379> ft.search idx * sortby t
1) (integer) 1
2) "doc1"
3) 1) "t"
   2) "1"
   3) "n"
   4) "2"
127.0.0.1:6379> ft.search idx * sortby n
(error) Property `n` not loaded nor in schema

This allows you to run queries on the data, using the NOINDEX field without the overhead of maintaining an index for it.
Defining the field as SORTABLE can provide faster access to its value during queries.

And yes, you can combine these 3 options.
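
To make the combination concrete, here is a minimal Python sketch that only assembles the FT.CREATE tokens for a TEXT field. The helper name `text_field_args` is hypothetical and not part of any client library; the sketch assumes UNF is written directly after SORTABLE, as in the FT.CREATE syntax.

```python
def text_field_args(name, sortable=False, unf=False, noindex=False):
    """Build the FT.CREATE schema tokens for one TEXT field.

    Hypothetical helper for illustration; it only assembles strings
    and never talks to a Redis server.
    """
    if unf and not sortable:
        # Assumption: UNF is a modifier of SORTABLE, so it is rejected alone.
        raise ValueError("UNF is only meaningful together with SORTABLE")
    args = [name, "TEXT"]
    if sortable:
        args.append("SORTABLE")
        if unf:
            args.append("UNF")  # UNF directly follows SORTABLE
    if noindex:
        args.append("NOINDEX")
    return args


# Assemble a full command that combines all three options.
command = ["FT.CREATE", "idx", "SCHEMA"] + text_field_args(
    "t", sortable=True, unf=True, noindex=True
)
print(" ".join(command))
```

The printed line can be pasted into redis-cli; the server's reply is not shown here because it was not run as part of this thread.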

Hope it helps :) 🇮🇱

@shohamazon
Author

Thank you so much! That helped a lot.

So the benefit of having SORTABLE combined with NOINDEX is just faster access? (Which is the same benefit as having SORTABLE in general, if I am not mistaken.)

So what is the difference between the example you gave and this one: ft.create idx schema t text NOINDEX SORTABLE

How is the field different, such that it won't be ignored?

Thank you! I really appreciate it :) 🇮🇱

@meiravgri
Collaborator

Yes, you got it right.

When using SORTABLE, the field won't be ignored entirely. The values of the field are stored in advance, so they can be accessed during queries with minimal latency.

As mentioned in the documentation:

SORTABLE - NUMERIC, TAG, TEXT, or GEO attributes can have an optional SORTABLE argument. As the user sorts the results by the value of this attribute, the results are available with very low latency. Note that this adds memory overhead, so consider not declaring it on large text attributes. You can sort an attribute without the SORTABLE option, but the latency is not as good as with SORTABLE.
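
The distinction described above can be sketched with a toy model in Python. This is only an illustration of the idea and is not how RediSearch is implemented; the class `ToyIndex` and all of its methods are hypothetical. A NOINDEX field is skipped when the inverted index is built, while a SORTABLE field has its value kept in a separate store that sorting reads first.

```python
from collections import defaultdict


class ToyIndex:
    """Toy model only; not RediSearch's actual data structures."""

    def __init__(self, schema):
        # schema maps field name -> {"noindex": bool, "sortable": bool}
        self.schema = schema
        self.inverted = defaultdict(set)  # (field, term) -> set of doc ids
        self.sort_values = {}             # doc id -> {field: value} for SORTABLE fields
        self.docs = {}                    # doc id -> the full hash

    def add(self, doc_id, fields):
        self.docs[doc_id] = fields
        for name, opts in self.schema.items():
            if name not in fields:
                continue
            value = fields[name]
            if not opts.get("noindex"):
                # Only indexed fields contribute terms to the inverted index.
                for term in value.lower().split():
                    self.inverted[(name, term)].add(doc_id)
            if opts.get("sortable"):
                # SORTABLE: keep the value ready so sorting needs no hash lookup.
                self.sort_values.setdefault(doc_id, {})[name] = value

    def search(self, field, term):
        return sorted(self.inverted.get((field, term.lower()), set()))

    def sorted_by(self, field):
        if field not in self.schema:
            raise KeyError(f"Property `{field}` not in schema")

        def key(doc_id):
            cached = self.sort_values.get(doc_id, {})
            if field in cached:
                return cached[field]                 # fast path (SORTABLE)
            return self.docs[doc_id].get(field, "")  # slower path: read the hash

        return sorted(self.docs, key=key)


idx = ToyIndex({"t": {"noindex": True, "sortable": True}})
idx.add("doc1", {"t": "b", "n": "2"})
idx.add("doc2", {"t": "a"})
print(idx.search("t", "b"))  # no inverted index for t, so nothing matches
print(idx.sorted_by("t"))    # sorting still works through the stored values
```

In this toy model, sorting by `n` raises an error because `n` is not in the schema, which mirrors the `Property n not loaded nor in schema` error shown earlier in the thread.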

@shohamazon
Author

Thank you, I hope it's ok to ask more questions:)

I created a new index and I am not sure why the behavior is like that.

127.0.0.1:6379> ft.create my_index schema t text sortable noindex
OK
127.0.0.1:6379> hset doc t 1 n 2
(integer) 2
127.0.0.1:6379> ft.search my_index *
1) (integer) 0
127.0.0.1:6379> ft.search my_index * sortby t
1) (integer) 0

Why did adding SORTABLE cause this behavior? I would have expected this to happen if I hadn't added SORTABLE to the field (which works fine, as in the example you provided).

@meiravgri
Collaborator

Of course :)

Seems like you caught a bug. ⭐
Thanks for bringing this to our attention.

To better assist you, could you please provide more details about the specific use case you're working on?
This will help me suggest the most suitable configuration for your scenario.

@shohamazon
Author

Oh I see now,
I don't have anything special, I just wanted to understand better why some clients ban you from creating a field that is non-indexed but not sortable.

Maybe it would be a good idea to change the documentation? I still find it a bit confusing, maybe a better explanation on what it means when a field is being ignored.

By the way, I noticed that the bug only happens with a single-field schema; with any other schema it seems to work fine.

Thank you so much !🙂
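
For reference, the kind of client-side restriction mentioned above might look like the following Python sketch. The function name `check_field_options` and the error message are hypothetical; actual clients may word or implement the check differently.

```python
def check_field_options(sortable=False, noindex=False):
    """Reject NOINDEX without SORTABLE before sending FT.CREATE.

    Hypothetical client-side check. It follows the documentation's
    statement that such a field is ignored by the index.
    """
    if noindex and not sortable:
        raise ValueError(
            "NOINDEX without SORTABLE: the documentation says this field "
            "would be ignored by the index"
        )


check_field_options(sortable=True, noindex=True)  # accepted
try:
    check_field_options(noindex=True)             # rejected
except ValueError as exc:
    print(exc)
```

Note that the server itself accepts `t text NOINDEX`, as the first example in this thread shows, so a check like this is a client design choice rather than a server rule.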

@meiravgri
Collaborator

I agree that this behavior is odd and not well-documented.

Regarding the additional potential bug you mentioned, I assume you are referring to a case where the schema includes additional fields that don't have the NOINDEX attribute. Note that the default and correct behavior is to notify the module on every hash event (new entry, update, or delete). When running ft.search idx * you should get all the hashes in your database.
When you add a field that works as expected, and the hash contains both schema fields, the module will get notified properly.

127.0.0.1:6379> ft.create idx schema t text SORTABLE NOINDEX n text
OK
127.0.0.1:6379> hset doc:1 t 1
(integer) 1
127.0.0.1:6379> ft.search idx * 
1) (integer) 0
127.0.0.1:6379> hset doc:1 t 1 n 1
(integer) 1
127.0.0.1:6379> ft.search idx * 
1) (integer) 1
2) "doc:1"
3) 1) "t"
   2) "1"
   3) "n"
   4) "1"

@shohamazon
Author

Thank you so much, you helped me a lot.
I would love to get notified when the bug gets fixed and/or the documentation improves.
Have a nice day:)

@meiravgri
Collaborator

I linked the ticket to this issue, so I believe that the PR will appear here.

github-actions bot commented Apr 2, 2024

This issue is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stale label Apr 2, 2024