RUM Index slow while referencing JSONB column #59

Open
ngigiwaithaka opened this issue May 23, 2019 · 0 comments


Suppose you have this table

> CREATE TABLE news_items (id int, object_data JSONB, weighted_tsv tsvector, time_created_as_from_server bigint);

Also suppose the object_data JSONB has the following fields:
title, article, author, and other fields (10)...

I have created the RUM index news_item_rum_english_weighted_tsv_idx as follows:

> CREATE INDEX news_item_rum_english_weighted_tsv_idx ON news_items
>   USING rum (weighted_tsv rum_tsvector_addon_ops, time_created_as_from_server)
>     WITH (attach = 'time_created_as_from_server', to = 'weighted_tsv');

When I query while referencing object_data and order by rank as follows:

> EXPLAIN ANALYZE
> select object_data,  weighted_tsv <=> to_tsquery('English', 'Uhuru') rank            
> from news_items 
> where
> weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
> ORDER BY rank
> limit 10;

Query executes in 418ms

Screenshot from 2019-05-23 17-32-00

When I query while referencing object_data->'title' and order by rank as follows:

> EXPLAIN ANALYZE
> select object_data->'title',  weighted_tsv <=> to_tsquery('English', 'Uhuru') rank            
> from news_items 
> where
> weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
> ORDER BY rank
> limit 10;

Screenshot from 2019-05-23 17-31-27

Query executes in 1.7s

That is about 4x slower on the exact same query, and the gap is consistent across different search terms.

Executing the same queries without the ORDER BY is much faster, at around 3ms, and there is not much difference between the two.
Screenshot from 2019-05-23 17-41-55
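
For reference, a sketch of that variant, assuming it is the title query above with only the ORDER BY clause removed (the exact statement is an assumption based on the description):

```sql
-- Assumed form of the no-ORDER BY variant: same filter and LIMIT, no sort on rank.
EXPLAIN ANALYZE
SELECT object_data->'title',
       weighted_tsv <=> to_tsquery('English', 'Uhuru') AS rank
FROM news_items
WHERE weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
LIMIT 10;
```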

Without the ORDER BY, the planner uses an Index Scan, whereas when ordering by rank it uses a Bitmap Scan. How can I structure the query so that it always uses the much faster Index Scan?
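
One restructuring that might be worth testing is to select the top 10 rows in a subquery and extract the title only from those rows. This is an untested sketch. It assumes the extra time comes from evaluating object_data->'title' for every matching row before the sort, which the plans do not confirm. The alias `top` is only illustrative.

```sql
-- Untested sketch: rank and limit first, then extract the title from just 10 rows.
-- Assumption: the slowdown is caused by evaluating object_data->'title' on every
-- matching row before sorting; this does not by itself force an Index Scan.
EXPLAIN ANALYZE
SELECT top.object_data->'title' AS title, top.rank
FROM (
    SELECT object_data,
           weighted_tsv <=> to_tsquery('English', 'Uhuru') AS rank
    FROM news_items
    WHERE weighted_tsv @@ websearch_to_tsquery('English', 'Uhuru')
    ORDER BY rank
    LIMIT 10
) AS top
ORDER BY top.rank;
```

If this runs in roughly the same time as the plain object_data query, it would suggest the cost is in the per-row JSONB extraction rather than in the choice of scan.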

Regards.
