Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUM Index WITH rum_tsvector_addon_ops does not work for jsonb columns #58

Open
ngigiwaithaka opened this issue May 22, 2019 · 7 comments

Comments

@ngigiwaithaka
Copy link

Suppose you have this table

CREATE TABLE articles (id int, object_data JSONB);

If object_data has two fields (article & timeCreatedAsFromServer)

The following returns the error
ERROR: attribute "((object_data -> 'timeCreatedAsFromServer')::bigint)" is not found in table

CREATE INDEX article_rum_idx ON articles
  USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops,	 ((object_data -> 'timeCreatedAsFromServer')::bigint))
  WITH (attach = '((object_data -> ''timeCreatedAsFromServer'')::bigint)', to = 'to_tsvector(''English'', object_data->>''article'')')

If you omit the WITH clause, it works as below!

CREATE INDEX article_rum_idx ON articles
  USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops,	 ((object_data -> 'timeCreatedAsFromServer')::bigint))
@ngigiwaithaka
Copy link
Author

Sorry, on running this

CREATE INDEX article_rum_idx ON articles
  USING rum (to_tsvector('English', object_data->>'article') rum_tsvector_addon_ops, ((object_data -> 'timeCreatedAsFromServer')::bigint))

You get...

ERROR: additional information attribute "to_tsvector" is not found in index

@za-arthur
Copy link
Contributor

Hello @ngigiwaithaka ,

Thank you for the issue! Unfortunately it isn't possible to attach exression's result as a additional information nowadays.

attach and to options in WITH expression requires column names of a tables to be passed. That is why you get ERROR: attribute "((object_data -> 'timeCreatedAsFromServer')::bigint)" is not found in table.

Also it is necessary to pass attach and to options if you use rum_tsvector_addon_ops opclass. If you don't pass it you get additional information attribute "to_tsvector" is not found in index (your second case).

@ngigiwaithaka
Copy link
Author

@za-arthur Thanks for prompt reply.

Any plans to have this feature included anytime soon?

@za-arthur
Copy link
Contributor

There was no such plan until today :)
It depends not only from me. We will discuss implementation of this feature.

@ngigiwaithaka
Copy link
Author

:-) my bad....

I have also posting something else I have come across, hope they wouldn't be too many for you today... :-)

@za-arthur
Copy link
Contributor

Of course not! You are welcome to create issues.

@zilzila
Copy link

zilzila commented Oct 16, 2023

This problem occurs not only with jsonb, but also in a simpler case - if you try to index not by the [additional] tsvector field, but by the text field, which is sad, since you actually have to duplicate a huge field, which doubles the size of an already rather big table. And the index for this redundant tsvector field is almost a threefold increase in the table as a result


Эта проблема встречается не только с jsonb, но и в более простом случае - если пытаться индексировать не по [дополнительному] полю tsvector, а по текстовому полю, что печально, так как приходится фактически дублировать огромное поле, что увеличивает размер и так немаленькой таблицы вдвое. А ещё индекс по этому избыточному полю tsvector - почти трёхкратное увеличение таблицы в итоге

create index on ... using rum (to_tsvector('lang', text_column) rum_tsvector_addon_ops, ts)
  with (attach = 'ts', to = 'to_tsvector(''lang'', text_column)');

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants