enhancement: the regex consumes an excessive amount of memory. #16664

Open
zwang28 opened this issue May 9, 2024 · 2 comments
Labels: component/func-expr (Support a SQL function or operator), type/enhancement (Improvements to existing implementation)
zwang28 (Contributor) commented May 9, 2024

Describe the bug

The cluster has 1000+ materialized views (MVs), each of which contains a regex. The regex library consistently holds 70+ GB of memory.

manual.heap.collapsed.zip

Error message/log

No response

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

v1.8, with backtrack_limit increased from 1_000_000 to 1_000_000_000

Additional context

No response

@zwang28 zwang28 added the type/bug Something isn't working label May 9, 2024
@github-actions github-actions bot added this to the release-1.10 milestone May 9, 2024
BugenZhao (Member) commented

Each executor within the same operator creates its own expressions. These are usually identical, yet each copy consumes its own resources. Moreover, if the same expression node appears in different operators or streaming jobs, it could be shared there as well.

We may consider caching at different granularities:

  • cache the results of build_expr, keyed by their protobuf input
  • cache the "context"s that are believed to be memory-consuming, keyed by their own input, like RegexpContext in this case
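The second option could look roughly like the sketch below. It is only an illustration under stated assumptions: the `RegexpContext` struct here is a hypothetical stand-in (not RisingWave's actual type), `compile` is a placeholder for the expensive regex build, and `shared_context` and the `(pattern, flags)` key are invented names. The cache holds `Weak` references so that a context is freed once no executor uses it.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock, Weak};

/// Hypothetical stand-in for a memory-heavy compiled regex context.
/// The real type would hold the compiled regex; this one only keeps its inputs.
#[derive(Debug)]
pub struct RegexpContext {
    pub pattern: String,
    pub flags: String,
}

impl RegexpContext {
    /// Placeholder for the expensive compilation step (assumption).
    fn compile(pattern: &str, flags: &str) -> Self {
        Self {
            pattern: pattern.to_owned(),
            flags: flags.to_owned(),
        }
    }
}

/// Cache key: the inputs that fully determine the context.
type Key = (String, String);

/// Process-wide cache of weak references, created lazily on first use.
fn cache() -> &'static Mutex<HashMap<Key, Weak<RegexpContext>>> {
    static CACHE: OnceLock<Mutex<HashMap<Key, Weak<RegexpContext>>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Returns a shared context for the given inputs, compiling it only if no
/// live copy exists.
pub fn shared_context(pattern: &str, flags: &str) -> Arc<RegexpContext> {
    let key = (pattern.to_owned(), flags.to_owned());
    let mut map = cache().lock().unwrap();

    // Reuse a live context if some other executor still holds one.
    if let Some(ctx) = map.get(&key).and_then(|weak| weak.upgrade()) {
        return ctx;
    }

    // Drop entries whose contexts have already been freed.
    map.retain(|_, weak| weak.strong_count() > 0);

    let ctx = Arc::new(RegexpContext::compile(pattern, flags));
    map.insert(key, Arc::downgrade(&ctx));
    ctx
}
```

With this shape, two executors calling `shared_context("a+b", "i")` would receive the same `Arc`, so the context is built once. Sharing only works if the real context is `Send + Sync` and safe to use concurrently. If it keeps per-call scratch state, that state would still need to be per-thread or pooled.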

@BugenZhao BugenZhao added type/enhancement Improvements to existing implementation. component/func-expr Support a SQL function or operator and removed type/bug Something isn't working labels May 10, 2024
@zwang28 zwang28 changed the title bug: the regex consumes an excessive amount of memory. enhancement: the regex consumes an excessive amount of memory. May 10, 2024
zwang28 (Contributor, Author) commented May 10, 2024

Please note that in this cluster I've raised backtrack_limit from 1_000_000 to 1_000_000_000 to work around Error::BacktrackLimitExceeded. So there is indeed very large regex state, which requires sizable runtime memory.

Because the memory pool never shrinks, this may explain the large memory footprint.
