MDEV-34166 Server could hang with BP < 80M under stress #3256
The idea looks good. Did you check all other occurrences of BUF_LRU_MIN_LEN? Should some of the other references be adjusted?
If the limit is being consulted often, we might consider introducing a global variable buf_pool.LRU_min_len (protected by buf_pool.mutex) and adjusting it in buf_pool_t::resize().
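To make the suggestion above concrete, here is a minimal sketch of a cached limit that is recomputed only when the pool is resized. It is not MariaDB code: toy_buf_pool, its members, and the lru_min_len() accessor are hypothetical stand-ins, and the formula (one page less than 5% of the pool, capped at 256) is an assumption based on the PR description.

```cpp
#include <cstddef>
#include <mutex>

// Toy stand-in for the buffer pool. All names are hypothetical and only
// illustrate the idea of caching the limit under a mutex.
struct toy_buf_pool
{
  std::mutex mutex;             // plays the role of buf_pool.mutex
  std::size_t n_pages = 0;      // current pool size in pages
  std::size_t LRU_min_len = 0;  // cached limit, updated only in resize()

  // Recompute the cached limit whenever the pool size changes.
  void resize(std::size_t new_n_pages)
  {
    std::lock_guard<std::mutex> guard(mutex);
    n_pages = new_n_pages;
    const std::size_t five_percent = n_pages / 20;
    // Assumed rule: one page below 5% of the pool, never above 256.
    const std::size_t below = five_percent ? five_percent - 1 : 0;
    LRU_min_len = below < 256 ? below : 256;
  }

  // Readers take the mutex and return the cached value without recomputing.
  std::size_t lru_min_len()
  {
    std::lock_guard<std::mutex> guard(mutex);
    return LRU_min_len;
  }
};
```

The trade-off in this sketch is one extra field kept in sync by resize() in exchange for cheaper reads, which only pays off if the limit is consulted often.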
Thanks for pointing it out. I did check the usage.
I had considered this option and decided otherwise because ...
Since the scenario affects only a limited set of cases (BP < 80M), I think it is much better to consider the other references case by case and modify them only if there is a visible issue.
Description
BUF_LRU_MIN_LEN (256) is too high a value for small buffer pool (BP) sizes. For example, with a BP smaller than 80M and a 16K page size, the limit is more than 5% of the total BP, and for the smallest BP of 5M it is 80% of the BP. Non-data objects such as explicit locks can occupy part of the buffer pool, reducing the pages available for the LRU list. If the LRU list reaches its minimum length and no free pages are available, the server would hang because the page cleaner cannot free any more pages.
Fix: To avoid such a hang, we adjust the LRU limit to be lower than the limit for data objects as checked in buf_LRU_check_size_of_non_data_objects(), i.e. one page less than 5% of the BP.
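The arithmetic behind these figures can be checked with a small sketch. This is not the code of the patch: pool_pages() and adjusted_lru_min_len() are hypothetical helpers, and capping the adjusted limit at BUF_LRU_MIN_LEN for larger pools is an assumption drawn from the description.

```cpp
#include <cstddef>

// Hard-coded minimum LRU length named in the description.
constexpr std::size_t BUF_LRU_MIN_LEN = 256;

// Hypothetical helper: number of pages in a pool of the given size.
constexpr std::size_t pool_pages(std::size_t pool_bytes, std::size_t page_bytes)
{
  return pool_bytes / page_bytes;
}

// Hypothetical helper: one page less than 5% of the pool,
// assumed to be capped at BUF_LRU_MIN_LEN for larger pools.
constexpr std::size_t adjusted_lru_min_len(std::size_t n_pages)
{
  return (n_pages / 20 == 0)
             ? 0
             : ((n_pages / 20 - 1 < BUF_LRU_MIN_LEN) ? n_pages / 20 - 1
                                                     : BUF_LRU_MIN_LEN);
}

// 5M pool with 16K pages: 320 pages, so 256 pages is 80% of the pool.
static_assert(pool_pages(5 * 1024 * 1024, 16 * 1024) == 320, "5M / 16K");
static_assert(BUF_LRU_MIN_LEN * 100 / 320 == 80, "256 of 320 pages");

// 80M pool with 16K pages: 5120 pages, so 5% is exactly 256 pages.
static_assert(pool_pages(80 * 1024 * 1024, 16 * 1024) == 5120, "80M / 16K");
static_assert(5120 / 20 == BUF_LRU_MIN_LEN, "5% of 80M");

// Adjusted limits under the assumed rule.
static_assert(adjusted_lru_min_len(320) == 15, "5M pool");
static_assert(adjusted_lru_min_len(5120) == 255, "80M pool");
static_assert(adjusted_lru_min_len(8192) == 256, "128M pool keeps 256");
```

Under these assumptions, only pools below 80M see a limit lower than 256, which matches the scope stated in the description.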
Release Notes
This could happen in rare cases with a BP size < 80M. Too many lock objects created by UPDATE, DELETE, or INSERT INTO SELECT from the same table, with queries over a large range in REPEATABLE READ or SERIALIZABLE isolation, could lead to the issue.
How can this PR be tested?
./mtr innodb.lock_memory_debug