Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible performance issues in the database driver #467

Open
mrbig opened this issue Feb 5, 2023 · 3 comments
Open

Possible performance issues in the database driver #467

mrbig opened this issue Feb 5, 2023 · 3 comments

Comments

@mrbig
Copy link

mrbig commented Feb 5, 2023

Version: 2.3.5

Hello,

during the review of the code I encountered this issue, that could be a problem for someone with a lot of jobs in the queue: the yii\queue\db\Queue driver has these two queries:

andWhere('[[pushed_at]] <= :time - [[delay]]', [':time' => time()]

'[[reserved_at]] is not null and [[reserved_at]] < :time - [[ttr]] and [[done_at]] is null',

Problem with these two is that there is an operation on the rights side, that is dependent on a value from the record. This makes the index on those columns pointless, because the database has to examine all records that are not filtered out by other filters in the query.

Please be advised that this comment is wrong in the code: "reserved_at IS NOT NULL forces db to use index on column, otherwise a full scan of the table will be performed" => actually what this does is that the index will only be used to get to all of the records that have a not null reserved_at value, but will have to scan after this all those. If they are scattered over a large table then this can be sometimes even slower than a full table scan. So if one has lots of records with a not null reserved_at value, then the query will still be slow.

I would recommend refactoring the driver to use a reserved_until value instead of a reserved_at. By setting an explicit timestamp when a job can be considered for execution these queries would stay quick even with millions of records in a table.

I understand that this is a breaking change for some, while not an issue for the most of the users. Please consider this request in future versions.

Thank you.

@samdark
Copy link
Member

samdark commented Feb 6, 2023

Thanks for the analysis. If you have time for a pull request, that would be awesome.

@samdark samdark added the type:enhancement Enhancement label Feb 6, 2023
@nadar
Copy link
Contributor

nadar commented Apr 24, 2023

Could this relate to #452?

@ghost
Copy link

ghost commented Jun 10, 2023

Both select statement and move expired statement seems weird to me

SELECT * FROM QUEUE WHERE channel = "" AND reserved_at IS NULL AND pushed_at <= :time ORDER BY priority ASC, id ASC limit 1;

  1. Select statement above would use either channel / reserved_at index, and using file sort for priority & id, might lead to performance issues when the table contain large chunk of record.

Filter by pushed_at is a question marks for me also, any reason that we need to apply this filter?

UPDATE queue SET reserved_at=NULL WHERE reserved_at is not null and reserved_at < 1649649756 - ttr and done_at is null;

  1. since the reserved_at < 1649649756 - ttr require to use ttr field for calculation, index will only use to filter out those record with reserved_at IS NOT NULL, might be concern if the table contain lots of data

I would suggest to implement changes as per @mrbig suggest

  1. Use reserved_until instead of reserved_at, and index reserved_until to make sure that the statement is optimised
  2. Using composite index covering select statement (channel, reserved_at, priority, id)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants