New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make queries fast, filter all flexible attributes #5240
base: master
Are you sure you want to change the base?
Conversation
This will be help with testing each of the documents which do not any more depend on the 'global' `current_app` and `request`. These two can now be provided at the time the objects are instantiated.
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry. |
Use `json_group_object` SQLite function to aggregate flexible attributes into `flex_attrs` field. Register SQLite converter `json.loads` to automatically convert the JSON string to a Python dictionary. Remove the code that had this task previously.
For a flexible attribute query, replace the `col_name` property with a function call that extracts that attribute from the `field_attrs` field introduced in the earlier commit. Additionally, for boolean, numeric and date queries CAST the value to NUMERIC SQLite affinity to ensure that our queries like 'flex:1..5' and 'flex:true' continue working fine. This removes the concept of 'slow query', since every query for any field now has an SQL clause.
Unify query creation logic from - queryparse.py:construct_query_part, - Model.field_query, - DefaultTemplateFunctions._tmpl_unique to a single implementation under `LibModel.field_query` class method. This method should be used for query resolution for model (flex)fields. Allow filtering item attributes in album queries and vice versa by merging `flex_attrs` from Album and Item together as `all_flex_attrs`. This field is only used for filtering and is discarded after.
It seems like previously filtering by flexible attributes did not work - I'd receive '{"data": []}' trying to GET `/aura/tracks?filter[play_count]=11` Now this works, not only for tracks, but for `/aura/artists` and `/aura/albums` too. Additionally, this improves `/aura/tracks` response time significantly. I tried loading the default of 500 tracks from my library: On `master`, it took ~20s After this commit, it takes under 1s.
ad2ea5a
to
9ceffb6
Compare
I'm using the |
3c293fd
to
bdb7fd9
Compare
@@ -1019,7 +1019,7 @@ def find_duplicates(self, lib): | |||
# temporary `Item` object to generate any computed fields. | |||
tmp_item = library.Item(lib, **info) | |||
keys = config["import"]["duplicate_keys"]["item"].as_str_seq() | |||
dup_query = library.Album.all_fields_query( | |||
dup_query = library.Item.match_all_query( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seemed to me this was supposed to be Item
instead of Album
?
I'll be able to review in a week or two, just end of semester push at the moment. |
In order to include the table name for fields in this query, use the `field_query` method. Since `AnyFieldQuery` is just an `OrQuery` under the hood, remove it and construct `OrQuery` explicitly instead.
bdb7fd9
to
070c87f
Compare
Hey, also just chiming in to say that it will take some time for me to go through the current batch of PRs. I won't be able to keep up the response times I had last week, but I'll slowly work through all of them. |
Description
Another and (hopefully) final attempt to improve querying speed.
Fixes #4360 #3515 and possibly more issues to do with slow queries.
This PR supersedes #4746.
What's been done
The
album
anditem
tables are joined, and corresponding data fromitem_attributes
andalbum_attributes
is merged and made available for filtering. This enables to achieve the following:beet list -a path::some/path
beet list play_count:10
beet list -a title:something
beet list artpath:cover
beet list -a art_source:something
Benchmarks
You can see that now querying speed is more or less constant regardless of the query, and the speed is mostly influenced by how many results need to be printed out
Compare this with what we had previously
To Do
docs/
to describe it.)docs/changelog.rst
near the top of the document.)Later
lslimit