Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make queries fast, filter all flexible attributes #5240

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

snejus
Copy link
Member

@snejus snejus commented May 9, 2024

  • Make LazyClassProperty / cached_classproperty reusable
  • Add support for filtering relations
  • Add ability to debug queries
  • Fix querying fields present in both tables
  • Aggregate flexible attributes
  • Add ability to filter flexible attributes through the Query
  • Enable querying related flexible attributes
  • Remove slow lookups from beetsplug/aura

Description

Another and (hopefully) final attempt to improve querying speed.

Fixes #4360 #3515 and possibly more issues to do with slow queries.

This PR supersedes #4746.

What's been done

The album and item tables are joined, and corresponding data from item_attributes and album_attributes is merged and made available for filtering. This enables to achieve the following:

  • Faster album path queries, beet list -a path::some/path
  • Faster flexible attributes queries, both albums and tracks, beet list play_count:10
  • (New) Ability to filter albums with track-level (and vice-versa) db field queries, beet list -a title:something
  • (New) Ability to filter tracks with album-level flexible field queries, beet list artpath:cover
  • (New) Ability to filter albums with track-level flexible field queries, beet list -a art_source:something

Benchmarks

image

You can see that now querying speed is more or less constant regardless of the query, and the speed is mostly influenced by how many results need to be printed out
image

Compare this with what we had previously
image

To Do

  • Documentation. (If you've add a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
  • Changelog. (Add an entry to docs/changelog.rst near the top of the document.)
  • Tests. (Encouraged but not strictly required.)

Later

  • Submit PR with the corresponding adjustment for sorting and fix for lslimit
  • Submit PR with the corresponding adjustment for template variables resolution

snejus added 6 commits May 7, 2024 19:55
This will be help with testing each of the documents which do not
any more depend on the 'global' `current_app` and `request`. These two
can now be provided at the time the objects are instantiated.
@snejus snejus self-assigned this May 9, 2024
@snejus snejus requested a review from wisp3rwind May 9, 2024 11:36
Copy link

github-actions bot commented May 9, 2024

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

snejus added 5 commits May 9, 2024 12:40
Use `json_group_object` SQLite function to aggregate flexible attributes
into `flex_attrs` field.

Register SQLite converter `json.loads` to automatically convert the JSON
string to a Python dictionary. Remove the code that had this task
previously.
For a flexible attribute query, replace the `col_name` property with
a function call that extracts that attribute from the `field_attrs`
field introduced in the earlier commit.

Additionally, for boolean, numeric and date queries CAST the value to
NUMERIC SQLite affinity to ensure that our queries like 'flex:1..5' and
'flex:true' continue working fine.

This removes the concept of 'slow query', since every query for any
field now has an SQL clause.
Unify query creation logic from
- queryparse.py:construct_query_part,
- Model.field_query,
- DefaultTemplateFunctions._tmpl_unique

to a single implementation under `LibModel.field_query` class method.
This method should be used for query resolution for model (flex)fields.

Allow filtering item attributes in album queries and vice versa by
merging `flex_attrs` from Album and Item together as `all_flex_attrs`.
This field is only used for filtering and is discarded after.
It seems like previously filtering by flexible attributes did not work
- I'd receive '{"data": []}' trying to GET `/aura/tracks?filter[play_count]=11`

Now this works, not only for tracks, but for `/aura/artists` and
`/aura/albums` too.

Additionally, this improves `/aura/tracks` response time significantly.
I tried loading the default of 500 tracks from my library:

On `master`, it took ~20s
After this commit, it takes under 1s.
@snejus
Copy link
Member Author

snejus commented May 9, 2024

I'm using the test-aura branch as the base since it depends on it getting merged. Though for now, I will change the base to master in order to run the tests.

@snejus snejus changed the base branch from test-aura to master May 9, 2024 21:05
@@ -1019,7 +1019,7 @@ def find_duplicates(self, lib):
# temporary `Item` object to generate any computed fields.
tmp_item = library.Item(lib, **info)
keys = config["import"]["duplicate_keys"]["item"].as_str_seq()
dup_query = library.Album.all_fields_query(
dup_query = library.Item.match_all_query(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed to me this was supposed to be Item instead of Album?

@Serene-Arc
Copy link
Contributor

I'll be able to review in a week or two, just end of semester push at the moment.

In order to include the table name for fields in this query, use the
`field_query` method.

Since `AnyFieldQuery` is just an `OrQuery` under the hood, remove it and
construct `OrQuery` explicitly instead.
@wisp3rwind
Copy link
Member

Hey, also just chiming in to say that it will take some time for me to go through the current batch of PRs. I won't be able to keep up the response times I had last week, but I'll slowly work through all of them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

beet list -a path:some/path queries are extremely slow
3 participants