Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query optimisations for massive libraries #478

Open
cascooscuro opened this issue Mar 4, 2024 · 21 comments
Open

query optimisations for massive libraries #478

cascooscuro opened this issue Mar 4, 2024 · 21 comments

Comments

@cascooscuro
Copy link

gonic version: v0.16.2
from source: 80a9aeb..88e58c0 master-> origin/master

running behind ngnix proxy

When browsing artists/album or when starting a new stream the response timeouts and have to click again, this second time the server answers instantly. There are no issues with the streams that are already playing.

This issue looks similar to #411 but I'm not using docker/jukebox.

The only thing I see in the logs are:

Mar 04 12:18:08 server gonic[3420077]: 2024/03/04 12:18:08 error writing subsonic response: write tcp 10.11.12.13:4747->10.11.12.13:35128: i/o timeout
Mar 04 12:18:08 server gonic[3420077]: 2024/03/04 12:18:08 response 200 GET /getMusicDirectory?c=Jamstash&callback=angular.callbacks._1c&f=jsonp&id=al-99067&p=enc:123456789&u=user&v=1.6.0

My library is somewhat extreme: 3M tracks, 300K folders.

Thanks for your help and for this great project.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

wow 3M tracks! can you try run with bba1a80 and see if you have any slow db queries? if a query takes longer than 5 seconds the http handler with timeout

@cascooscuro
Copy link
Author

Thanks for your quick response!

It looks I'm hitting that 5s timeout:

Mar 05 19:10:09 server gonic[1522298]: 2024/03/05 19:10:09 response  0  OPTIONS /getMusicDirectory.view?c=Jamstash&callback=angular.callbacks._3z&f=jsonp&id=al-98576&p=enc:12345678901234567&u=user&v=1.6.0
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/db/db.go:1162.088643msSELECT * FROM "users"  WHERE (name=?) ORDER BY "users"."id" ASC LIMIT 1[user] 1
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:761.232106msSELECT * FROM "albums"  WHERE ("albums"."id" = 98576) ORDER BY "albums"."id" ASC LIMIT 1[] 1
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:76789.401µsSELECT * FROM "album_stars"  WHERE ("album_id" IN (?)) AND (user_id=?) ORDER BY "album_stars"."user_id" ASC[98576 2] 0
Mar 05 19:10:09 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:76925.156µsSELECT * FROM "album_ratings"  WHERE ("album_id" IN (?)) AND (user_id=?) ORDER BY "album_ratings"."user_id" ASC[98576 2] 0
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:844.659379491sSELECT * FROM "albums"  WHERE (parent_id=?) ORDER BY albums.right_path COLLATE NOCASE[98576] 6229
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:4493.387870662sSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:1092.421367091sSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:11.632248246 +0100 CET m=-2591903.976856681] 0
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:8436.93328msSELECT * FROM "album_stars"  WHERE ("album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,..<snip>... AND (user_id=?)[102807 103933 105783 106052 106544 98999...
Mar 05 19:10:14 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:846.882454msSELECT * FROM "album_ratings"  WHERE ("album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?..<snip>... AND (user_id=?)[100243 100270 101787 102513 100559 102037 103125 256878 102864 103074 103654 98968 99165 100271 102334 104915 104969 105638 105840 98921 100269 103921...
Mar 05 19:10:14 server gonic[1522298]:  106279 100658 100931 101633 1
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:986.643003428sSELECT * FROM "tracks"  WHERE (album_id=?) ORDER BY "filename"[98576] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:4495.090854123sSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:1091.152361334sSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:19.63362109 +0100 CET m=-2591895.975483814] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:9863.126197msSELECT * FROM "artists" INNER JOIN "album_artists" ON "album_artists"."artist_id" = "artists"."id" WHERE (1 <> 1)[] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:44962.965328msSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_by_folder.go:98178.219µsSELECT * FROM "artists" INNER JOIN "track_artists" ON "track_artists"."artist_id" = "artists"."id" WHERE (1 <> 1)[] 0
Mar 05 19:10:20 server gonic[1522298]: gorm sql/gonic/server/ctrlsubsonic/handlers_raw.go:261520.187µsSELECT * FROM "transcode_preferences"  WHERE (user_id=?) AND (client COLLATE NOCASE IN (?,?)) ORDER BY client DESC LIMIT 1[2 * Jamstash] 1
Mar 05 19:10:20 server gonic[1522298]: 2024/03/05 19:10:20 error writing subsonic response: write tcp 10.11.12.13:4747->10.11.12.13:45596: i/o timeout
Mar 05 19:10:20 server gonic[1522298]: 2024/03/05 19:10:20 response  200  GET /getMusicDirectory?c=Jamstash&callback=angular.callbacks._3z&f=jsonp&id=al-98576&p=enc:12345678901234567&u=user&v=1.6.0
Mar 05 19:10:25 server gonic[1522298]: gorm sql/gonic/podcast/podcast.go:449476.721µsSELECT * FROM "podcast_episodes"  WHERE (status=?) ORDER BY updated_at DESC[downloading] 0
Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0

@cascooscuro
Copy link
Author

I did a quick test increasing timeout vars in cmd/gonic/gonic.go:

        var (
                readTimeout  = 10 * time.Second
                writeTimeout = 10 * time.Second
                idleTimeout  = 10 * time.Second
        )

.. and so far looks good. Not sure if it's the correct way to fix it.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

ah looks like the artist info query is taking 10.94 seconds

maybe i can optimise that query. would you mind emailing me a copy of your database? you can remove personal info from it like

$ cp gonic.db backup.db
$ sqlite3 backup.db "delete from users"

then email me the backup

@brian-doherty
Copy link
Contributor

Could it be a cache miss and then a delay from last.fm? I guess probably not if it happens to manifest for the guy with 3M tracks but I thought I'd throw it out there.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

@brian-doherty it seems to be the query used to find the next artist info that needs to be refreshed

Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0

you can see the query is taking 10sec (10935.083355ms), in that time the database is locked so other requests will timeout waiting for the db to be unlocked

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

also seems like we're doing a scan
image

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

does anyone know how we could remove the scan on artists?
https://www.db-fiddle.com/f/miGsKWi3oZBBdjkejeAdgM/1

the query runs every N seconds to find artists with no or stale info so needs to be pretty snappy

@brian-doherty
Copy link
Contributor

Scan or not, when you run this query you're potentially asking for a very large number of rows very frequently, even though you're keeping the info cached for 30 days. Would it work better if you asked it to only return the n oldest, and handled those, and left the rest for the next refresh cycle?

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

that query should just return one row (there a LIMIT 1 on the end) and in the application it's a First()

@brian-doherty
Copy link
Contributor

My bad. Maybe you need an index on updated_at? Then it can go straight to the record with the lowest value instead of searching for one that's below.

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

that's what i was thinking too, there is an index there on line 7, still no luck

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

i've also asked on the ##sqlite IRC channel on libera, they're usually quite helpful in there

@brian-doherty
Copy link
Contributor

What if you order by updated_at, instead of id? Then it doesn't have to search at all?

@sentriz
Copy link
Owner

sentriz commented Mar 5, 2024

ooh it seems to be different. at least the scan is a bit lower. dunno if that's better https://www.db-fiddle.com/f/miGsKWi3oZBBdjkejeAdgM/3

i can test it though if @cascooscuro sends the DB

@brian-doherty
Copy link
Contributor

brian-doherty commented Mar 6, 2024 via email

@cascooscuro
Copy link
Author

DB file shared. Thanks for your help!

@sentriz
Copy link
Owner

sentriz commented Mar 6, 2024

haha i just tested it, it's actually quite quick

Mar 05 19:10:27 server gonic[1522298]: gorm sql/gonic/infocache/artistinfocache/artistinfocache.go:10935.083355msSELECT "artists".* FROM "artists" LEFT JOIN artist_infos ON artist_infos.id=artists.id WHERE (artist_infos.id IS NULL OR artist_infos.updated_at<?)[2024-02-04 19:10:27.631942681 +0100 CET m=-2591887.977162288] 0

the log output is a bit silly, this is actually artistinfocache.go:10935.083355ms is actually 35ms on line 109 😁

@sentriz
Copy link
Owner

sentriz commented Mar 6, 2024

identified two more slow queries on @cascooscuro's database that are near the 5s timeout

4.12s   SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration FROM "albums" LEFT JOIN tracks ON tracks.album_id=albums.id JOIN album_artists ON album_artists.album_id=albums.id  GROUP BY albums.id ORDER BY created_at DESC LIMIT 50 OFFSET 0
4.13s   SELECT artists.*, count(albums.id) album_count FROM "artists" JOIN artist_stars ON artist_stars.artist_id=artists.id JOIN artist_appearances ON artist_appearances.artist_id=artists.id JOIN albums ON albums.id=artist_appearances.album_id WHERE (artist_stars.user_id=?) GROUP BY artists.id ORDER BY artist_stars.star_date DESC

@brian-doherty
Copy link
Contributor

Now those are queries I'd expect to be slow. :D Start caching the sums and counts in the DB?

@sentriz
Copy link
Owner

sentriz commented Mar 6, 2024

could help but seems like the order by is the biggest offender, by far

image

this is after adding an index on created_at

https://www.db-fiddle.com/f/vQKEw9wCovnzqCoZTfnjod/1

@sentriz sentriz changed the title streaming keeps working but Artist/Album browsing stops query optimisations for massive libraries Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants