Dedup Large Number of Playlists #26

Open
shelaffs opened this issue Oct 22, 2018 · 11 comments
Comments

@shelaffs

When processing a large number of playlists (the counter starts at 100; I have 92 user playlists plus 9 Spotify defaults), the search always seems to get stuck at 18 playlists remaining to process, even after waiting a significant amount of time (10+ minutes to 1 hour).

The search does process some playlists, but the "remove duplicates" button does not respond when pressed on those that are processed, although the duplicates do appear to get removed.

The issue has been present for at least 3 weeks.

@JMPerez
Owner

JMPerez commented Oct 22, 2018

Could you share a link to the playlist that gets stuck?

Also, if it does delete duplicates, does that mean it eventually stops finding duplicates if you run it several times?

@JMPerez
Owner

JMPerez commented Oct 22, 2018

Just for completeness, there are corner cases that are difficult to fix because they are hard to reproduce.

@shelaffs
Author

I understand. I'm not sure what playlist causes it to get stuck. I have several with 5,000+ songs and it's never been an issue (so far as I could tell) until a few weeks ago. I did test with deleting 4 playlists to put me under 100 total again, but it seems to persist in hanging at 18 playlists left to process. I have also tried rearranging the playlists. Perhaps it is indeed the large playlists or the number of large playlists that I have?

Typical results:

Starts at "still to process 100 playlists" but quickly (<2 seconds) goes down to 85
2018-10-22_14h19_59

~1 minute later it displays "still to process 18 playlists" and hangs until I close the web page
2018-10-22_14h21_04

Testing:

Tested by adding a single duplicate to one of the last (small) playlists in my list, and it did appear, but the "18 playlists" issue still displays:
2018-10-22_14h59_02

I noticed the button did respond in this case and the duplicate was removed, but there is still the "18 playlists" issue:
2018-10-22_15h03_50

I tested by adding a duplicate to one of my large playlists (>5,000 songs) and this did not get pulled by the dedup process, even after waiting for over 10 minutes.
2018-10-22_15h09_16

I added a duplicate to my saved songs as this has been reliable in the past, and it did pull that duplicate despite the "playlist" having over 6,000 tracks.
2018-10-22_15h17_49

If it's due to the size of the user playlists, I can accept that, but wanted you to be aware of the issue as I do not recall it being an issue until just the past few weeks, and I have had playlists this large for a while and never noticed it causing a problem (other than Spotify's 10,000 track limit).

I appreciate you looking into this!

@shelaffs
Author

Additional Testing:
Browsers seem to handle the information differently as well.

Firefox (Windows 10) jumps from 100 playlists left to process to 85 to 18 and hangs there (my usual browser)

Chrome (Windows 10) shows each number as it checks the playlists from 100 and hangs at 21 left to process (both desktop and the iPhone SE mobile app)

Safari (iPhone SE mobile) performs similarly to Chrome, and hangs at 21 as well

Thanks!

@JMPerez
Owner

JMPerez commented Oct 22, 2018

Wow, thank you so much for the detailed report, @shelaffs!

I tried to find out where the problem was by checking Sentry, the error reporting tool I use, but I can't see any recent error there. If we knew which playlist was causing the error, I could create one with the same content and try to reproduce it, but for that we need to identify the faulty one. If you can open the network panel in your browser's developer tools, there should be an error there that could help with debugging this.

I think the best way to solve this would be to add additional error reporting and error handling and then try again. My guess is that there is an error fetching tracks from one of the playlists. Looking at the code there is no error handling in the calls to getTracks(), so if one of the requests for a page of tracks doesn't succeed the whole thing breaks.

I'll try to find some time to add more error handling that can display on the page that there was a problem with a certain playlist.
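As an illustration of the kind of error handling described above, here is a minimal TypeScript sketch of page-by-page fetching that records a failed page instead of letting it break the whole run. It is not the project's actual `getTracks()` code: `PageFetcher`, `fetchAllPages`, the `total` field, and the retry count are all hypothetical names and assumptions made for this example.

```typescript
// Hypothetical types for illustration; not taken from the spotify-dedup source.
type Page<T> = { items: T[]; total: number };
type PageFetcher<T> = (offset: number, limit: number) => Promise<Page<T>>;

type FetchResult<T> = {
  items: T[];
  failedOffsets: number[]; // offsets of pages that could not be fetched
};

// Fetch every page of a playlist. A page that keeps failing is recorded in
// failedOffsets rather than rejecting the whole operation.
async function fetchAllPages<T>(
  fetchPage: PageFetcher<T>,
  limit = 100,
  retries = 2
): Promise<FetchResult<T>> {
  const items: T[] = [];
  const failedOffsets: number[] = [];
  let total = Infinity; // unknown until the first page succeeds

  for (let offset = 0; offset < total; offset += limit) {
    let page: Page<T> | null = null;
    for (let attempt = 0; attempt <= retries && page === null; attempt++) {
      try {
        page = await fetchPage(offset, limit);
      } catch {
        // Ignore the error here and let the loop retry.
      }
    }
    if (page === null) {
      failedOffsets.push(offset);
      if (total === Infinity) break; // first page failed, so the size is unknown
      continue; // skip this page and keep going with the next one
    }
    total = page.total;
    items.push(...page.items);
  }
  return { items, failedOffsets };
}
```

With a result shaped like this, the page could show that a specific playlist had pages it could not load, which is the behaviour the comment above is asking for. Whether skipping a page is acceptable for deduplication is a design choice; a stricter variant could mark the whole playlist as failed instead.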

@shelaffs
Author

Okay, I ran it in Firefox and Chrome. I don't know how to download the report, but everything looked fine in Firefox except for a 500 error at the very beginning that didn't seem to affect much.
It did run for almost 10 minutes and transferred 1 GB of data, which seems quite high, but I'm not sure what is normal.
2018-10-23_14h07_03

I then ran it in Chrome, where it transferred much less data and returned a good number of errors.
2018-10-23_14h24_27
They were all the same error it appears but for different playlists.

The playlists affected in order of the errors were:
https://open.spotify.com/playlist/4cQ71zUu9D5MHa7eglNbSt
https://open.spotify.com/playlist/1uwoG5mqwu12MIHMCLcvuj
https://open.spotify.com/playlist/6zwMDTtbSOFZ1xeJq4o7ng
https://open.spotify.com/playlist/5zHaGzuo56Yq3mug0I9o6F
https://open.spotify.com/playlist/1QOJ8soBpYObekO37nhla7 (2 errors)
https://open.spotify.com/playlist/6zwMDTtbSOFZ1xeJq4o7ng
https://open.spotify.com/playlist/5RsQ3MP1tqT7Vo8vtb7Tt2 (2 errors)
https://open.spotify.com/playlist/2ZXXiDodQdYu6tEkRxlJ4F
https://open.spotify.com/playlist/1rKpwnFtmsZGBImJMiZzk0 (4 errors)
https://open.spotify.com/playlist/3b3zqaFlRrjaw3ewD0cGXo (2 errors)
https://open.spotify.com/playlist/1BgqsEh1vR85NWCwapobAE

These are all large playlists that I add tracks to almost daily, so I'm not at all surprised to see the errors coming from these. Is there a way to isolate the tracks that are returning the error?

The error appeared to be identical or very similar for each one that was returned
2018-10-23_14h35_40

Thank you!

@JMPerez
Owner

JMPerez commented Oct 24, 2018

I tried to find duplicates in https://open.spotify.com/playlist/4cQ71zUu9D5MHa7eglNbSt from my account and it found duplicated tracks without throwing any error.

I plan to work on error handling to notify when something goes wrong and make the tool "continue" if there is an issue fetching the list of tracks for a playlist. This will take some time though, so we shouldn't expect an easy fix.
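A rough TypeScript sketch of that "continue" behaviour is below: each playlist is processed inside its own `try`/`catch`, so one failure is reported and the "still to process" counter keeps counting down. The names `processPlaylists`, `processOne`, and `onProgress` are hypothetical, and processing playlists one after another is an assumption made for this example, not a description of how the tool works.

```typescript
// Hypothetical result type: either a value or a failure reason per playlist.
type PlaylistOutcome<T> =
  | { id: string; status: "ok"; value: T }
  | { id: string; status: "failed"; reason: string };

// Process playlists sequentially. A failure in one playlist is recorded and
// the loop moves on, so the remaining counter always reaches zero.
async function processPlaylists<T>(
  ids: string[],
  processOne: (id: string) => Promise<T>,
  onProgress: (remaining: number) => void = () => {}
): Promise<PlaylistOutcome<T>[]> {
  const outcomes: PlaylistOutcome<T>[] = [];
  let remaining = ids.length;

  for (const id of ids) {
    try {
      const value = await processOne(id);
      outcomes.push({ id, status: "ok", value });
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      outcomes.push({ id, status: "failed", reason });
    } finally {
      // Count the playlist as handled whether it succeeded or failed.
      remaining -= 1;
      onProgress(remaining);
    }
  }
  return outcomes;
}
```

The `failed` outcomes carry the playlist id and an error message, which is enough to show on the page which playlist had a problem.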

@shelaffs
Author

I appreciate you looking into it. Just as one last test, I created a new playlist and dragged in all the tracks from the affected playlist with 4 errors (which also has the most tracks), and it did not get checked for duplicates either. So I don't think recreating the playlists would help, but I was hopeful.

Thanks again for all your help, it's a fantastic tool =)

@JMPerez
Owner

JMPerez commented Oct 25, 2018

Thanks a lot to you! These conversations are what make me feel that these projects are useful and encourage me to continue working on them.

I’ll update the thread when I have any update. In the meantime I’m cleaning up some code and improving testing and error handling for part of the app.

@shelaffs
Author

2018-11-13_09h50_15

Hello, I just wanted to report that I tried running Spotify Dedup today (left it alone for a while) and it is once again pulling duplicates from my playlists and removing them. Thanks so much for the great work!

@JMPerez
Owner

JMPerez commented Nov 13, 2018

Thanks for letting me know @shelaffs! I've been improving the code a little bit to do better error handling but there are still a few things to do.
