[Feature Request] Hashing prioritisation #1058

Reinachan · 2023-04-14T16:23:18Z

Current behaviour seems random, however, people usually watch anime in sequential order, so it would make sense to prioritise hashing files sequentially based on the episode number when possible.

I suggest that for files with similar names where the only difference is a number, ShokoAnime should prioritise files with a lower number before those with higher numbers. If it's unable to determine the episode number, it should do things the way it's currently doing it.
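A minimal sketch of the suggested rule, in Python for illustration only (Shoko itself is not written in Python, and `prioritise` is a hypothetical name, not an existing Shoko function). Names that differ only in their numbers are grouped by replacing each run of digits with a placeholder; each group is then sorted numerically and written back into the slots it already occupied, so anything that cannot be matched keeps its discovered position.

```python
import re
from collections import defaultdict

_NUMBER = re.compile(r"\d+")


def prioritise(filenames):
    """Reorder names that differ only by numbers; leave everything else in place."""
    # Map each "template" (name with digit runs replaced by '#') to the
    # positions where it occurs in the discovered order.
    groups = defaultdict(list)
    for index, name in enumerate(filenames):
        groups[_NUMBER.sub("#", name)].append(index)

    result = list(filenames)
    for positions in groups.values():
        if len(positions) < 2:
            continue  # nothing to compare against: keep the current behaviour
        # Sort the group's members by their numbers, compared as integers.
        members = sorted(
            (filenames[i] for i in positions),
            key=lambda n: [int(x) for x in _NUMBER.findall(n)],
        )
        # Put the sorted members back into the slots the group already held.
        for slot, name in zip(positions, members):
            result[slot] = name
    return result
```

For example, `prioritise(["Show - 03.mkv", "notes.txt", "Show - 01.mkv", "Show - 10.mkv", "Show - 02.mkv"])` returns `["Show - 01.mkv", "notes.txt", "Show - 02.mkv", "Show - 03.mkv", "Show - 10.mkv"]`. Names that also differ in non-numeric parts, such as a checksum tag containing letters, would land in different groups and stay where they were.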

revam · 2023-04-14T16:25:10Z

Context:

Shoko doesn't care about filenames at all.
The files are processed in the order they are discovered in (with some exceptions).

revam · 2023-04-14T16:29:42Z

I'm not against adding a bit more "predictability" to the process, but I also don't see the benefit of adding this behaviour. Others on the team might see it differently though.

Reinachan · 2023-04-14T16:30:48Z

> Context:
>
> Shoko doesn't care about filenames at all.
> The files are processed in the order they are discovered in (with some exceptions).

That's what I assumed. I had the same issue with my fileserver when reconstructing chunked uploads, and ended up fetching the filenames first and only then starting to reconstruct the file.

I'd suggest doing something similar for Shoko. First grab the filenames, check for prioritisation, then run the hasher.

bigretromike · 2023-04-14T16:33:43Z

> The files are processed in the order they are discovered in (with some exceptions).

If that were true, wouldn't files from the same series be hashed in order, since they should be in the same directory (assuming they are)?

revam · 2023-04-14T16:41:09Z

> > The files are processed in the order they are discovered in (with some exceptions).
>
> If that were true, wouldn't files from the same series be hashed in order, since they should be in the same directory (assuming they are)?

Only if they are discovered in sequential order.

Reinachan · 2023-04-14T17:05:48Z

> If that were true, wouldn't files from the same series be hashed in order, since they should be in the same directory (assuming they are)?

@bigretromike

Assuming C# (or whatever library) is using the same APIs under the hood as Rust does, that's not the case, no.

> This function currently corresponds to the opendir function on Unix and the FindFirstFile function on Windows. Advancing the iterator currently corresponds to readdir on Unix and FindNextFile on Windows. [...]
>
> The order in which this iterator returns entries is platform and filesystem dependent.

(source: Rust std::fs::read_dir documentation)
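The same caveat shows up in other runtimes. As an illustration in Python (an assumption for demonstration purposes, not what Shoko uses), `os.scandir` is documented to yield entries in arbitrary order, so any ordering has to be imposed explicitly after listing. `list_files_sorted` is a hypothetical helper name.

```python
import os


def list_files_sorted(directory):
    """Return the plain files in `directory`, sorted by name.

    os.scandir makes no promise about the order of the entries it yields,
    so the names are collected first and sorted afterwards.
    """
    with os.scandir(directory) as entries:
        names = [entry.name for entry in entries if entry.is_file()]
    return sorted(names)
```

Without the explicit `sorted`, the result would be whatever order the platform and filesystem happen to return.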

That said, this is only an issue when the server is receiving a directory, not when it receives individual files (like if you're downloading the episodes separately). I don't know whether those are distinguishable events for the server or not.

Cazzar · 2023-04-14T17:29:16Z

Ultimately, I don't feel this is viable without making file discovery stupidly slow. There is also a difference between the full file tree scan and the filesystem watcher. Once the commands are in the queue, they are typically processed in order of priority and then last updated, but that could change.

We don't have any sorting currently because, to do that, we would need to load the entire import folder tree into memory before sorting. That quickly leads to poor performance in larger collections, and we already have a large memory footprint.

maxpiva · 2023-04-14T18:19:28Z

Do you mean for the initial import or a forced rescan? After that, the system ingests files from filesystem watcher events. When the filesystem watcher detects new files, the import order is usually the order in which you store, copy, or move your files there. We cannot sort something that is not in the directory yet. The only exception is when you move an entire directory in from the same physical location, which is almost immediate. Otherwise the system copies one file at a time, and every new file triggers the event and the import.

Reinachan · 2023-04-16T15:38:36Z

I primarily mean on filewatcher events when you put a directory into the import folder. The way I have things set up is that once an anime is fully downloaded, it'll hardlink the containing folder into the Shoko import folder.

I don't think this should be done on an initial import, nor on individual files placed into the import folder. Only when a directory with multiple anime in it is placed into the import folder. You could also make it optional.

Basically: on a filesystem event for a directory, read the entries in the directory, determine the sort order, and process them in that order.

As for memory footprint, I personally don't mind short spikes of increased memory. You can mark the setting as "potentially memory intensive during imports" if it turns out to be a problem.
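The directory-event flow described above could be sketched roughly as follows. This is Python for illustration only; `on_directory_added`, `natural_key`, and the queue of paths are hypothetical stand-ins and do not correspond to Shoko's actual watcher or command queue. Only the names from the one directory that triggered the event are held in memory while sorting.

```python
import os
import re
from queue import Queue


def natural_key(name):
    """Split a name into text and integer parts so 'ep2' sorts before 'ep10'."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so parts at the same index always have the same type when compared.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def on_directory_added(directory, hash_queue):
    """Hypothetical handler for a 'directory appeared' watcher event."""
    # 1. Read the entries of the new directory (files only, not recursive).
    with os.scandir(directory) as entries:
        names = [entry.name for entry in entries if entry.is_file()]
    # 2. Determine the order, 3. enqueue the hash jobs in that order.
    for name in sorted(names, key=natural_key):
        hash_queue.put(os.path.join(directory, name))
```

A caller would create `hash_queue = Queue()` once and pass it to the handler for each directory event; a worker draining the queue then sees the files in episode order for that directory.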

da3dsoul · 2023-04-16T15:54:19Z

That could be done, since a directory detection is distinct from a file detection.

maxpiva · 2023-04-16T16:05:12Z

It could handle directory events. However, your use case expects the directory to appear instantly in the import location with all of its files, which only happens when you hardlink or move a directory within the same physical location. For that case, it could be done. But if the user copies a directory into the import location, the files are copied one by one, and the import order will be the order in which the system copies them. If you're fine with that, I think it could be done.

ElementalCrisis added the Enhancement - Feature Improvement label Apr 15, 2023
