-
-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ParseOptions: ignore Picture
setting
#186
Comments
An addition to I'm curious though, how did you go about testing this? When items are encountered, they get read into memory before being interpreted, so there would be allocations regardless. |
This was commented out in the tests (it was a match before though). Lines 93 to 97 in 46127a0
I'm now seeing that the allocation is not even here. In my case, I'm iterating in a hot loop over 1000s of files so I guess the encoding + push time was adding up. If this read were prevented in Lines 32 to 37 in 46127a0
Something like a check before continuing after line 32 works but maybe is too invasive? Each What is the right way to go about this? I'd like to make this easy to merge. |
Yes, the allocation would be avoided completely if you just seek over the content in A nice way you could go about this would be to make let block = Block::read(reader, |block_type| {
block_type == BLOCK_ID_VORBIS_COMMENTS
|| (block_type == BLOCK_ID_PICTURE && parse_options.read_pictures)
}); Then just change if predicate(block.ty) {
// Read block content
} else {
// Seek over content and return empty Vec
} |
I will work on this and eventually submit an initial draft PR (probably in the next few weeks). |
Sorry. I won't be working on this. |
This should be a pretty easy feature to implement. I'll keep this open so I remember to eventually work on it. |
Consider the following situation:
You have
10,000
audio files you want to probe, totaling to around 500 albums:Currently, this fully scans every file, including the
Picture
bytes, meaning: every single file allocates aVec
for the picture bytes, even if we have already scanned them, or don't want them in the first place (e.g, we only want the core tag metadata).I'd like to avoid allocating bytes that I already have, so my proposal:
After testing around, this cuts probing time in half for files with non-trivial picture data (1200x1200+).
I'm willing to submit a PR for this, but I was wondering if this
ParseOptions::read_picture()
API was the correct way forward, or if this should be done another way. It would add another branch each file parse since every version ofread_from()
would have this added:The text was updated successfully, but these errors were encountered: