ChecksumVerificationFailed on read of many files in solid archive #31

Revertron · 2023-08-12T12:25:58Z

I have solid archives with a block size of 16 MB, and many of the files in them fail to read with ChecksumVerificationFailed.

Example archive: https://up.revertron.com/Memes.7z

Example code:

pub fn test_blocks() {
    let mut buf = Vec::new();

    let mut archive = SevenZReader::open("Memes.7z", Password::empty()).expect("Error opening 7z archive");
    let _ = archive.for_each_entries(|entry, reader| {
        println!("Reading file {}", &entry.name);
        if "FcGD7nuX0AgQNS_.jpg" == entry.name {
            println!("*** Found file {}", &entry.name);
            match reader.read_to_end(&mut buf) {
                Ok(_size) => {
                    println!("Have read file {}", &entry.name);
                    return Ok(false);
                }
                Err(e) => {
                    println!("Error reading file {}: {}", &entry.name, &e);
                    return Err(sevenz_rust::Error::from(e));
                }
            }
        }
        Ok(true)
    });
    assert!(!buf.is_empty())
}

dyz1990 · 2023-08-12T13:14:43Z

You can't skip reading these entries, even if you don't need them.
Try this code:


pub fn test_blocks() {
    let mut buf = Vec::new();

    let mut archive =
        SevenZReader::open("Memes.7z", Password::empty()).expect("Error opening 7z archive");
    let _ = archive.for_each_entries(|entry, reader| {
        println!("Reading file {}", &entry.name);
        if "FcGD7nuX0AgQNS_.jpg" == entry.name {
            println!("*** Found file {}", &entry.name);
            match reader.read_to_end(&mut buf) {
                Ok(_size) => {
                    println!("Have read file {}", &entry.name);
                    return Ok(false);
                }
                Err(e) => {
                    println!("Error reading file {}: {}", &entry.name, &e);
                    return Err(sevenz_rust::Error::from(e));
                }
            }
        } else {
            // consume the reader to skip the file, even if we don't need it
            while let Ok(n) = reader.read(&mut [0; 4096]) {
                if n == 0 {
                    break;
                }
            }
            Ok(true)
        }
    });
    assert!(!buf.is_empty())
}
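
The skip loop above stops silently on the first read error and creates a fresh buffer on every iteration. As a minimal sketch using only the standard library, the same "read and discard" step can be written with `std::io::copy` and `std::io::sink`. The helper name `drain` is hypothetical and is not part of sevenz_rust.

```rust
use std::io::{self, Read};

/// Reads `reader` to the end and throws the bytes away.
/// Returns the number of bytes discarded. Unlike a `while let Ok(n)` loop,
/// an I/O error is returned to the caller instead of being ignored.
pub fn drain<R: Read + ?Sized>(reader: &mut R) -> io::Result<u64> {
    io::copy(reader, &mut io::sink())
}
```

Inside the closure, the `else` branch could then become `drain(reader)?; Ok(true)`, assuming the closure's error type converts from `std::io::Error` (the code above already calls `sevenz_rust::Error::from(e)` on an I/O error).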

Revertron · 2023-08-12T14:23:58Z

Thanks for the quick response!
This works, but it is very slow, even if I make the buffer 2 MB, move it out of the closure, and reuse it.

Is there any way to make it faster? :(

Revertron · 2023-08-13T11:59:11Z

I went through the reader code, and I think we need to change all those R: Read bounds to Read + Seek, and then just skip the unread bytes.
But there is a problem with using multiple traits in a trait object: https://doc.rust-lang.org/error_codes/E0225.html
So we need to create a separate trait, like this:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=66c772a420cb50c0fa78ab3d91bda052
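
For readers who cannot open the playground link, a common workaround for E0225 looks roughly like the sketch below. It is not copied from the linked gist. The names `ReadSeek` and `skip_then_read` are hypothetical, and the sketch only covers the trait-object error on a plain seekable source.

```rust
use std::io::{Read, Seek, SeekFrom};

/// Hypothetical helper trait: combines `Read` and `Seek` so the pair can be
/// used as one trait object (`dyn Read + Seek` is rejected with E0225).
pub trait ReadSeek: Read + Seek {}

// Blanket impl: every sized type that is both `Read` and `Seek` is a `ReadSeek`.
impl<T: Read + Seek> ReadSeek for T {}

/// Skips `n` bytes by seeking forward, then reads everything that follows.
pub fn skip_then_read(src: &mut dyn ReadSeek, n: u64) -> std::io::Result<Vec<u8>> {
    src.seek(SeekFrom::Current(n as i64))?;
    let mut rest = Vec::new();
    src.read_to_end(&mut rest)?;
    Ok(rest)
}
```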

dyz1990 · 2023-08-14T01:04:06Z

@Revertron Because the data being decompressed depends on the data that precedes it, you cannot simply skip the earlier data and decompress only the later part. This is why the reader does not implement the Seek trait.
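
To illustrate the dependency, here is a toy model, not the LZMA codec that 7z actually uses. Each stored byte is the difference from the previous decoded byte, so reading a range in the middle still requires decoding everything before it. Both function names are made up for this sketch.

```rust
/// Toy stand-in for a solid stream: each stored byte is the difference from
/// the previous decoded byte, so decoding is inherently sequential.
pub fn delta_encode(data: &[u8]) -> Vec<u8> {
    let mut prev = 0u8;
    data.iter()
        .map(|&b| {
            let d = b.wrapping_sub(prev);
            prev = b;
            d
        })
        .collect()
}

/// Decodes the range `start..end`. The loop still has to run over every
/// byte before `start` in order to rebuild the decoder state.
pub fn delta_decode_range(encoded: &[u8], start: usize, end: usize) -> Vec<u8> {
    let mut prev = 0u8;
    let mut out = Vec::with_capacity(end.saturating_sub(start));
    for (i, &d) in encoded.iter().enumerate().take(end) {
        prev = prev.wrapping_add(d);
        if i >= start {
            out.push(prev);
        }
    }
    out
}
```

Starting the decoder at `encoded[start..]` with fresh state produces different bytes, which is the toy equivalent of a checksum failure.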

Revertron · 2023-08-14T12:31:18Z

But the 7zip app definitely skips all the blocks before the block that contains the file being extracted. Is it possible to implement this?

dyz1990 · 2023-08-15T05:33:52Z

> But the 7zip app definitely skips all the blocks before the block that contains the file being extracted. Is it possible to implement this?

It's not easy, but I'll give it a try.

dyz1990 · 2023-08-15T09:05:06Z

@Revertron I noticed that the file "Memes.7z" contains more than one solid stream, so you can speed up decompression by skipping the streams that don't contain the files you need.

You can check the example forder_dec.rs.
There is also the example mt_decompress.rs if you want to use multiple threads.
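
The idea of skipping whole streams can be sketched without the crate. Given a mapping from file index to solid-block index, only the blocks that hold a wanted file need to be decoded. The mapping layout and the function name below are assumptions made for illustration. The real data structures are in the crate and in the examples named above.

```rust
use std::collections::BTreeSet;

/// `file_to_block[i]` is the index of the solid block holding file `i`
/// (`None` for entries without data, such as directories). Hypothetical layout.
/// Returns the set of blocks that must be decoded to extract `wanted`.
pub fn blocks_to_decode(file_to_block: &[Option<usize>], wanted: &[usize]) -> BTreeSet<usize> {
    wanted
        .iter()
        .filter_map(|&f| file_to_block.get(f).copied().flatten())
        .collect()
}
```

Within each selected block, the entries that come before the wanted file still have to be read and discarded, as discussed earlier in the thread.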

pavpen · 2023-11-28T14:58:20Z

I think you should, at least, document this issue in the description of for_each_entries, and related functions. I spent a day debugging my code to end up here.

dyz1990 · 2023-12-05T00:50:30Z

@pavpen Sorry about that. I'll add documentation for the method.

dyz1990 added the "enhancement" (New feature or request) label on Aug 15, 2023
