Adds method to allow for extracting data at a given offset. #275

bitwisejb · 2023-04-28T03:28:11Z

Fixes #274

Changes proposed in this PR

Adds a method that allows the user to extract data given an entry, size, and offset.

Tests performed

Added unit tests to verify the change against a standard and a zip64 compressed archive.
Verified existing tests work.
Ran SwiftLint and saw no issues.

weichsel · 2023-05-09T06:46:41Z

Hi bitwisejb,
Thanks for providing this PR.
Can you explain your use case for this addition? It seems like you want to achieve random access into an archive file based on an entry starting position and some arbitrary offset.
The focus of ZIP Foundation is to provide a structured way to access the content of an archive on a per-entry basis, abstracting away the internals (offsets, lengths, compression, ...) of ZIP files.
While your addition makes use of some metadata (e.g. the offset at which the entry data begins), it mainly performs low-level seek/file access that could be achieved without using ZIP Foundation. API users that call into your new extract method would get back a blob of data without any context. Reading chunks of compressed entries that way wouldn't make much sense since they'd be impossible to decompress at the call site.

Would it help for your usage scenario to expose e.g. Entry.dataOffset?

bitwisejb · 2023-05-12T14:19:39Z

I believe making Entry.dataOffset public would work for my use case. The archive I am working with has map imagery stored in a folder structure designating levels and tile positions. One entry in the archive is an index for locating tile image data for a given xyz. We use xyz to determine the entry and offset for the image to extract. Byte count is known to us based on information in the index entry.
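For illustration, the raw read described here could look roughly like the following. This is a minimal sketch using only Foundation, not the method added in this PR: `readStoredBytes` and its parameter names are hypothetical, and `entryDataOffset` stands in for the value that a public `Entry.dataOffset` would supply. It only makes sense for entries stored without compression.

```swift
import Foundation

/// Hypothetical helper: reads `count` bytes that start `offsetInEntry` bytes
/// into an entry whose data begins at `entryDataOffset` in the archive file.
/// Assumes the entry is stored uncompressed, so file bytes equal entry bytes.
func readStoredBytes(archiveURL: URL,
                     entryDataOffset: UInt64,
                     offsetInEntry: UInt64,
                     count: Int) throws -> Data {
    let handle = try FileHandle(forReadingFrom: archiveURL)
    defer { try? handle.close() }
    // Jump straight to the requested position; nothing before it is read.
    try handle.seek(toOffset: entryDataOffset + offsetInEntry)
    // `read(upToCount:)` returns nil at end of file, so fall back to empty data.
    return try handle.read(upToCount: count) ?? Data()
}
```

The caller is responsible for knowing that the requested range lies inside the entry; the sketch performs no bounds check against the entry size.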

bitwisejb · 2023-05-13T03:39:36Z

@weichsel It looks like we would need the fileHandle for the archive. Would exposing Archive.archiveFile be an option as well?

bitwisejb · 2023-11-09T21:29:16Z

@weichsel We need to be able to extract a single entry from a Zip file with a compression level of 0 without extracting the entire archive. The method that was originally put in place enables this functionality. We appreciate the thought and design that you have put in place that hides the lower level details. You had mentioned above that there may be a way to accomplish this without this change. What would be a good approach for accomplishing this, or is there a change you would recommend that could introduce this functionality?

weichsel · 2023-11-09T21:44:26Z

@bitwisejb

> We need to be able extract a single entry from a Zip file with a compression level of 0 without extracting the entire archive.

You can subscript into an archive via path: https://github.com/weichsel/ZIPFoundation#accessing-individual-entries.
This will provide you access to an entry without having to extract the whole archive first.

> You had mentioned above that there may be a way to accomplish this without this change. What would be a good approach for accomplishing this, ...

After retrieving the entry, you can use the closure-based Archive.extract method: https://github.com/weichsel/ZIPFoundation#closure-based-reading-and-writing
This will allow you to perform chunk-wise reads on the contents of your entry. The sample code in the README uses the basic version of this method. Please refer to the docs for more info. For example, there's a bufferSize parameter that allows you to control the size of the data chunks passed into the closure.
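As a rough sketch of how a byte range could be picked out of such chunk-wise reads, the helper below keeps only the bytes that fall inside a requested window. `RangeCollector` is a hypothetical name and uses only Foundation; it is not part of ZIP Foundation.

```swift
import Foundation

/// Hypothetical helper: collects the bytes in [start, start + count)
/// from a stream of chunks delivered in order.
struct RangeCollector {
    let start: Int
    let count: Int
    private(set) var position = 0        // number of stream bytes seen so far
    private(set) var collected = Data()  // bytes inside the requested window

    init(start: Int, count: Int) {
        self.start = start
        self.count = count
    }

    /// True once the whole window has been collected.
    var isComplete: Bool { collected.count >= count }

    mutating func consume(_ chunk: Data) {
        let chunkStart = position
        let chunkEnd = position + chunk.count
        position = chunkEnd
        // Overlap between this chunk and the requested window, in stream positions.
        let lower = max(start, chunkStart)
        let upper = min(start + count, chunkEnd)
        guard lower < upper else { return }
        // Translate stream positions into indices of this chunk.
        let from = chunk.startIndex + (lower - chunkStart)
        let to = chunk.startIndex + (upper - chunkStart)
        collected.append(chunk[from..<to])
    }
}
```

Assuming the closure-based API from the README, a collector declared as `var collector` would be fed from the consumer closure, along the lines of `_ = try archive.extract(entry, bufferSize: 64 * 1024) { chunk in collector.consume(chunk) }`; the buffer size here is an arbitrary example value. The entry is still read sequentially from its start, so every byte before the requested window is read and discarded.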

bitwisejb · 2023-12-20T22:42:56Z

@weichsel We have investigated this API in the past, but it is inefficient for extracting a known set of bytes from a zip file that may be several gigabytes in size. Our use case requires high-volume random access to well-known locations (offset and size) within the zip file without additional overhead. Perhaps there is a more performant API that I am not aware of.

We have been using a fork of this repo with the included functions for some time with great success. We wish to contribute this back to this repo and change to using this repo so that we may benefit from any future contributions.

Please advise on what we can do to move this change forward. Otherwise, we will be left working with our fork.

Adds method to allow for extracting data at a given offset.

af1746f

bitwisejb closed this Nov 9, 2023

bitwisejb reopened this Nov 9, 2023

weichsel closed this Nov 15, 2023
