Syntax for converting existing tar.xz archive to indexed pixz file? #98

Open
satmandu opened this issue Jul 15, 2021 · 1 comment

@satmandu

Just running pixz on an existing tar file creates a tpxz file, but when I then extract from a piped file, I get a warning about seeking, which I presume implies that an index was not created.

E.g.

cd /tmp
curl -OLf https://gitlab.com/api/v4/projects/26210301/packages/generic/glibc/2.27_x86_64/glibc-2.27-chromeos-x86_64.tar.xz
xz -d glibc-2.27-chromeos-x86_64.tar.xz
pixz -9 glibc-2.27-chromeos-x86_64.tar 
curl -Ls file:///tmp/glibc-2.27-chromeos-x86_64.tpxz  |  pixz -x usr/local/lib64/Scrt1.o usr/local/lib64/crti.o usr/local/lib64/crtn.o | tar x
can not seek in input: Illegal seek

(Here I get everything extracted, instead of just the files I want. The goal is to later just extract the files directly piped from a curl download.)

@stephane-chazelas

To recompress with pixz, you'd do:

pixz -d < glibc-2.27-chromeos-x86_64.tar.xz | pixz -9 > glibc-2.27-chromeos-x86_64.tpxz

Storing the intermediate uncompressed version is unnecessary and wasteful.

pixz allows you to extract individual members without decompressing the archive from the start: it stores an index at the end of the archive that records where each member is located in the compressed file.

It can only do that if it can first seek to the end of the file to read that index, and then seek back within the archive to extract those members wherever they are found. That is why it only works when the input is a seekable file.

In curl... | pixz -x member, pixz's input is a pipe. Pipes are not seekable: you can't skip to the end and seek back on them.
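If it helps to see the difference, here is a minimal sketch of how a script could tell what kind of input it has been given. The `stdin_kind` function is a hypothetical helper written for illustration, not something pixz provides, and it assumes a system where `/dev/stdin` can be tested with `[ -p ]` and `[ -f ]` (true for bash, and for other shells on Linux).

```shell
# Hypothetical helper (not part of pixz): report what kind of file stdin is.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo "pipe"            # e.g. the read end of "curl ... |": not seekable
    elif [ -f /dev/stdin ]; then
        echo "regular file"    # e.g. "< archive.tpxz": seekable
    else
        echo "other"
    fi
}

# stdin is the read end of a pipe here
echo data | stdin_kind

# stdin is redirected from a regular file here
tmp=$(mktemp)
stdin_kind < "$tmp"
rm -f "$tmp"
```

The second form is the situation pixz needs in order to reach the index at the end of the archive.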

You'd need:

pixz < glibc-2.27-chromeos-x86_64.tpxz -x member

Or

pixz -i glibc-2.27-chromeos-x86_64.tpxz -x member

For the input to be the seekable file in the current directory.
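For the original goal of extracting a few files from a download, one option is to save the download to a temporary file first so that pixz gets a seekable input. The following is only a sketch: `fetch_and_extract` is a hypothetical helper, the URL in the usage comment is a placeholder, and it assumes curl, pixz and tar are on the PATH.

```shell
# Hypothetical helper: download a .tpxz to a temporary file so that pixz
# reads from a seekable file, then extract only the requested members.
fetch_and_extract() {
    url=$1; shift
    tmp=$(mktemp) || return 1
    # -f: fail on HTTP errors, -s: quiet, -L: follow redirects
    if curl -fsL -o "$tmp" "$url"; then
        # stdin is a regular file here, so pixz can seek to its index
        pixz -x "$@" < "$tmp" | tar x
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (placeholder URL):
# fetch_and_extract https://example.com/archive.tpxz usr/local/lib64/crti.o
```

Note that the whole archive is still downloaded; what the index saves is decompression work, not network transfer.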

I'd agree the man page could be clarified.

The fact that pixz extracts the entire archive when asked to extract only some members when the input is non-seekable is also counterintuitive.

The rationale for pixz to do that might be that if you do it as pixz -x member | tar xf - member as opposed to just tar xf -, that still lets you extract the member even if the input is not seekable (though less efficiently in that case). That's inconsistent though with the behaviour when trying to extract members of a tar.xz archive without index.
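The tar-side filtering mentioned above can be tried without pixz at all. This sketch uses a plain tar archive and `cat` as a stand-in for the `curl ... | pixz -x member` part of the pipeline; the directory and file names are made up for the demonstration.

```shell
# Build a small tar archive with two members in a scratch directory.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/out"
echo one > "$work/src/a.txt"
echo two > "$work/src/b.txt"
tar cf "$work/demo.tar" -C "$work/src" a.txt b.txt

# Feed the archive through a pipe (non-seekable) and let tar pick the member.
# tar reads the whole stream but only writes the member named on its command line.
cat "$work/demo.tar" | tar xf - -C "$work/out" a.txt

ls "$work/out"
```

That is why naming the member on the tar side still gives the right files from a pipe, at the cost of decompressing everything.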
