Add metadata for recursive compressed TARs into database of outer TAR #79

Open
mxmlnkn opened this issue Mar 19, 2022 · 0 comments
Labels
performance Something is slower than it could be

Comments

@mxmlnkn
Owner

mxmlnkn commented Mar 19, 2022

Since its very first version, ratarmount has been able to work with recursive archives, but only with uncompressed TARs inside TARs inside TARs. This case is easy because the same outer metadata database can be reused to simply store the name and position. A file inside a TAR inside a TAR does not change the format much: its position is only shifted by 512 B because of the metadata of the nested TAR archive itself, but everything else stays the same. The files inside the recursive TAR even have the same 512 B alignment.
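As a minimal sketch of why the uncompressed case is easy, the following uses only Python's standard `tarfile` module (not ratarmount itself) to show that the data of a file in a nested TAR can be read directly from the outer archive by adding two offsets. The helper `make_tar` and the file names are made up for this illustration, and the USTAR format is chosen explicitly so that no extra extended headers shift the offsets.

```python
import io
import tarfile


def make_tar(files):
    """Build an uncompressed TAR in memory from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


inner = make_tar({"hello.txt": b"hello nested world"})
outer = make_tar({"inner.tar": inner})

with tarfile.open(fileobj=io.BytesIO(outer)) as outer_tar:
    # Offset of the nested TAR's first byte inside the outer TAR.
    inner_start = outer_tar.getmember("inner.tar").offset_data

with tarfile.open(fileobj=io.BytesIO(inner)) as inner_tar:
    file_info = inner_tar.getmember("hello.txt")

# Position of the nested file's data relative to the start of the OUTER archive.
absolute_offset = inner_start + file_info.offset_data
payload = outer[absolute_offset:absolute_offset + file_info.size]
print(absolute_offset, payload)
```

Because both archives use 512 B blocks, the combined offset stays 512 B aligned, so one row per nested file with its name and absolute offset is enough in this case.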

For recursive compressed TARs, this becomes hard to do because of the compression layers in between and because of the implementation. Uncompressed recursive TARs are resolved by the SQLiteIndexedTar class itself, but to mount nested compressed archives, the AutoMountLayer class opens a new and completely independent SQLiteIndexedTar instance, which uses a separate index file.

The problem with the index files for the nested archives is that their save location is unclear and possibly temporary, i.e., in memory. This means that they have to be recreated, possibly at high cost, on each new mount. An alternative might be to store all of them as separate files beside the outermost archive, but that might become unwieldy. It would be nice if all of the nested metadata could be appended as separate SQLite tables to the outermost index file itself.
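One conceivable way to append a nested index to the outer index file is SQLite's `ATTACH DATABASE`. The sketch below is only an illustration with the standard `sqlite3` module: the function `persist_nested_index`, the table name `files_nested_1`, the index file name, and the reduced four-column `files` schema are all assumptions and do not reflect ratarmount's actual code or index layout.

```python
import os
import sqlite3
import tempfile


def persist_nested_index(nested_db, outer_index_path, table_name):
    """Copy the 'files' table of a nested (e.g. in-memory) index into the outer index file."""
    nested_db.commit()  # Make sure no transaction is open before attaching.
    nested_db.execute("ATTACH DATABASE ? AS outer_index", (outer_index_path,))
    try:
        nested_db.execute(
            f'CREATE TABLE outer_index."{table_name}" AS SELECT * FROM main.files'
        )
        nested_db.commit()
    finally:
        nested_db.execute("DETACH DATABASE outer_index")


SCHEMA = "CREATE TABLE files (path TEXT, name TEXT, offset INTEGER, size INTEGER)"

with tempfile.TemporaryDirectory() as folder:
    outer_index_path = os.path.join(folder, "archive.tar.index.sqlite")

    # Hypothetical outer index that only knows about the nested archive as a plain file.
    outer_db = sqlite3.connect(outer_index_path)
    outer_db.execute(SCHEMA)
    outer_db.execute("INSERT INTO files VALUES ('', 'nested1.tar.bz2', 512, 4096)")
    outer_db.commit()
    outer_db.close()

    # Hypothetical in-memory index of the nested compressed archive.
    nested_db = sqlite3.connect(":memory:")
    nested_db.execute(SCHEMA)
    nested_db.execute("INSERT INTO files VALUES ('', 'a.txt', 512, 11)")
    persist_nested_index(nested_db, outer_index_path, "files_nested_1")
    nested_db.close()

    # On the next mount, the nested metadata can be read back without re-indexing.
    reopened = sqlite3.connect(outer_index_path)
    tables = sorted(
        row[0]
        for row in reopened.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    )
    nested_rows = reopened.execute("SELECT name, size FROM files_nested_1").fetchall()
    reopened.close()
```

Note that `CREATE TABLE ... AS SELECT` copies the rows but not constraints or indexes, so a real implementation would probably create the target table explicitly first.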

The table names could be cleaned-up paths or hashes, or there could be a separate lookup table that links nested paths to nested metadata tables named with a simple incrementing suffix. The outermost archive's table should be kept unchanged to keep the index format backward compatible.
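The lookup-table variant could look roughly like the following sketch. The names `nested_archives`, `files_<id>`, and `register_nested_archive`, as well as the simplified column list, are hypothetical and chosen only to illustrate the idea; the existing outer `files` table is never touched.

```python
import sqlite3

# Simplified stand-in for the per-archive metadata columns (an assumption).
FILES_COLUMNS = "path TEXT, name TEXT, offset INTEGER, size INTEGER, PRIMARY KEY (path, name)"


def register_nested_archive(db, nested_path):
    """Return the metadata table name for a nested archive, creating it if necessary."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS nested_archives ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT UNIQUE NOT NULL)"
    )
    row = db.execute(
        "SELECT id FROM nested_archives WHERE path = ?", (nested_path,)
    ).fetchone()
    if row is None:
        cursor = db.execute("INSERT INTO nested_archives (path) VALUES (?)", (nested_path,))
        archive_id = cursor.lastrowid
    else:
        archive_id = row[0]

    # The incrementing ID becomes the table suffix, so no path cleaning or hashing is needed.
    table = f"files_{archive_id}"
    db.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({FILES_COLUMNS})')
    return table


db = sqlite3.connect(":memory:")
db.execute(f"CREATE TABLE files ({FILES_COLUMNS})")  # Outermost table stays as it is.

first = register_nested_archive(db, "/nested1.tar.bz2")
second = register_nested_archive(db, "/nested2.tar.gz")
again = register_nested_archive(db, "/nested1.tar.bz2")
lookup = db.execute("SELECT id, path FROM nested_archives ORDER BY id").fetchall()
```

Older versions that only know the `files` table would simply ignore the additional tables, which is what keeps this layout backward compatible.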

This would only be relevant for hierarchies like this:

archive.tar
    nested1.tar.bz2
    nested2.tar.gz
    nested3.xz
    ...

Note that this is not (yet) relevant for RAR and ZIP because they have built-in collected metadata and are therefore used directly by ratarmount without any performance optimizations. They are basically just a gimmick ;) but they still work and are surprisingly useful, at least to me.
