Releases: fcorbelli/zpaqfranz
Fixed
Windows 32/64 binary, 64 bit-HW accelerated
New hash command
There is a new hash command, a simplified version of sum
The "old" sum command is designed to quickly locate duplicate files (and do many other things).
However, it processes ALL the files and writes the output only after finishing.
This can be undesirable: you may prefer to get the data as it is computed, sorted alphabetically
Examples
SHA1 of all files: hash z:\knb
SHA1 of all files, multithread: hash z:\knb -ssd
XXH3 multithreaded: hash z:\knb -ssd -xxh3
SHA256 stored to file: hash z:\knb -ssd -sha256 -stdout -out 1.txt
Without -ssd (aka: single thread) the hashes are written as soon as possible
C:\zpaqfranz\release\59_5>zpaqfranz hash * -xxh3
zpaqfranz v59.5h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-14)
franz:hash 9 - command
franz:-xxh3 -hw
Hashing XXH3 ignoring .zfs and :$DATA
6f9035a81441334b2dfc812fa675741c zpaqfranz.cpp
fdffed33ceed9450459a64a45b73533c zpaqfranz.exe
7fa3ef824610ffb5e635e1eef138b635 zpaqfranz32.exe
ca37708d208a851cedd6b40daa904d1f zpaqfranzhw.exe
0.032 seconds (00:00:00) (all OK)
The -stdout switch "cleans up" the output
C:\zpaqfranz\release\59_5>zpaqfranz hash * -xxh3 -ssd -stdout
6f9035a81441334b2dfc812fa675741c zpaqfranz.cpp
fdffed33ceed9450459a64a45b73533c zpaqfranz.exe
7fa3ef824610ffb5e635e1eef138b635 zpaqfranz32.exe
ca37708d208a851cedd6b40daa904d1f zpaqfranzhw.exe
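The difference between the two behaviours (hash writes results as they are computed, sum reports only at the end) can be sketched in Python. This is only an illustration, not zpaqfranz's code: the function names are hypothetical, and SHA-1 from hashlib stands in for whichever hasher is selected.

```python
import hashlib
import os
import tempfile
from typing import Iterator, List, Tuple


def sha1_of_file(path: str, bufsize: int = 1 << 20) -> str:
    """Hash one file with fixed-size reads, so large files never sit in RAM."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def hash_streaming(folder: str) -> Iterator[Tuple[str, str]]:
    """hash-like: visit names alphabetically and yield each result at once."""
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            yield sha1_of_file(path), name


def hash_collect_then_report(folder: str) -> List[Tuple[str, str]]:
    """sum-like: hash everything first, then report. Sorting by digest
    puts duplicate files next to each other."""
    return sorted(hash_streaming(folder))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, data in (("b.txt", b"same"), ("a.txt", b"same"), ("c.txt", b"other")):
            with open(os.path.join(tmp, name), "wb") as f:
                f.write(data)
        for digest, name in hash_streaming(tmp):
            print(digest, name)  # printed as soon as each file is done
```

The generator is what makes the output incremental: the caller can print each line while the next file is still being read.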
Command s with -home switch
Calculation of total folder size from depth 1
Translation
A very common need is to determine the size of the folders in /home (or /users), or perhaps in a virtual machine store
Of course there are a thousand different ways to do this, but it still requires a lot of effort
Here -home comes to the rescue
C:\zpaqfranz\release\59_5>zpaqfranz s c:\users -home -ignore
zpaqfranz v59.5h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-14)
franz:-home -hw -ignore
homesize
Scanning 5 subfolders...
----------------------------------------------------------------------------------------------------
2.461.000.730 00005693 c:/users/All Users/
0 00000000 c:/users/Default User/
1.568.227 00000103 c:/users/Default/
929.689.685 00008663 c:/users/Public/
35.207.884.218 00075389 c:/users/utente/
0.625 seconds (00:00:00) (all OK)
Or
C:\zpaqfranz\release\59_5>zpaqfranz s k:\vm -home
zpaqfranz v59.5h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-14)
franz:-home -hw
homesize
Scanning 40 subfolders...
----------------------------------------------------------------------------------------------------
4.956.190.019 00000011 k:/vm/arc/
10.189.045.679 00000018 k:/vm/bsd132/
5.547.460.994 00000008 k:/vm/centos8/
19.201.351.966 00000016 k:/vm/debian11/
15.272.337.785 00000016 k:/vm/debian_zpaq/
6.293.968.747 00000019 k:/vm/debpdc/
382.132.482 00000009 k:/vm/esxi65/
12.738.841.135 00000021 k:/vm/esxi_static/
4.321.475.703 00000017 k:/vm/esxmanager/
7.909.408.806 00000014 k:/vm/fedora25_32/
11.372.514.545 00000015 k:/vm/fedora34/
5.572.980.650 00000011 k:/vm/fedora64_32bitcompiler/
7.725.149.892 00000011 k:/vm/fedorapdc/
10.288.774.907 00000014 k:/vm/haiku/
49.335.277.197 00000229 k:/vm/ltzc/
28.541.177.353 00000014 k:/vm/lubuntu/
40.400.088.051 00000016 k:/vm/manjaro/
11.639.660.189 00000011 k:/vm/mx23/
19.976.497.088 00000014 k:/vm/nextcloud/
1.656.662.061 00000013 k:/vm/omnios/
2.094.612.004 00000013 k:/vm/openbsd71/
3.334.662.184 00000007 k:/vm/opensolaris/
3.555.826.244 00000014 k:/vm/opensuse/
2.830.315 00000007 k:/vm/os2/
(...)
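What -home reports (total bytes and file count for each first-level subfolder) can be approximated with a short Python sketch. It is an assumption-laden illustration, not the zpaqfranz implementation: the function names are hypothetical, and unreadable entries are simply skipped.

```python
import os
from typing import Dict, Tuple


def tree_totals(root: str) -> Tuple[int, int]:
    """Return (total bytes, file count) for everything below root."""
    total_bytes = 0
    total_files = 0
    # os.walk skips folders it cannot list (its default onerror is None).
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
                total_files += 1
            except OSError:
                pass  # unreadable or vanished file: skip it
    return total_bytes, total_files


def home_sizes(parent: str) -> Dict[str, Tuple[int, int]]:
    """One (bytes, files) entry per first-level subfolder of parent."""
    result = {}
    for entry in sorted(os.scandir(parent), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            result[entry.name] = tree_totals(entry.path)
    return result


if __name__ == "__main__":
    for name, (size, count) in home_sizes(".").items():
        print(f"{size:>20,d} {count:08d} {name}/")
```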
Command hash with -orderby
Just a little evolution
C:\zpaqfranz\release\59_5>zpaqfranz sum * -xxh3 -orderby size -desc
zpaqfranz v59.5h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-14)
franz:sum 1 - command
franz:-orderby <<size>>
franz:-xxh3 -desc -hw
Getting XXH3 ignoring .zfs and :$DATA
No multithread: Found (12.84 MB) => 13.468.890 bytes (12.84 MB) / 4 files in 0.016000
|XXH3: 6F9035A81441334B2DFC812FA675741C [ 3.600.090] |zpaqfranz.cpp
|XXH3: FDFFED33CEED9450459A64A45B73533C [ 3.454.464] |zpaqfranz.exe
|XXH3: CA37708D208A851CEDD6B40DAA904D1F [ 3.228.160] |zpaqfranzhw.exe
|XXH3: 7FA3EF824610FFB5E635E1EEF138B635 [ 3.186.176] |zpaqfranz32.exe
0.032 seconds (00:00:00) (all OK)
-ignore
Suppresses most of the errors displayed due to problems (e.g., missing rights).
You can scan folders like c:\users without flooding the console with errors.
Of course, it is risky
C:\zpaqfranz\release\59_5>zpaqfranz hash c:\users -ssd
zpaqfranz v59.5h-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-05-14)
franz:hash 9 - command
franz:-hw -ssd
Hashing SHA-1 ignoring .zfs and :$DATA
error kind 32 ERROR_SHARING_VIOLATION opening <<c:/Users/utente/AppData/Local/Dropbox/QuitReports/002b211c-7d33-4799-8bc4-2f218c2b0152.dbt>>
error kind 32 ERROR_SHARING_VIOLATION opening <<c:/Users/utente/AppData/Local/Dropbox/instance1/sync/temp/6831b1ee7d30f187error kind 32 ERROR_SHARING_VIOLATION opening <<c:/Users/utente/AppData/Local/Dropbox/logs/1/1-94cf-664312a4.tmp>>
error kind 32 ERROR_SHARING_VIOLATION opening <<c:/Users/utente/AppData/Local/Dropbox/instance1/sync/temp/49fb0e1811f7c156error kind 32 ERROR_SHARING_VIOLATION opening <<c:/Users/utente/AppData/Local/Dropbox/ssa_events/store>>
error kind 32 ERROR_SHARING_VIOLATION opening <<c:/Users/utente/AppData/Local/Dropbox/instance1/sync/temp/e863aaf99a8c2d99>>
(...)
vs
C:\zpaqfranz\release\59_5>zpaqfranz hash c:\users -ignore
different timetohuman 00:00:00
Removed a "0" in time, if not necessary (!)
substituted /sec with /s
In the speed information, /sec has been replaced by /s. They were very useful for debugging, but whatever 😄
minor fixes in -noeta
fixes in update (on *nix)
It should now give clearer information (!) about the availability of updates on non-Windows systems
root@nsz:/zp# zpaqfranz update
zpaqfranz v59.5h-JIT-L,HW SHA1/2,(2024-05-14)
Checking internet update (-verbose for details)
Testing internet version...
Your 59.5h (2024-05-14) is not older of 59.5h (2024-05-14) => nothing to do
0.065 seconds (00:00:00) (all OK)
Windows 32/64 binary, 64 bit-HW accelerated
crop command
This new "crop" or "drop" will delete the latest version(s) from a (non multipart) archive
Have you ever wanted to delete an accidentally added version, perhaps a very large one?
Now you can!
By default DRY RUN (only test)
-kill Do a 'wet' (effective) run
-to tiny.zpaq Reduce to tiny.zpaq (safer)
-until X Discard every version > X
-maxsize X Manually cut at X (RISKY)
-force Crop in-place (no backup: VERY RISKY!)
Examples:
Reduce file (dry run, just infos): crop z:\1.zpaq
Reduce up to version 100: crop z:\1.zpaq -to d:\2.zpaq -until 100 -kill
Reduce to first 100.000: crop z:\1.zpaq -to d:\2.zpaq -maxsize 100k -kill
Crop in place (NO BACKUP! RISKY!): crop z:\1.zpaq -until 2 -kill -force
range in list
It is now possible to see the files added in a single version, or in multiple versions
What files were changed yesterday?
-range X:Y Range versions [X-Y]
-range X: Range versions [X-THE LAST]
-range :X Range versions [1-X]
-range X Range is single version X
-range ::X Last X versions
Examples:
Files added in versions 2 and 3: l z:\1.zpaq -range 2:3
Files added in last version: l z:\1.zpaq -range ::1
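The five -range forms listed above map to a first and last version once the number of versions in the archive is known. A minimal Python sketch of that mapping follows; parse_range is a hypothetical helper, and clamping "::X" at version 1 is an assumption, not documented behaviour.

```python
from typing import Tuple


def parse_range(spec: str, last: int) -> Tuple[int, int]:
    """Turn a -range argument into (first, last) version numbers.

    spec is one of X:Y, X:, :X, X or ::X; last is the newest version.
    """
    if spec.startswith("::"):
        count = int(spec[2:])                 # ::X  -> the last X versions
        return max(1, last - count + 1), last
    if ":" not in spec:
        version = int(spec)                   # X    -> single version X
        return version, version
    left, right = spec.split(":", 1)
    first = int(left) if left else 1          # :X   -> from version 1
    final = int(right) if right else last     # X:   -> up to the last version
    return first, final


if __name__ == "__main__":
    for spec in ("2:3", "4:", ":3", "7", "::1"):
        print(spec, parse_range(spec, last=10))
```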
-sfx flag (win32)
Directly create a SFX archive
zpaqfranz a z:\test.zpaq *.cpp -sfx
c command -checksum switch
Changed -verify to -checksum to avoid switch collisions
Flag -nomac
Skips Mac's .DS_Store, Thumbs.db (yep, sometimes even Thumbs.db!), .something
Why do Macs sometimes fill NAS Samba shares with tens of thousands of useless files?
./ autoadd in backup
Minor fix for backup command on *nix
VSS filename fix (win32)
Automagically rename VSS files
Cortex franzomips
QNAP cheap NAS CPU benchmark
-verbose in dump now shows blocks offset (from file start)
Windows 32/64 binary, 64 bit-HW accelerated
New update command
There is a new command, update (or upgrade), to check for newer zpaqfranz versions (on every platform) and to download and install them (on Win64)
This new command downloads the latest info file (by default from my site, http://www.francocorbelli.it/zpaqfranz)
So you can easily see whether you are using the latest version, or an older one
Check for updates (works even on *nix)
zpaqfranz update
On Win64: update if newer executable
zpaqfranz update -force
On Win64: get always from Internet
zpaqfranz update -force -kill
On Win64: get from another site (RISKY)
zpaqfranz update https://www.pippo.com/ugo.sha256 http://www.pluto.com/zpaqnew.exe
TRANSLATION
If you usually use zpaqfranz on Win64 and you want to upgrade it, you give the command update -force. That's all
Warning. Downloading executable programs from the Internet is potentially dangerous. Always choose a reliable source (such as GitHub or SourceForge), or directly my site https://www.francocorbelli.it/zpaqfranz/win64/
New command download (Win64 only, for now)
Download a file, just like wget. Can use a textfile with a MD5/SHA-1/SHA-256 hash
By default DO NOT overwrite (use -force), by default DO check output path (use -space to bypass).
Download file 2.cpp into local 2.cpp.
LOOK AT ./2.cpp It is NOT 2.cpp, it is ./2.cpp !
zpaqfranz download https://www.1.it/2.cpp ./2.cpp
Download and check from a SHA-256 text file
zpaqfranz download http://www.1.it/3.cpp z:\3.cpp -checktxt http://www.1.it/3.sha256
zpaqfranz will automagically detect the type of hash from the length of the hash.
32=MD5
40=SHA-1
64=SHA-256
Download the latest zpaqfranz.exe (Win64) from my site, store into ./thenewfile.exe, test the MD5 hash
zpaqfranz download http://www.francocorbelli.it/zpaqfranz/win64/zpaqfranz.exe ./thenewfile.exe -checktxt http://www.francocorbelli.it/zpaqfranz/win64/zpaqfranz.md5
In the example this resolves to MD5 (because the hash is 32 characters long)
4fc826048e5a66969f468b57deea7b4b zpaqfranz.exe
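The length-based detection described above can be sketched in Python. The helper names are hypothetical, and the assumption that the hash is the first whitespace-separated token of the text file is taken only from the sample line shown; zpaqfranz's own parsing may differ.

```python
import hashlib

# Hex-digest length -> hashlib algorithm name (32=MD5, 40=SHA-1, 64=SHA-256).
HASH_BY_HEX_LENGTH = {32: "md5", 40: "sha1", 64: "sha256"}


def detect_hash(hex_digest: str) -> str:
    """Guess the algorithm from the number of hex characters."""
    try:
        return HASH_BY_HEX_LENGTH[len(hex_digest)]
    except KeyError:
        raise ValueError(f"unsupported hash length {len(hex_digest)}") from None


def verify(data: bytes, checksum_text: str) -> bool:
    """Check data against a line such as '<hex digest> <file name>'."""
    expected = checksum_text.split()[0].lower()   # assumed layout: hash first
    int(expected, 16)                             # raises ValueError if not hex
    actual = hashlib.new(detect_hash(expected), data).hexdigest()
    return actual == expected


if __name__ == "__main__":
    print(detect_hash("4fc826048e5a66969f468b57deea7b4b"))
```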
Improved ads command
Now, on Windows, it can list, strip (remove) or rebuild ADS (alternate data stream) information
It is still being refined and developed, there will be improvements
Show ADS
zpaqfranz ads z:\1.zpaq
Remove ADS (all of them)
zpaqfranz ads z:\*.zpaq -kill
Remove just one ADS
zpaqfranz ads z:\*.zpaq -only fasttxt -kill
Rebuild ADS "everything"
zpaqfranz ads z:\1.zpaq -force
The fasttxt switch works with ADS
Translation. It is possible to compute the updated CRC-32 of an archive, stored in ADS file stream
zpaqfranz a z:\pippo.zpaq c:\dropbox -fasttxt -ads -key pippo
Now you can
zpaqfranz versum z:\pippo.zpaq
Using this ploy it is possible to check the integrity of an entire archive (i.e., not of the files in it, but of the archive itself), WITHOUT the password (if any)!
It also runs the test at virtually the maximum speed of the storage medium (over 2GB/s for NVMe, and even more)
This is critical to quickly check clients' backups WITHOUT knowing their passwords.
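Why no password is needed can be seen from a small Python sketch: the checksum is taken over the raw bytes of the archive file, encrypted or not, so nothing has to be decrypted. The function name is hypothetical, and how zpaqfranz stores and compares the value in the ADS is not shown here.

```python
import os
import tempfile
import zlib


def file_crc32(path: str, bufsize: int = 1 << 20) -> int:
    """CRC-32 of a whole file, computed incrementally in bufsize reads."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(bufsize)
            if not block:
                break
            crc = zlib.crc32(block, crc)   # continue from the running value
    return crc & 0xFFFFFFFF


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "archive.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(3 * 1024 * 1024))
        print(f"{file_crc32(path):08X}")
```

Comparing this value with one stored when the archive was written tells whether the file changed since then, without looking inside it.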
Integration with Samba and PAKKA support is in development.
Pause command can ask for a key to be pressed
Wait for key z
zpaqfranz pause -find z
Better support for Windows 7 64 bit
Windows 7 is a pain in the ass, I hope to have fixed some problems that used to happen
Updated benchmark franzomips list
Added the 7950X3D, my new CPU
zpaqfranz b
zpaqfranz v59.3q-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-04-19)
franz:-hw
uname WIN64
full exename seems <<C:/zpaqfranz/zpaqfranz.exe>>
Free RAM seems 41.274.781.696
SHA1/2 seems supported by CPU
HW disabled, because franzomips. Choose one to keep (ex -sha256)
Benchmarks: XXHASH64 XXH3 SHA-1 SHA-256 BLAKE3 CRC-32 CRC-32C WYHASH WHIRLPOOL MD5 SHA-3 NILSIMSA HIGHWAY64
Time limit 5 s (-n X)
Chunks of 390.62 KB (-minsize Y)
00000005 s XXHASH64: speed ( 6.84 GB/s)
00000005 s XXH3: speed ( 7.37 GB/s)
00000005 s SHA-1: speed ( 982.67 MB/s)
00000005 s SHA-256: speed ( 453.95 MB/s)
00000005 s BLAKE3: speed ( 4.45 GB/s)
00000005 s CRC-32: speed ( 10.18 GB/s)
00000005 s CRC-32C: speed ( 8.18 GB/s)
00000005 s WYHASH: speed ( 10.18 GB/s)
00000005 s WHIRLPOOL: speed ( 229.49 MB/s)
00000005 s MD5: speed ( 944.98 MB/s)
00000005 s SHA-3: speed ( 545.20 MB/s)
00000005 s NILSIMSA: speed ( 10.17 GB/s)
00000005 s HIGHWAY64: speed ( 1.78 GB/s)
Results:
WHIRLPOOL: 229.49 MB/s (done 1.12 GB)
SHA-256: 453.95 MB/s (done 2.21 GB)
SHA-3: 545.20 MB/s (done 2.66 GB)
MD5: 944.98 MB/s (done 4.61 GB)
SHA-1: 982.67 MB/s (done 4.80 GB)
HIGHWAY64: 1.78 GB/s (done 8.92 GB)
BLAKE3: 4.45 GB/s (done 22.26 GB)
XXHASH64: 6.84 GB/s (done 34.20 GB)
XXH3: 7.37 GB/s (done 36.75 GB)
CRC-32C: 8.18 GB/s (done 40.76 GB)
NILSIMSA: 10.17 GB/s (done 50.71 GB)
CRC-32: 10.18 GB/s (done 50.73 GB)
WYHASH: 10.18 GB/s (done 50.75 GB)
franzomips single thread index 5.467 (quick CPU check, raw 5.467)
Atom N2800 (phy) 4 1343.31 %
Xeon E3 1245 V2 (vir) 4 226.39 %
Celeron N5105 (phy) 4 300.73 %
i5-6200U (phy) 2 287.30 %
Xeon E5 2620 V4 (phy) 8 295.85 %
Xeon E5 2630 V4 (phy) 10 352.27 %
Xeon D-1541 (vir) 8 269.19 %
i5-3570 (phy) 4 184.96 %
i7-4790K (phy) 4 167.45 %
i7-8700K (phy) 6 162.81 %
AMD-Ryzen 7 3700X(phy) 8 167.91 %
i9-9900K (phy) 8 142.71 %
i9-10900 (phy) 10 147.56 %
AMD-5950X (phy) 16 113.95 %
i9-12900KS 56400 (phy) 16 101.19 %
AMD-7950X3D (phy) 16 99.80 %
65.094 seconds (000:01:05) (all OK)
Please remember that HW-acceleration is disabled in franzomips, you can check like this
C:\zpaqfranz>zpaqfranz b -sha256
zpaqfranz v59.3q-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-04-19)
franz:-sha256 -hw
uname WIN64
full exename seems <<C:/zpaqfranz/zpaqfranz.exe>>
Free RAM seems 41.258.627.072
SHA1/2 seems supported by CPU
Benchmarks: SHA-256
Time limit 5 s (-n X)
Chunks of 390.62 KB (-minsize Y)
00000005 s SHA-256: speed ( 2.02 GB/s)
Results:
SHA-256: 2.02 GB/s (done 10.09 GB)
Windows 32/64 binary, 64 bit-HW accelerated
New command pakka, for... PAKKA
As everyone knows, or maybe not :) I wrote a Windows GUI for zpaqfranz about 10 years ago
Here it is https://www.francocorbelli.it/pakka/build/latest/pakka_latest.zip
Newer zpaqfranz does support newer PAKKA, without additional zpaqlist [special lister]
TRANSLATION
On Windows you can use a freeware program, PAKKA, available from my site
PAKKA does have an autoupdater (from Internet), therefore it is very easy to get the very latest version
In my spare time I am implementing the various functions for adding files (tests, and just about all the functions), not just extracting
Mainly missing is the online help; I am working on using ADS on small NAS, especially TrueNAS.
Stay tuned.
Being a Delphi program I can evolve it MUCH faster than zpaqfranz
How does it work?
List to file: zpaqfranz pakka h:\zarc\1.zpaq -out z:\default.txt
Disable de-duplicator: zpaqfranz pakka h:\zarc\1.zpaq -all -distinct -out z:\default.txt
Get version 10: zpaqfranz pakka h:\zarc\1.zpaq -until 10 -out z:\10.txt
Bugfixes
Some minor "things"
Windows 32/64 binary, 64 bit-HW accelerated
This new build is of a new branch, includes several new features to be tested, and may therefore contain bugs.
In this case, please report, and I'll fix ASAP
-chunk
After (quite a lot of) digging, here is the fixed-chunk version, a long overdue improvement
Of course there is a lot of complexity to take into account (aka: nothing is easy with zpaq)
zpaq does not allow creating archives of a fixed size: this implies that, generally, the first file is gigantic, and subsequent ones much smaller.
This prevented splitting across optical media (e.g., Blu-ray), and this is bad (incidentally that's "where" I'm going to use it).
The zpaq's operating logic doesn't really provide multiple output files, but now (maybe) it does 😄
Operation, which is still not fully integrated (for example, it is not supported in the backup command), is easy to activate. It is like a normal multipart archive, but with a switch that indicates the maximum size of the pieces
Running with encrypted archives has been difficult, and it is still not 101% tested
zpaqfranz a z:\ronko_?? whatever-you-like -chunk 1G
zpaqfranz a z:\ronko_?? who-knows -chunk 500m
The -chunk switch accepts a plain number (2000000), a K, M or G suffix (100M), or a KB, MB or GB suffix (500MB)
The chunks are not guaranteed to be 100% exact, typically should be multiple of 64KB
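The accepted size notations can be illustrated with a small Python parser. This is a sketch, not zpaqfranz's parser: parse_size is a hypothetical name, and whether K means 1000 or 1024 is not stated in these notes, so the multiplier is left as a parameter (defaulting to 1024 purely as an assumption).

```python
import re

# A plain number, optionally followed by K, M or G and an optional B.
_SIZE = re.compile(r"^(\d+)(?:([KMG])B?)?$", re.IGNORECASE)
_POWER = {"K": 1, "M": 2, "G": 3}


def parse_size(text: str, base: int = 1024) -> int:
    """Convert '2000000', '100M' or '500MB' into a number of bytes."""
    match = _SIZE.match(text.strip())
    if match is None:
        raise ValueError(f"not a size: {text!r}")
    number = int(match.group(1))
    unit = match.group(2)
    if unit is None:
        return number                     # plain number: already bytes
    return number * base ** _POWER[unit.upper()]


if __name__ == "__main__":
    for text in ("2000000", "100M", "500MB", "1G"):
        print(text, parse_size(text))
```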
The created multipart archive SHOULD be fully backward compatible, even with 7.15 (this takes a lot of effort!)
The "real" problem is... Control-C
In other words, abrupt termination is not trivial to handle, because it is necessary to delete created but unfinished pieces, by identifying file handles in order to close them beforehand (otherwise the deletion will fail).
Another problem to be solved is the a priori estimation of the number of parts needed, compared to the number of parts indicated (in other words, the number of ? in the file name). Next release
ADS filelists (alternate data stream)
Running on Windows' NTFS, for unencrypted archives, it is possible to store the file list in the Alternate Data Stream
this way
zpaqfranz a z:\1.zpaq *.cpp -ads
Now the list command will take data from the ADS
zpaqfranz l z:\1.zpaq
To enforce the "standard" way, use -ads in list
zpaqfranz l z:\1.zpaq -ads
Not very refined, a bit of test is necessary
After (quite a lot) of digging I decided to use LZ4 to compress the ADS stream.
It does not reduce the file size much, but it is extremely fast during the extraction phase and, even better, I was able to do it 'in chunks', making the size of the file list essentially limitless, by "hacking" one of LZ4's examples.
And yes, it is the first piece to have zpaq archives "mountable" as a filesystem (in the distant future)
As you know, or maybe not, when you ask for the list of the files inside an archive, every transaction (version) is read from the start of the file
This can become slow over time (big archives with millions of files and thousands of versions)
TRANSLATION
Suppose you have
File1
File2
File3
Inside version 1
And
File4
File5
File6
File7
Inside version 2
(...)
File7000000
File7000001
File7000002
Inside version 1000
If you ask for the list, you first have to decode version 1, then version 2, then version... 1000
zpaqfranz already allows storing those lists inside the archive (-filelist)
But if you ask for a list in the 1000th version, a lot of (slow) work is still needed
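The cost described above can be shown with a toy model in Python. It is deliberately simplified and is not the zpaq file format: each version is just a dictionary of added and removed names, and the function names are hypothetical.

```python
from typing import Dict, List, Set

Version = Dict[str, List[str]]   # {"add": [...], "remove": [...]}


def list_by_replay(journal: List[Version], upto: int) -> Set[str]:
    """Rebuild the file list at version `upto` by replaying versions 1..upto."""
    files: Set[str] = set()
    for version in journal[:upto]:
        files.update(version.get("add", []))
        files.difference_update(version.get("remove", []))
    return files


def build_side_lists(journal: List[Version]) -> List[List[str]]:
    """Precompute the list after every version, as a stored list would provide.
    Reading entry N afterwards needs no replay at all."""
    files: Set[str] = set()
    lists = []
    for version in journal:
        files.update(version.get("add", []))
        files.difference_update(version.get("remove", []))
        lists.append(sorted(files))
    return lists


if __name__ == "__main__":
    journal = [
        {"add": ["File1", "File2", "File3"]},
        {"add": ["File4", "File5"], "remove": ["File2"]},
    ]
    print(sorted(list_by_replay(journal, 2)))
    print(build_side_lists(journal)[-1])
```

Replay work grows with the number of versions, while a ready-made list is a single read; that is the motivation for keeping the list next to the archive.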
Manipulating ADS by new ads command
There is a new command to get ads (on Windows, of course)
Show ADS filelist: ads z:\1.zpaq
Rebuild ADS filelist: ads z:\1.zpaq -force
Remove ADS filelist: ads z:\*.zpaq -kill
-fast: store inside zpaq archive a "kind of" file list (to be developed)
This is an experimental function for future use, which is extremely complex to make work properly
Basically, it involves hacking the original zpaq format to hold additional data, while maintaining backward compatibility. It's a gigantic job, because zpaq 7.15 is definitely "choosy" about the format of the files it processes
For the 'curious' I give the recap
The aim is to extract a file from a zpaq archive without having to read the zpaq file itself (i.e. without reconstructing its contents as zpaq normally does, from beginning to end, skipping over data blocks)
For gigantic files this can be slow, even taking minutes. Using the 'trick' of storing the necessary information in the LAST stored file as a 'link', we essentially read the last 'dummy' file and from that go directly to the data
Unfortunately, I was not able to eliminate the fragmentation altogether (without breaking backward compatibility), which would have made it possible to use LZ4 instead
On Windows it is much easier and faster to use ADSs (alternate data streams) but on *nix they obviously do not exist, so this is "why"
zpaqfranz a z:\1.zpaq c:\dropbox -fast
When listing: if the fast data is found, decode it; otherwise enforce the standard method (the opposite of -ads)
zpaqfranz l z:\1.zpaq
zpaqfranz l z:\1.zpaq -fast
As mentioned it is an area that currently contains essentially useless information
Getting password from console is changed, now should be run... everywhere
zpaq 7.15 aborts when it encounters an encrypted file and does not know the password (not specified with -key something). zpaqfranz instead asks the user to type it at the keyboard, which is much safer (it doesn't end up in the command interpreter's history). This used to happen separately in each function (and was long-tested). Now it happens centrally, in the class that manipulates the incoming files. This is "risky" for new bugs, however it does make all commands inherit this feature (for example, dump)
Colors on Windows
There is now (limited) color support on the Windows console, IF THE BACKGROUND IS BLACK. As is known, or perhaps not, the console background historically could take 8 colors (a combination of single-bit R, G and B) with an additional level of intensity. But newer Windows versions use 32-bit palettes, making it really hard to tell what the actual background color is. Windows, for example, "says" magenta, but... it is blue (!) That's why I decided to make it work only if the background is black.
zpaqfranz also supports the NO_COLOR environment variable; if it is set (or the -nocolor switch is used) it will revert to normal mode
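The decision described above (the -nocolor switch or the NO_COLOR variable turns colors off) can be sketched in Python. The function names are hypothetical, the black-background check is left out, and treating any set NO_COLOR as "off" follows the wording here rather than the program's source.

```python
import os
import sys
from typing import Mapping, Sequence


def use_color(argv: Sequence[str], env: Mapping[str, str]) -> bool:
    """Return False when the switch or the environment asks for plain output."""
    if "-nocolor" in argv:
        return False
    if "NO_COLOR" in env:      # set to any value, per the description above
        return False
    return True


def paint(text: str, sgr_code: int, enabled: bool) -> str:
    """Wrap text in an ANSI SGR sequence only when colors are enabled."""
    if not enabled:
        return text
    return f"\x1b[{sgr_code}m{text}\x1b[0m"


if __name__ == "__main__":
    enabled = use_color(sys.argv[1:], os.environ)
    print(paint("all OK", 32, enabled))   # 32 = green foreground
```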
Colors on non-Windows? Maybe in the future. There are so many interoperability issues that I'm pretty skeptical about getting it to work on every system. We'll see
Smarter debug
There are now 4 switches for debugging (!)
-debug, -debug2 and -debug3 show more information as they go.
-debug4 activates a special mode, in which it writes debug files inside the z:\ drive (if any)
-longpath files
Support on Windows of paths longer than 255 characters is extremely complicated.
There is no "serious" function in the Windows API to do something as trivial as locating the real name of a file. The "best" function, which works in almost every case, involves using COM, and I don't want to do that (too slow and too high a risk of leakage when used for thousands and thousands of calls, Scripting.FileSystemObject with GetAbsolutePathName)
Now I take from KERNEL32.DLL
getFinalPathNameByHandleW=(GetFinalPathNameByHandleW_t)GetProcAddress(h, "GetFinalPathNam...)
The -longpath switch in zpaqfranz is designed to operate on PATHS, not FILE
At the request of a user, I have also included rudimentary support for individual files
Here is the "spiegone" (the long explanation)
#90
NEW BUILD = NEW BUGS
As usual, the more issue reports, the better the program becomes. The right place is
https://github.com/fcorbelli/zpaqfranz/issues
Thank you for any help in improving the product
In extenso links
https://encode.su/threads/4182-Color-or-not
https://encode.su/threads/4178-hacking-zpaq-file-format-(!)
https://encode.su/threads/4168-Virus-like-data-storage-(!)
Windows 32/64 binary, 64 bit-HW accelerated
Faster overall
~5% in the average case, due to "smarter" computation of SHA-1 on fragments
New switch -dataset on zfs - unix
This will use zfs filesystem to automagically update files without filesystem scans
zpaqfranz a /tmp/test.zpaq * -dataset "tank/d"
Using point-in-time copy mechanisms (e.g., once every hour) requires scanning the entire filesystem.
zpaqfranz has long supported the ZFS backup feature, but it's at the block level, not at single file level
aka: you can very quickly backup "everything", but to restore "something" (a file) you have to ... restore everything, then get back the file you want
In the case of using large fileservers or with magnetic disks, i.e. on which the filesystem scan is slow, the issue becomes "painful", whatever software you use (tar, 7z, srep or whatever you want)
TRANSLATION
Suppose you have a mid-sized file server, with 1M files
Suppose your system can scan the folders at 500 files/sec (real-world performance for spinning drives): you need AT LEAST ~33 minutes (1M/(500*60)) just to enumerate everything
THEN "you" (whatever software you use) can start to "do things" (aka: deduplicate, compress, whatever)
With SSD real world speed is ~ 5K files/sec, with NVMes ~ 30K files/sec
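The arithmetic behind these figures is easy to reproduce. The Python snippet below uses only the numbers quoted above (1M files; 500, 5K and 30K files/sec); scan_minutes is a hypothetical helper name.

```python
def scan_minutes(files: int, files_per_second: float) -> float:
    """Minutes needed just to enumerate `files` entries at a given rate."""
    return files / files_per_second / 60


if __name__ == "__main__":
    rates = (("spinning disk", 500), ("SSD", 5_000), ("NVMe", 30_000))
    for label, rate in rates:
        minutes = scan_minutes(1_000_000, rate)
        print(f"{label:14s} {minutes:6.1f} min")
```

Even before any deduplication or compression starts, the spinning-disk case takes longer than a 10-minute backup interval.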
=>
you cannot update the backup (in the example) every 10 minutes
But, with zpaqfranz on zfs, now you can
The -dataset switch automagically makes a temporary snapshot
On the next run it gets the changed files from the zfs filesystem, instead of scanning again from scratch
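How a list of changed files can be derived from a snapshot diff is sketched below in Python. Several things are assumptions: the input is taken to be tab-separated "change, file type, path" lines as documented for zfs diff -F, the sample text is made up, renames are ignored, and the function names are hypothetical. The dataset and snapshot paths are the ones used in the transcripts in this release.

```python
from typing import List, Tuple

Row = Tuple[str, str, str]   # (change, file type, path)


def parse_zfs_diff(text: str) -> List[Row]:
    """Split each line into change (+, -, M, R), file type and path."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) >= 3:
            rows.append((parts[0], parts[1], parts[2]))
    return rows


def files_to_add(rows: List[Row], dataset_path: str,
                 snapshot_path: str) -> List[Tuple[str, str]]:
    """Keep new or modified regular files and pair each one with the place
    to read it from inside the snapshot: (read_from, store_as)."""
    selected = []
    for change, kind, path in rows:
        if change in ("+", "M") and kind == "F" and path.startswith(dataset_path):
            relative = path[len(dataset_path):]
            selected.append((snapshot_path + relative, path))
    return selected


if __name__ == "__main__":
    sample = "+\tF\t/tank/d/spaz/newfile\nM\t/\t/tank/d/spaz\n"
    rows = parse_zfs_diff(sample)
    print(files_to_add(rows, "/tank/d/", "/tank/d/.zfs/snapshot/franco_diff/"))
```

Reading from the snapshot path while storing the dataset path is what keeps the data consistent and the names unchanged for the user.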
First run, nothing done
In this example the dataset is tank/d.
Datasets are (very crudely) parts of a "disk" (I'm actually obscuring the whole ZFS hierarchy), basically... a folder where you write the data (https://www.illumos.org/books/zfs-ad...r-1.html#ftyue)
root@aserver:/tmp/zp # zpaqfranz a prova2.zpaq * -dataset "tank/d" -verbose
zpaqfranz v58.12m-JIT-L(2023-12-02)
franz:-dataset <<tank/d>>
franz:-verbose
59901: zfs dataset tank/d
59839: dataset path |/tank/d/|
59840: topath |/tank/d/.zfs/snapshot/franco_diff/|
59856: Base snapshot tank/d@franco_base
59856: Temp snapshot tank/d@franco_diff
37720: running Destroy diff snapshot (if any)
38162: x_one zfs destroy tank/d@franco_diff
37720: running Taking diff snapshot
38162: x_one zfs snapshot tank/d@franco_diff
39147: running Getting diff
39149: x_one zfs diff -F tank/d@franco_base tank/d@franco_diff >/tmp/tempdiff.txt
59877: Load a zfsdiff 0 bytes long file <</tmp/tempdiff.txt>>
63108: zfsdiff lines 0
63119: + 0 - 0
59883: zfsdiff to add 0
59896: Nothing to do (from zfsdiff)
0.032 seconds (000:00:00) (with warnings)
Now create a newfile, somewhere in the dataset, and run again
With a conventional "something" you have to enumerate all files, find the "touched" one, then "do something"
zpaqfranz will NOT enumerate all files, but take just the changed one(s), relying on the indication of the changes made by ZFS
In effect, it copies the data from the snapshot, therefore with certainty of consistency, even if it automagically changes its name (as if it were in the dataset, and not inside the snapshot). In short, it is transparent to the user
root@aserver:/tmp/zp # echo "test" >/tank/d/spaz/newfile
root@aserver:/tmp/zp # zpaqfranz a prova2.zpaq * -dataset "tank/d"
zpaqfranz v58.12m-JIT-L(2023-12-02)
franz:-dataset <<tank/d>>
59901: zfs dataset tank/d
59883: zfsdiff to add 1
Creating prova2.zpaq at offset 0 + 0
Add 2023-12-02 18:17:12 1 5 ( 5.00 B) 16T (0 dirs)
1 +added, 0 -removed.
0 + (5 -> 5 -> 840) = 840 @ 94.00 B/s
0.099 seconds (000:00:00) (all OK)
Now change again something, and run
root@aserver:/tmp/zp # echo "changed" >/tank/d/spaz/newfile
root@aserver:/tmp/zp # zpaqfranz a prova2.zpaq * -dataset "tank/d"
zpaqfranz v58.12m-JIT-L(2023-12-02)
franz:-dataset <<tank/d>>
59901: zfs dataset tank/d
could not find any snapshots to destroy; check snapshot names.
59883: zfsdiff to add 1
prova2.zpaq:
1 versions, 1 files, 840 bytes (840.00 B)
Updating prova2.zpaq at offset 840 + 0
Add 2023-12-02 18:17:55 1 8 ( 8.00 B) 16T (0 dirs)
1 +added, 0 -removed.
840 + (8 -> 8 -> 843) = 1.683 @ 195.00 B/s
0.086 seconds (000:00:00) (all OK)
In the archive, the various versions of the file(s) will be ready for a point-in-time, file-level rollback
root@aserver:/tmp/zp # zpaqfranz l prova2.zpaq -all
zpaqfranz v58.12m-JIT-L(2023-12-02)
franz:-all 4
prova2.zpaq:
2 versions, 2 files, 1.683 bytes (1.64 KB)
- 2023-12-02 18:17:12 0 0001| +1 -0 -> 840
- 2023-12-02 18:17:08 5 0644 0001|/tank/d/spaz/newfile
- 2023-12-02 18:17:55 0 0002| +1 -0 -> 843
- 2023-12-02 18:17:48 8 0644 0002|/tank/d/spaz/newfile
48650: 13 (13.00 B) of 13 (13.00 B) in 4 files shown
48651: 1.683 compressed Ratio 129.462 <<prova2.zpaq>>
0.001 seconds (000:00:00) (all OK)
Obviously, the archiving time remains the same (if the changed files are very large, it will take the necessary time).
However, for fileservers used for e-mails, Word documents, etc., written by a few dozen users, the files are relatively small, and can be updated in a matter of seconds.
The real problem is to quickly locate what is the new file "foo.docx" written somewhere
Sure it's not a suitable method for giant virtual machine disks, but its goal is different
Default buffersize is now 1MB (was 4KB)
Time to update read-from-file for the solid-state world
New command redu
A quite complex command, for developing new "smarter" methods under the hood
zpaqfranz redu z:\*.exe
Fixed some (minor) issues on PowerPC (BIG endian)
Refactoring, removed unused code, a bit of trash stripped, smaller exe (on unix)
Minor bug fixed
This release is not very tested, be careful with valuable data
Windows 32/64 binary, 64 bit-HW accelerated
Big "news": development (underway) of SHA-1 collision handling
Disclaimer: is this a real issue? Can my backups become broken?
In fact, no. The SHA-1 files collision was created "in the lab" to prove its existence. In the "real world" I consider it extremely unlikely (aka: bordering on impossible) to have this kind of problem. So I believe I can say that it is safe to use zpaqfranz to make backups. After all, one of the first functions I implemented was precisely an enhancement to the t command, which (for years) does collision detection on zpaqfranz. Short version: if you want to be sure, use the t command (to test prepared archives); the new collision command does a faster test (than t), and is targeted only at collisions, not file integrity, which t tests instead. Finally, there are the paranoid commands and switches for people like me. The real problem is maintaining backward compatibility with zpaq 7.15. And, believe me, it is not easy at all.
Switch -collision in add
zpaqfranz can now recover from a SHA-1 collision in the current version (of the archive). AKA: if you are a bit paranoid, you can make sure that files with SHA-1 collision will be extracted correctly
Let's see a collision-aware zpaqfranz detecting a problem, and fixing
First of all: older zpaqfranz (default) says... nothing
release\58_10\zpaqfranz a z:\bydefaultnothing.zpaq message*
zpaqfranz v58.10o-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-10-01)
franz:-hw
Creating z:/bydefaultnothing.zpaq at offset 0 + 0
Add 2023-11-10 12:54:11 2 1.280 ( 1.25 KB) 32T (0 dirs)
2 +added, 0 -removed.
0 + (1.280 -> 640 -> 1.803) = 1.803 @ 40.32 KB/s
0.031 seconds (000:00:00) (all OK)
Older zpaqfranz, with -verify, early shows something is wrong
release\58_10\zpaqfranz a z:\doverify.zpaq message* -verify
zpaqfranz v58.10o-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-10-01)
franz:-hw -verify
Creating z:/doverify.zpaq at offset 0 + 0
Add 2023-11-10 12:54:55 2 1.280 ( 1.25 KB) 32T (0 dirs)
29604 SOMETHING WRONG ON messageA
GURU-C: on file messageA
GURU: CRC-32 from fragments 92433266
GURU: CRC-32 from file 072E2B0E
2 +added, 0 -removed.
(...)
OK, let's try the brand-new 58.11
zpaqfranz a z:\collision message* -collision
zpaqfranz v58.11z-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-11-10)
franz:-collision -hw
Creating z:/collision.zpaq at offset 0 + 0
Add 2023-11-10 12:31:55 2 1.280 ( 1.25 KB) 32T (0 dirs)
29604 SOMETHING WRONG ON messageA
GURU-C: on file messageA
GURU: CRC-32 from fragments 92433266
GURU: CRC-32 from file 072E2B0E
2 +added, 0 -removed.
0 + (1.280 -> 640 -> 1.803) = 1.803 @ 78.12 KB/s
##################
87571: Restoring this file will get incorrect data due to suspected SHA-1 collision(s)
<<messageB>>
#################
87538: SHA-1 collision detection time 31 ms
87653: Need a second pass on <<messageB>>
z:/collision.zpaq:
1 versions, 2 files, 1.803 bytes (1.76 KB)
AVAILABLE -stdout 1
Updating z:/collision.zpaq at offset 1.803 + 0
Add 2023-11-10 12:31:55 1 640 ( 640.00 B) 32T (0 dirs)
Warning: adjusting date from 2023-11-10 12:31:55 to 2023-11-10 12:31:56
1 +added, 0 -removed.
1.803 + (640 -> 640 -> 1.764) = 3.567 @ 9.92 KB/s
Now we extract the collisioned file, then check
Please note: we are extracting WITH zpaq 7.15, NOT with zpaqfranz ! (backward compatibility is fully preserved, this is hard to achieve)
zpaq64 x z:\collision.zpaq -to z:\zpaqfixed
diff z:\zpaqfixed\messageA z:\zpaqfixed\messageB
Files z:\zpaqfixed\messageA and z:\zpaqfixed\messageB differ
The two files (messageA and messageB) are different (aka: restoring is OK even with SHA-1 collision)
Let's try with zpaq 7.15
zpaq64 a z:\undetected.zpaq message* -summary 1
zpaq v7.15 journaling archiver, compiled Aug 17 2016
Creating z:/undetected.zpaq at offset 0 + 0
Adding 0.001280 MB in 2 files -method 14 -threads 32 at 2023-11-10 11:37:02.
2 +added, 0 -removed.
0.000000 + (0.001280 -> 0.000640 -> 0.001739) = 0.001739 MB
0.031 seconds (all OK)
zpaq64 x z:\undetected.zpaq -to z:\zpaq715restored
diff z:\zpaq715restored\messageA z:\zpaq715restored\messageB
The two files (messageA and messageB) are THE SAME (aka: zpaq 7.15 fails to restore with SHA-1 collision)
In this release the recovery mechanism works for a single version. If there is a collision between two files, in two different versions, it will not be possible to restore them (I am working on it)
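The kind of check visible in the logs above (CRC-32 from fragments compared with CRC-32 from file) can be illustrated with a toy deduplicating store in Python. Everything here is a simplification: each file is a single fragment, the class and function names are hypothetical, and a deliberately weak key (one hex digit of SHA-1) is used to force a collision, since real SHA-1 collisions cannot be produced on demand.

```python
import hashlib
import zlib
from typing import Callable, Dict, Tuple


def full_sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def weak_key(data: bytes) -> str:
    """Stand-in for a colliding hash: only the first hex digit of SHA-1."""
    return hashlib.sha1(data).hexdigest()[:1]


class DedupStore:
    """Stores one fragment per key; an already known key is trusted as-is."""

    def __init__(self, key_fn: Callable[[bytes], str] = full_sha1) -> None:
        self.key_fn = key_fn
        self.fragments: Dict[str, bytes] = {}
        self.files: Dict[str, str] = {}

    def add(self, name: str, data: bytes) -> bool:
        """Add a file; return True if a restore would give back the same data
        according to a second, independent checksum (CRC-32)."""
        key = self.key_fn(data)
        self.fragments.setdefault(key, data)   # dedup: keep the first fragment
        self.files[name] = key
        return zlib.crc32(self.fragments[key]) == zlib.crc32(data)

    def restore(self, name: str) -> bytes:
        return self.fragments[self.files[name]]


def find_colliding_pair(key_fn: Callable[[bytes], str]) -> Tuple[bytes, bytes]:
    """Search small inputs until two different ones share a key."""
    seen: Dict[str, bytes] = {}
    for i in range(100):
        data = f"message{i:02d}".encode()
        key = key_fn(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
    raise RuntimeError("no collision found")


if __name__ == "__main__":
    a, b = find_colliding_pair(weak_key)
    store = DedupStore(weak_key)
    print(store.add("messageA", a), store.add("messageB", b))
```

When add reports a mismatch, the second file is the one that would be restored with the wrong content, which is the situation the second pass in the log is meant to repair.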
Command collision
Quickly check against SHA-1 collisions inside archive. This is faster than a "full" t (test)
zpaqfranz collision z:\1.zpaq
zpaqfranz collision z:\1.zpaq -all
Command dump
Show internal structure for not-multiparted, not-encrypted archives
zpaqfranz dump z:\kajo.zpaq
zpaqfranz dump z:\kajo.zpaq -verbose
zpaqfranz dump z:\kajo.zpaq -summary
zpaqfranz dump z:\kajo.zpaq -verbose -summary
"Truncate-Touching"
Fix back the archive timestamp, whenever no update is done
Arch Linux AUR package
-collision -kill
Just the first release of SHA-1 "full scale" recovery function
Bug fixing, a bit of refactoring
As a technology demonstrator I attach a Windows 64 bit executable, exp01.exe, which shows how it is possible (by breaking compatibility with zpaq 7.15), to quickly solve this problem. NOT RECOMMENDED FOR PRODUCTION USE, but only for TESTING. I repeat ONLY for testing!
*** DO NOT USE exp01.exe for anything other than theoretical study! ***
Windows 32/64 binary, 64 bit-HW accelerated
-home switch for add
It is possible to archive different folders inside different .zpaq, this is useful for splitting individual users (inside /home or c:\users) to different .zpaq
zpaqfranz a z:\utenti.zpaq c:\users -home
zpaqfranz a /temp/test1 /home -home -not franco
zpaqfranz a /temp/test2 /home -home -only "*SARA" -only "*NERI"
Support of selections in r (robocopy) command
Now you can select files just like the add command
zpaqfranz r c:\d0 z:\dest -kill -minsize 10GB
zpaqfranz r c:\d0 z:\dest -kill -only *.e01 -only *.zip
Fix for Mac PowerPC
Yes, someone compiles zpaqfranz on PPC
Improved compatibility with ancient compilers on Slackware
Slack seems to run very old gccs
Workaround for gcc's buggy versions
Newer gcc is bugged... too
A bit of refactoring
Slower but cleaner
Replaced $ with %, because Linux scripts do not like $ at all
- %hour
- %min
- %sec
- %weekday
- %year
- %month
- %day
- %week
- %timestamp
- %datetime
- %date
- %time
Example: zpaqfranz r c:\d0 z:\backup%day -kill
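Placeholder expansion of this kind has one subtlety: longer names must be matched before shorter ones, otherwise %datetime would be consumed as %date and %weekday as %week. The Python sketch below shows that ordering. The placeholder names come from the list above, but every output format (zero padding, separators, English weekday names, ISO week numbers) is an assumption made for the example, not zpaqfranz's actual format.

```python
import re
from datetime import datetime
from typing import Dict

_WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def placeholder_values(now: datetime) -> Dict[str, str]:
    """Assumed renderings for each placeholder (illustrative only)."""
    return {
        "hour": f"{now.hour:02d}",
        "min": f"{now.minute:02d}",
        "sec": f"{now.second:02d}",
        "weekday": _WEEKDAYS[now.weekday()],
        "year": f"{now.year:04d}",
        "month": f"{now.month:02d}",
        "day": f"{now.day:02d}",
        "week": f"{now.isocalendar()[1]:02d}",
        "timestamp": now.strftime("%Y%m%d%H%M%S"),
        "datetime": now.strftime("%Y-%m-%d_%H-%M-%S"),
        "date": now.strftime("%Y-%m-%d"),
        "time": now.strftime("%H-%M-%S"),
    }


def expand(text: str, now: datetime) -> str:
    """Replace %name placeholders, trying the longest names first."""
    values = placeholder_values(now)
    names = sorted(values, key=len, reverse=True)
    pattern = re.compile("%(" + "|".join(names) + ")")
    return pattern.sub(lambda m: values[m.group(1)], text)


if __name__ == "__main__":
    print(expand(r"z:\backup%day", datetime.now()))
```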
Examples for -orderby switch (in add)
zpaqfranz a z:\test.txt c:\dropbox -orderby ext;name
zpaqfranz a z:\test.txt c:\dropbox -orderby size -desc
filecopy with variable buffer size (-buffer)
Just for test on different platforms
The versum command only takes lines starting with |
versum will now process this kind of "strange" file
|SHA-256: 8A9C2486E9E9DAC489FC5748CF400359BB6DD5F10276429EED5F3E647DA25B0D [ 522.192.336.767] |pippo.zip
|SHA-256: 000064E741776F57D3961170A3C03679B45F37BCB1DD1A63FE5D288FD5374D94 [ 110.637] |pluto43/435396f46adc648df0a5f5c13667ee3cb9ea4eca
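The layout of the two sample lines above (algorithm, hex digest, size in brackets with dots as thousands separators, file name) can be parsed with a short Python sketch. The regular expression is inferred from those samples only and the helper name is hypothetical; it is not the parser used by versum.

```python
import re
from typing import NamedTuple, Optional


class HashLine(NamedTuple):
    algorithm: str
    digest: str
    size: int
    name: str


# |ALGO: HEXDIGEST [ 1.234.567] |file name
_LINE = re.compile(
    r"^\|([^:|]+):\s*([0-9A-Fa-f]+)\s*\[\s*([\d.]+)\s*\]\s*\|(.*)$"
)


def parse_hash_line(line: str) -> Optional[HashLine]:
    """Return the parsed fields, or None for lines not starting with '|'."""
    match = _LINE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    algorithm, digest, size, name = match.groups()
    # Dots are thousands separators in the size column.
    return HashLine(algorithm.strip(), digest.upper(), int(size.replace(".", "")), name)


if __name__ == "__main__":
    sample = ("|SHA-256: 000064E741776F57D3961170A3C03679B45F37BCB1DD1A63FE5D288FD5374D94"
              " [ 110.637] |pluto43/435396f46adc648df0a5f5c13667ee3cb9ea4eca")
    print(parse_hash_line(sample))
```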
disclaimer after help for USE THE DOUBLE QUOTES, LUKE!
After each help
************ REMEMBER TO USE DOUBLE QUOTES! ************
*** -not *.cpp is bad, -not "*.cpp" is good ***
*** test_???.zpaq is bad, "test_???.zpaq" is good ***
Windows 32/64 binary,HW accelerated, ESXi, Linux, Free/Open BSD
Fix for older Windows (<10) console
I tested this version but zpaqfranz does not copy any file or folder :
Now zpaqfranz should autodetect (the dirtiest way) Windows version