Releases: lrq3000/pyFileFixity

pyFileFixity v3.1.4

26 Apr 20:44

Minor release since v3.1.1:

  • Standardize opening of csv files in UTF-8 format in rfigc.py (pff hash subcommand) and improve r
  • Improve tests reliability across platforms (ensure git does not convert line return characters)
  • Improve documentation (add instructions to use LTO in a curation strategy)
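Opening csv files with an explicit UTF-8 encoding can be sketched as follows. This is a minimal illustration using only the standard library, not the actual code of rfigc.py; the helper names, the column names and the sample row are hypothetical.

```python
import csv
import os
import tempfile

def write_rows(path, rows):
    # newline="" lets the csv module control line endings itself;
    # encoding="utf-8" avoids depending on the platform's default encoding.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

def read_rows(path):
    # Read back with the same explicit settings used for writing.
    with open(path, "r", newline="", encoding="utf-8") as f:
        return [row for row in csv.reader(f)]

# Hypothetical database content with a non-ASCII file path.
rows = [["path", "md5"], ["photos/été.jpg", "d41d8cd98f00b204e9800998ecf8427e"]]
tmp = os.path.join(tempfile.mkdtemp(), "db.csv")
write_rows(tmp, rows)
roundtrip = read_rows(tmp)
```

Without the explicit encoding, the same file could be read differently on systems whose default encoding is not UTF-8.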

pyFileFixity v3.1.1

09 Apr 10:02

Minor release since v3.1.0: fixes a few errors in the readme and improves the continuous deployment build process.

pyFileFixity v3.1.0

09 Apr 09:36

Major update! 🚀

  • Overhaul build with PEP517 compliant packaging, which allows for isolated builds!
  • Compatibility with reedsolo>=2.0.0-dev (uses Cython v3 and allows up to 15 MB/s of encoding rate!).
  • Improve continuous integration, which now tests against the latest Python release and on multiple operating systems.
  • Add continuous deployment via GitHub Actions: the continuous integration unit tests run first, then the package is published to TestPyPI to ensure it is installable.
  • Add new centralized API pff script, which exposes subcommands to access all the suite of tools offered by pyFileFixity.
  • As part of the pff centralized API overhaul, tools were renamed to more understandable names, such as hash for rfigc.py, although older names are still useable as aliases.
  • Several bugfixes, including in bytearray handling and in memory allocation, which made the software get slower and slower over long periods of time.
  • Drop support for Python 2 (Tried to keep support with a dedicated setup.cfg, but although it technically works, the build process is so unclean, with so many dependencies resolving lookup requests, that it's just better to drop it and let Py2 users use v3.0.2 which was fully compatible).
  • Add CodeQL code quality analysis workflow on GitHub Actions.
  • Improve ecc_speedtest to be configurable by commandline arguments instead of hardcoded config variables.

For the full changelog since the previous release:
v3.0.2...v3.1.0

Full Changelog: v3.0.8...v3.1.0

pyFileFixity v3.0.8

31 Mar 00:39
Pre-release

Some more ironing of a few bytearray encoding bugs, especially ensuring compatibility with reedsolo >= v2.0.0, with creedsolo cythonized extension allowing up to 15 MB/s of encoding rate!

EDIT: this release was yanked because reedsolo v2.0.0 was itself yanked: it requires Cython v3.0.0b2, a prerelease that is not available in most Linux distros. Furthermore, reedsolo>2 dropped support for Py2, so the last really compatible release of pyFileFixity is v3.0.2, which used reedsolo==1.7.0.

pyFileFixity v3.0.2 beta

09 Dec 04:26

🎄✨ Annual Christmas maintenance! 🎅🎊

Ok it's been more than a year since last maintenance, but the wait was worth it!

Now the module is fully compatible with Python 3 up to Python 3.11 and even PyPy 3! All unit tests pass, with a branch coverage of about 84%! (Note that reedsolomon submodules aren't included in the coverage anymore, but they are also covered at about or more percent on their own).

This release is considered beta because I did not have time to test it thoroughly in practice. However, the continuous unit testing (now migrated to GitHub Actions, since Travis-CI shut down its free service) should hopefully be a good indication of stability and robustness: it should work as it did before on Python 2.7, but now on Python 3.

Merry Christmas to everyone, may all your wishes come true!

pyFileFixity v2.3.1 stable

07 Dec 23:39

Better, more pythonic packaging + Added coverage for the two new scripts replication_repair.py and resiliency_tester.py, and fixed a few edge case bugs here and there (particularly in replication_repair.py).

pyFileFixity v2.0.3 stable

06 Dec 23:24

First stable release of the new v2 branch! New (major) features since beta include:

  • 2 new applications: replication_repair.py and resiliency_tester.py.
  • More robust ecc files decoding (filesize intra-field is now protected by an ecc, etc.)
  • Enhanced filetamper.py (more stats, more robust tampering, etc.)
  • Better packaging (pyFileFixity is standalone, you just need a native Python interpreter and nosetests and you can run the tests and any script!).

Major features since v1 branch:

  • Unit test, with branch coverage!
  • More powerful Reed-Solomon libraries, with support for universal decoding/encoding and erasures (doubles the number of errors that can be corrected)!
  • Crossplatform fix
  • Various fixes, making the scripts a lot more robust against errors and exceptions.
  • GUI support for all scripts
  • And various other changes that I forgot...
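The remark that erasures double the number of correctable errors follows from the standard Reed-Solomon bound: with nsym ecc symbols per block, a decoder can fix any combination of errors (position unknown) and erasures (position known) as long as 2 * errors + erasures <= nsym. A small sketch of that arithmetic, with hypothetical helper names that are not part of pyFileFixity:

```python
def max_errors(nsym):
    # Errors at unknown positions each cost two ecc symbols.
    return nsym // 2

def max_erasures(nsym):
    # Erasures at known positions each cost one ecc symbol.
    return nsym

def is_correctable(nsym, errors, erasures):
    # Standard Reed-Solomon bound for a mix of errors and erasures.
    return 2 * errors + erasures <= nsym
```

For example, with 32 ecc symbols a block can survive 16 errors, or 32 erasures, or a mix such as 10 errors plus 12 erasures.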

Feel free to share feedback about your experiences using pyFileFixity!

pyFileFixity v2.0.0 beta 2

16 Nov 04:50
Pre-release

Major milestone again, the codebase is now branch covered at more than 80%, which is honestly a great deal more than what I thought was possible to do for a first coverage. This score means that all core functions are covered, only a few specific arguments (like skipping missing files and such) are not (yet) tested, but apart from that, you can assume that files generated by this app are safe and stable.

This means that the application is now a lot more stable and more crossplatform: files generated on one platform can now be used on any other platform, at least theoretically. It's safe to say that files generated on Linux and Windows are totally compatible; for other platforms we need testers! This also means that some big changes had to be made to the inner workings of the app. Normally, ecc files generated with v2.0.0 beta 1 should still be compatible with beta 2, but if you have generated such ecc files, you should try them with beta 2 to ensure they still work, and if not, regenerate new ones.

From now on, the code should stay pretty much stable, and the high coverage score is here to ensure that there won't be any regression.

Before the stable release, only a few new scripts need to be developed, like replication_repair.py (see TODO.md for more info).

pyFileFixity v2.0.0 beta 1

19 Oct 13:59
Pre-release

Major new codebase for pyFileFixity. The project is now most likely stable, the speed enhancements are enough to make usage practical, and the Reed-Solomon ecc libraries are rock solid (unit tested at more than 85%, these are currently the most unit tested Reed-Solomon libraries in the open source world).

The file formats for the ecc files should be frozen now (there won't be any change), so you can already generate your ecc files: you will be able to use them with later versions of pyFileFixity without a problem. CAUTION: the format has changed since the last releases, so do not try to use old ecc files with this release. You should generate a new ecc file!

There are a few repair enhancements and some additional tools I'd like to make before tagging this project as stable. These planned enhancements, such as the replication repair tool, are described in TODO.md. The enhancements will exclusively be made to the repairing processes, not to generation, which is why the ecc file format won't change (and you will benefit from the future repairing enhancements even if you generate your ecc files now).

pyFileFixity v1.4

24 Apr 17:31

This release fixes the issues of pyFileFixity v1.3. This release is stable.

IMPORTANT: Note that this release is incompatible with previously generated ecc files. You should regenerate all your ecc files with this new version!

Huge update, with lots of new features. First, there is a huge speedup improvement, about 100x (expect an encoding rate of more than 1MB/s), which is really really good and makes the usage of header_ecc.py and structural_adaptive_ecc.py possible in concrete, real world scenarios.

Another important improvement is in the resilience rate calculation: the old one was off, it protected the files more than you specified. This is fixed, so that you will know exactly how much your files are protected.
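To make the idea of a resilience rate concrete, here is one plausible way to relate it to block sizes. This is only a sketch under stated assumptions, not pyFileFixity's actual calculation: it assumes the rate is the fraction of message symbols in a block that may be corrupted and still repaired, and that each correctable error costs two ecc symbols. The function names are hypothetical.

```python
def ecc_params(block_size, resilience_rate):
    # Assumption: ecc_size is about 2 * resilience_rate * message_size,
    # and message_size + ecc_size must fill the whole block.
    message_size = int(round(block_size / (1 + 2 * resilience_rate)))
    ecc_size = block_size - message_size
    return message_size, ecc_size

def effective_rate(message_size, ecc_size):
    # Fraction of message symbols that can actually be repaired,
    # given that two ecc symbols are needed per corrected error.
    return (ecc_size // 2) / message_size
```

With a block of 255 symbols and a requested rate of 0.3, this gives 159 message symbols and 96 ecc symbols, so 48 correctable errors per block, which is close to the requested 30% of the message. Computing the effective rate back from the sizes is a simple way to check that the protection matches what was asked for.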

Another huge improvement is the reliability of the correction using ecc files: now even if the hash is corrupted, the blocks can still be checked by the ecc directly to see if the correction was successful.

Also, intra-ecc was added so that file paths are now protected by ecc too! This means that generated ecc files do not have a critical spot anymore, they should work flawlessly even if tampered (up to the resilience rate that you have set). However, adding intra-ecc broke compatibility with previous releases, so you cannot decode old ecc files generated with previous versions of pyFileFixity. You should regenerate all your ecc files with this new release.

A last improvement is with rfigc.py, which now has a --filescraping_recovery option to use a database file to recover the filenames and directory tree structures of files that you data scraped on your disks following a data disaster (for example using Photorec).
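The general idea of recovering file names from a database can be sketched as follows. This is an illustration of the principle only, not the implementation behind --filescraping_recovery: it assumes the database maps a content hash (MD5 here, as an assumption) to the original relative path, and all function names, file names and paths are hypothetical.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=65536):
    # Hash the file in chunks so large files do not need to fit in memory.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def plan_renames(scraped_dir, hash_to_path):
    # Map each scraped file whose hash is known to its original path.
    plan = {}
    for name in sorted(os.listdir(scraped_dir)):
        src = os.path.join(scraped_dir, name)
        if os.path.isfile(src):
            original = hash_to_path.get(file_md5(src))
            if original is not None:
                plan[name] = original
    return plan

# Hypothetical demo: two scraped files, only one is known to the database.
scraped = tempfile.mkdtemp()
with open(os.path.join(scraped, "f0001.jpg"), "wb") as f:
    f.write(b"holiday picture bytes")
with open(os.path.join(scraped, "f0002.bin"), "wb") as f:
    f.write(b"unknown content")
database = {hashlib.md5(b"holiday picture bytes").hexdigest(): "photos/2014/beach.jpg"}
plan = plan_renames(scraped, database)
```

Building a plan first, instead of moving files immediately, leaves the scraped data untouched until the matches have been reviewed.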

The project is stable now. The main thing I still would like to improve on is a bit more speed in the RS library (I would like to reach over 10MB/s). If you can help with the Cython or Boost Python implementation, send me a pull request please! There are other features waiting to be implemented, but they aren't necessary for now (such as updating an ecc file, meanwhile you can just regenerate one).