This repository has been archived by the owner on Jan 4, 2024. It is now read-only.

New Recovery Index File Format Standard

Zachary Adam-MacEwen edited this page Jan 23, 2019 · 7 revisions

Standardising the new Recovery Index File Format (RIFF)

Since version 0.2.1, Tapestry has used a simple pickle object, serialised and written to a text file, as its recovery file. This design suited the original purpose and is robust enough to be trusted within a single version, but the structure of the file, and the volatility that follows from it, have always made its continued operation between versions of Tapestry more an article of faith than a reliable fact. In light of this, and in order to add support for new features, the 2.0 Major Release will incorporate a new Recovery Index File Format.

From Pickle to JSON

For a few reasons, not the least of which is elegance, the new RIFF will be encoded not as a pickle but as a fully-featured JSON object, with all the robustness that implies. The change serves two purposes:

  • Adding keyed structure to the RIFF makes testing for changes in the RIFF more granular, and determining the nature of those changes simpler; and
  • The RIFF can now be fed directly to a Master Namespace object and absorbed into that namespace.
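
Both points can be illustrated with a short Python sketch. It is not Tapestry's own code: `diff_riff` is a hypothetical helper, and `types.SimpleNamespace` only stands in for the Master Namespace object, whose real interface is not described here.

```python
import json
from types import SimpleNamespace

_MISSING = object()  # sentinel so an absent key differs from a null value


def diff_riff(old, new, prefix=""):
    """Return the dotted key paths whose values differ between two RIFFs."""
    changed = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        a = old.get(key, _MISSING)
        b = new.get(key, _MISSING)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse so the report names the exact nested key.
            changed.extend(diff_riff(a, b, prefix=path + "."))
        elif a != b:
            changed.append(path)
    return changed


old = json.loads('{"metaBlock": {"numBlock": 1, "countFiles": 300}}')
new = json.loads('{"metaBlock": {"numBlock": 1, "countFiles": 301}}')
changes = diff_riff(old, new)

# Hypothetical stand-in for absorbing one table into a namespace.
ns = SimpleNamespace(**new["metaBlock"])
```

Here `changes` names only the key that moved, and `ns.countFiles` reads the absorbed value as an attribute.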

Fields of the New RIFF

The New RIFF can be conceptualised as a series of tables or dictionaries with specific expected keyed values.

Block Metadata (metaBlock)

This table contains metadata about the block itself, e.g. Tapestry-YYYY-MM-DD-1.tap.

| Fullname | Key | Type | Description |
| --- | --- | --- | --- |
| Block Index Number | numBlock | Int | The number-in-sequence of the block. Matches the original value in the filename unless that name has been changed. |
| Uncompressed Size | sizeLarge | Int | The total size of all files included in the block, expressed as an integer number of mebibytes. |
| File Count | countFiles | Int | The number of files (not counting the RIFF) included in the block at creation. |
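
Because sizeLarge is an integer number of mebibytes, a byte total has to be converted and rounded. The sketch below is only one way to do that: the helper name `bytes_to_mib` is hypothetical, and rounding up is an assumption, since the rounding rule is not specified here.

```python
MIB = 1024 * 1024  # one mebibyte, in bytes


def bytes_to_mib(total_bytes):
    """Convert a byte count to whole mebibytes, rounding up (assumed rule)."""
    # Ceiling division using integers only, so large sizes stay exact.
    return -(-total_bytes // MIB)


block_size = bytes_to_mib(2000 * MIB)
```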

Overall Metadata (metaRun)

This table contains information pertaining to the metadata of the overall run of Tapestry which created this series of files.

| Fullname | Key | Type | Description |
| --- | --- | --- | --- |
| Total Block Count | sumBlock | Int | The total number of tapfiles (not counting signatures) which constitute this backup. |
| Uncompressed Size | sizeExtraLarge | Int | The total size of all files in the entire record, uncompressed, expressed in mebibytes. |
| File Count | countFilesSum | Int | Total count of all files (except RIFFs) contained in all blocks in this series. |
| Date Stamp | dateRec | Int | Unix timestamp of the start of the run of the creating process. Should match the date of the filename. |
| Comments | comment | Str | Stringified comments taken from the user at the time of creation, or passed by the responsible cron job. |
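
Since dateRec should match the date in the filename, the expected block name can be derived from the metadata. The following is a minimal sketch: `expected_block_name` is a hypothetical helper, and interpreting the timestamp in UTC is an assumption (a run stamped in local time could land on a different calendar day).

```python
from datetime import datetime, timezone


def expected_block_name(date_rec, num_block):
    """Build the Tapestry-YYYY-MM-DD-N.tap name implied by the metadata."""
    # Assumption: the date in the filename is the UTC date of dateRec.
    day = datetime.fromtimestamp(date_rec, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"Tapestry-{day}-{num_block}.tap"


name = expected_block_name(1540139133, 1)
```

Under that assumption, the dateRec used in the example RIFF yields `Tapestry-2018-10-21-1.tap`.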

Recovery Index

This table is a dictionary of smaller objects, one per file, each keyed in the index subobject by the file's UUID.

File Entry

Each file entry has the following structure:

| Fullname | Key | Type | Description |
| --- | --- | --- | --- |
| Filename | fname | Str | The base name of the file, with extension, if any. |
| Checksum | sha256 | Str | The sha256.hexdigest() of the file prior to compression. Used for the integrity check functions. |
| Category Label | category | Str | The category label for the original configuration. |
| Subcategory Path | fpath | Str | The path from category down to the file, including the fname. |
| File Size | fsize | Int | Raw file size in bytes. |
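
A file entry of this shape could be assembled as in the sketch below, using only the standard library. `build_file_entry` and its parameters are hypothetical names, generating the key with `uuid.uuid4()` is an assumption (this page does not say how the UUID is produced), and so is expressing fpath as a path relative to the category's root directory.

```python
import hashlib
import os
import tempfile
import uuid


def build_file_entry(category, category_root, full_path):
    """Describe one file using the keys of the File Entry table."""
    sha = hashlib.sha256()
    with open(full_path, "rb") as handle:
        # Hash in chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha.update(chunk)
    return {
        "fname": os.path.basename(full_path),
        "sha256": sha.hexdigest(),
        "category": category,
        # Assumption: fpath is relative to the category's root directory.
        "fpath": os.path.relpath(full_path, category_root),
        "fsize": os.path.getsize(full_path),
    }


# Demonstration against a throwaway file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "go", "to"))
    path = os.path.join(root, "go", "to", "somefilename.png")
    with open(path, "wb") as handle:
        handle.write(b"example")
    index = {str(uuid.uuid4()): build_file_entry("files", root, path)}
```

A real SHA-256 hex digest is 64 characters long, so the 32-character sha256 value in the example RIFF appears to be a placeholder rather than an actual digest.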

Example RIFF

This is the example RIFF used in the unit-tests.py script.

{
  "metaBlock": {
    "numBlock": 1,
    "sizeLarge": 2000,
    "countFiles": 300
  },
  "metaRun":{
    "sumBlock": 1,
    "sizeExtraLarge": 2000,
    "countFilesSum": 300,
    "dateRec": 1540139133,
    "comment": "This is just a test RIFF."
  },
  "index":{
    "fidentifier": {
      "fname": "somefilename.png",
      "sha256": "aefaf7502d52994c3b01957636a3cdd2",
      "category": "files",
      "fpath": "/go/to/somefilename.png",
      "fsize": 200
    }
  }
}
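
A RIFF can be checked against the tables above once it has been parsed. The sketch below is illustrative and is not the check performed by unit-tests.py: `validate_riff` and `check_table` are hypothetical helpers, and mapping Int and Str to Python's `int` and `str` is an assumption.

```python
import json

# Expected keys and types, taken from the tables on this page.
TABLES = {
    "metaBlock": {"numBlock": int, "sizeLarge": int, "countFiles": int},
    "metaRun": {"sumBlock": int, "sizeExtraLarge": int,
                "countFilesSum": int, "dateRec": int, "comment": str},
}
FILE_ENTRY = {"fname": str, "sha256": str, "category": str,
              "fpath": str, "fsize": int}


def check_table(table, spec, where):
    """Return a list of problems found in one table."""
    if not isinstance(table, dict):
        return [f"{where}: missing or not an object"]
    problems = []
    for key, kind in spec.items():
        if key not in table:
            problems.append(f"{where}.{key}: missing")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(table[key], bool) or not isinstance(table[key], kind):
            problems.append(f"{where}.{key}: expected {kind.__name__}")
    return problems


def validate_riff(riff):
    """Check the two metadata tables and every entry in the index."""
    problems = []
    for name, spec in TABLES.items():
        problems += check_table(riff.get(name), spec, name)
    index = riff.get("index")
    if not isinstance(index, dict):
        problems.append("index: missing or not an object")
    else:
        for file_id, entry in index.items():
            problems += check_table(entry, FILE_ENTRY, f"index.{file_id}")
    return problems


riff = json.loads("""
{
  "metaBlock": {"numBlock": 1, "sizeLarge": 2000, "countFiles": 300},
  "metaRun": {"sumBlock": 1, "sizeExtraLarge": 2000, "countFilesSum": 300,
              "dateRec": 1540139133, "comment": "This is just a test RIFF."},
  "index": {"fidentifier": {"fname": "somefilename.png",
                            "sha256": "aefaf7502d52994c3b01957636a3cdd2",
                            "category": "files",
                            "fpath": "/go/to/somefilename.png",
                            "fsize": 200}}
}
""")
problems = validate_riff(riff)
```

The check covers structure and types only; it does not confirm that the checksum has the length of a genuine SHA-256 digest.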