In Place Updates

Jakob Borg edited this page Jul 20, 2015 · 1 revision

Here's a suggestion for a mechanism for safe, in place updating of files. The intention is for this to be an optional mode of operation, used where the space or I/O overhead of creating a temporary file copy on sync is undesirable.

Constraints and Considerations

  • We should strive to minimize the time during which the target file is in an inconsistent state, or the time this state is exposed to the user.

  • Overwriting an existing block in a file may fail under surprising circumstances, e.g. with "out of space" on filesystems with snapshots or with sparse files, even if we're not growing the file.

  • With periodic indexing, the target file may be in a different state than we think it is, due to local changes not yet being discovered. In place updates can then produce an inconsistent file. We should strive to minimize the risk of this.

  • Syncthing, or the device it's running on, may crash at any point during an in place update operation. This should be handled as gracefully as possible once Syncthing is again up and running.

Suggested Mechanism

When in place updates are requested, Syncthing could do the following instead of the normal process of creating a temporary file with the desired contents and renaming:

  • Create a journal file beside the file to be updated. The journal file contains relevant metadata on the file to be updated (basically old and new FileInfo structures).

  • Add block data (basically <offset><length><old-data><new-data>) for each block to be replaced to the journal file. This means reading the old data from the existing file, verifying that it matches the expected hash, and requesting blocks from the network as required.

  • When the journal file is complete, commit it by applying it to the target file and removing it.
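As an illustration of the block data layout, here is a minimal sketch of how one <offset><length><old-data><new-data> record could be encoded and decoded. The field widths (8-byte offset, 4-byte length, big-endian), the requirement that old and new data have equal length, and the names blockRecord, appendRecord and parseRecord are all assumptions made for this example; none of them describe an existing Syncthing format.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// headerLen is the assumed size of <offset> (8 bytes) plus <length> (4 bytes).
const headerLen = 12

var errTruncated = errors.New("journal record truncated")

// blockRecord is a hypothetical in-memory form of one journal entry.
type blockRecord struct {
	Offset int64  // position of the block in the target file
	Old    []byte // data currently in the file, kept for rollback
	New    []byte // data to write on commit
}

// appendRecord appends <offset><length><old-data><new-data> to dst.
func appendRecord(dst []byte, r blockRecord) ([]byte, error) {
	if len(r.Old) != len(r.New) {
		return dst, errors.New("old and new data differ in length")
	}
	var hdr [headerLen]byte
	binary.BigEndian.PutUint64(hdr[:8], uint64(r.Offset))
	binary.BigEndian.PutUint32(hdr[8:], uint32(len(r.New)))
	dst = append(dst, hdr[:]...)
	dst = append(dst, r.Old...)
	return append(dst, r.New...), nil
}

// parseRecord decodes the first record in b and returns the remaining bytes.
// A short buffer yields errTruncated, which marks the journal as incomplete.
func parseRecord(b []byte) (blockRecord, []byte, error) {
	if len(b) < headerLen {
		return blockRecord{}, b, errTruncated
	}
	off := int64(binary.BigEndian.Uint64(b[:8]))
	n := int(binary.BigEndian.Uint32(b[8:headerLen]))
	body := b[headerLen:]
	if n > len(body)/2 {
		return blockRecord{}, b, errTruncated
	}
	return blockRecord{Offset: off, Old: body[:n], New: body[n : 2*n]}, body[2*n:], nil
}
```

A record that is cut short, for example by a crash while the journal was being written, shows up as errTruncated rather than as bad data, which lets an unfinished journal be told apart from a complete one.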

Journal Commit

For each block in the journal, do an overwrite of that section of the file. If this goes without error, update the metadata in the index and remove the journal file.

Journal Rollback

For each block in the journal, do an overwrite of that section of the file with the old data. If this goes without error, remove the journal file.
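Commit and rollback are the same loop with a different data source, which the following sketch tries to make explicit. The names blockRecord, apply, commit and rollback are hypothetical, and memFile is only a stand-in so the example runs without touching disk; a real target would be something like an *os.File, which also satisfies io.WriterAt.

```go
package main

import (
	"errors"
	"io"
)

// blockRecord is a hypothetical in-memory journal entry.
type blockRecord struct {
	Offset   int64
	Old, New []byte
}

// apply overwrites each journaled section of w, using the new data for a
// commit or the old data for a rollback. It stops at the first error, at
// which point the target may hold a mix of old and new blocks.
func apply(w io.WriterAt, journal []blockRecord, useOld bool) error {
	for _, r := range journal {
		data := r.New
		if useOld {
			data = r.Old
		}
		if _, err := w.WriteAt(data, r.Offset); err != nil {
			return err
		}
	}
	return nil
}

func commit(w io.WriterAt, journal []blockRecord) error   { return apply(w, journal, false) }
func rollback(w io.WriterAt, journal []blockRecord) error { return apply(w, journal, true) }

// memFile is an in-memory stand-in for the target file (an assumption of
// this sketch). It never grows, mirroring an overwrite-only update.
type memFile []byte

func (f memFile) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 || off+int64(len(p)) > int64(len(f)) {
		return 0, errors.New("write outside file")
	}
	return copy(f[off:], p), nil
}
```

Only after commit or rollback returns without error would the caller update the index (for a commit) and remove the journal file.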

Error Handling

Network Outage, Blocks Unavailable

The journal file doesn't get completed and can't be committed. We treat it like any other temporary file (keeping it around for 24 hours or so) while retrying those blocks.

I/O Error Writing to Journal

As above.

Journal File Detected at Startup (Scanning)

If the journal is complete (as indicated by comparing the block lists in the metadata with the data present in the journal), attempt to commit it. Otherwise leave it as a temporary file and handle as above.
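One way the completeness check might look is sketched below: every block that differs between the old and new block lists must have a journal record whose new data matches the expected size and hash. The types blockInfo and blockRecord and the functions describe and journalComplete are hypothetical, and SHA-256 over the block contents is assumed as the block hash.

```go
package main

import "crypto/sha256"

// blockInfo is a hypothetical, simplified entry of a file's block list.
type blockInfo struct {
	Offset int64
	Size   int
	Hash   [sha256.Size]byte
}

// blockRecord is a hypothetical in-memory journal entry.
type blockRecord struct {
	Offset   int64
	Old, New []byte
}

// describe builds the block list entry for data located at offset.
func describe(offset int64, data []byte) blockInfo {
	return blockInfo{Offset: offset, Size: len(data), Hash: sha256.Sum256(data)}
}

// journalComplete reports whether the journal holds verified new data for
// every block that changes between the old and new block lists.
func journalComplete(oldBlocks, newBlocks []blockInfo, journal []blockRecord) bool {
	oldHash := make(map[int64][sha256.Size]byte, len(oldBlocks))
	for _, b := range oldBlocks {
		oldHash[b.Offset] = b.Hash
	}
	have := make(map[int64][]byte, len(journal))
	for _, r := range journal {
		have[r.Offset] = r.New
	}
	for _, want := range newBlocks {
		if h, ok := oldHash[want.Offset]; ok && h == want.Hash {
			continue // block is identical in both versions, nothing to replace
		}
		data, ok := have[want.Offset]
		if !ok || len(data) != want.Size || sha256.Sum256(data) != want.Hash {
			return false
		}
	}
	return true
}
```

A journal that fails this check would be left in place and handled like any other temporary file.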

I/O Error Writing to Target File

Not sure what to do here. We have the journal describing the old contents, so we can and should attempt a rollback, but it seems unlikely to succeed any better than the in place update did in the first place. Stop and complain. The journal can be committed or rolled back when the underlying error is resolved.

The file is in an unknown, inconsistent state at this point.
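Since the right behaviour here is an open question, the following is only one possible policy, sketched under assumptions: try the commit, attempt a rollback if it fails, and report which of three states the file ended up in so that the journal is kept whenever the file may be inconsistent. All names (outcome, commitOrRollback, flakyFile) are hypothetical, and flakyFile exists purely to simulate write errors.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// blockRecord is a hypothetical in-memory journal entry.
type blockRecord struct {
	Offset   int64
	Old, New []byte
}

type outcome int

const (
	committed    outcome = iota // all new blocks written; journal can be removed
	rolledBack                  // commit failed, old contents restored
	inconsistent                // both failed; keep the journal and complain
)

// overwrite writes either the old or the new side of every record.
func overwrite(w io.WriterAt, journal []blockRecord, useOld bool) error {
	for _, r := range journal {
		data := r.New
		if useOld {
			data = r.Old
		}
		if _, err := w.WriteAt(data, r.Offset); err != nil {
			return err
		}
	}
	return nil
}

// commitOrRollback tries a commit and falls back to a rollback on error.
func commitOrRollback(w io.WriterAt, journal []blockRecord) (outcome, error) {
	cerr := overwrite(w, journal, false)
	if cerr == nil {
		return committed, nil
	}
	if rerr := overwrite(w, journal, true); rerr != nil {
		return inconsistent, fmt.Errorf("commit failed (%v), rollback failed (%v)", cerr, rerr)
	}
	return rolledBack, fmt.Errorf("commit failed, rolled back: %w", cerr)
}

// flakyFile simulates a target whose writes at badOffset fail a set number
// of times. It is a test stand-in, not part of the proposed mechanism.
type flakyFile struct {
	data         []byte
	badOffset    int64
	failuresLeft int
}

func (f *flakyFile) WriteAt(p []byte, off int64) (int, error) {
	if off == f.badOffset && f.failuresLeft > 0 {
		f.failuresLeft--
		return 0, errors.New("simulated I/O error")
	}
	if off < 0 || off+int64(len(p)) > int64(len(f.data)) {
		return 0, errors.New("write outside file")
	}
	return copy(f.data[off:], p), nil
}
```

In the inconsistent case the caller would leave the journal beside the file, so that either operation can be retried once the underlying error is resolved.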

User Changed the File Locally While We Were Building The Journal

Currently we don't detect this, so the user's changes may get overwritten, but in that case with a consistent file created by someone else. Such is the nature of bidirectional syncing. (We could of course check the timestamp just before replacing the file, and there's a ticket on that somewhere on GitHub.)

With in place updates, however, this becomes worse, as we may end up with a broken file with mixed contents, so detecting this becomes more important.
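The timestamp check mentioned above could be as simple as comparing the file's current size and modification time with the values recorded when the journal was started. This is a minimal sketch; fileState, statState and safeToCommit are hypothetical names, and the expected state is assumed to come from the old FileInfo stored in the journal.

```go
package main

import "os"

// fileState is the part of a file's metadata this sketch compares.
type fileState struct {
	Size     int64
	ModNanos int64 // modification time in nanoseconds since the Unix epoch
}

// statState reads the current size and modification time of path.
func statState(path string) (fileState, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return fileState{}, err
	}
	return fileState{Size: fi.Size(), ModNanos: fi.ModTime().UnixNano()}, nil
}

// safeToCommit reports whether path still looks the way it did when the
// journal was built. A false result means the commit should be abandoned.
func safeToCommit(path string, expected fileState) (bool, error) {
	cur, err := statState(path)
	if err != nil {
		return false, err
	}
	return cur == expected, nil
}
```

This narrows the window rather than closing it: the file can still change between the check and the first overwrite, and a modification that preserves both size and timestamp goes unnoticed.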