
Race Condition in Protocol #45

Open
loeffel77 opened this issue Nov 26, 2022 · 2 comments

@loeffel77

Because most file sync protocols can process files in parallel and/or in random order, there is a race condition in the protocol that might lead to data not being synchronized.

The protocol specifies (and the reference implementation implements): if the sequence file of an application indicates a change of an entry file, then the following will happen:

  • Execute all the new entries.
  • Update the sequences file in our local directory.

However, if the sequence file is synchronized and processed before the corresponding entry file is synchronized, the following will happen (sketched in the code after this list):

  • No new entries will be found.
  • The update of the sequence file in the local directory will prevent the entry file from being processed once it is eventually synchronized.
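
A minimal sketch of the vulnerable processing step, in Python. The function names, the dict-of-sequence-numbers layout, and the execute_new_entries placeholder are illustrative assumptions, not the reference implementation's API:

```python
from pathlib import Path
from typing import Dict


def execute_new_entries(entry_path: Path) -> None:
    """Placeholder: apply whatever new entries are found in the entry file."""


def process_changes(app_dir: Path,
                    remote_seq: Dict[str, int],
                    local_seq: Dict[str, int]) -> None:
    for entry_file, remote_number in remote_seq.items():
        if local_seq.get(entry_file) == remote_number:
            continue  # the sequence file reports no change for this entry file

        # Race: if the sequence file arrived before the entry file, this reads
        # the *old* entry file and finds nothing new to execute ...
        execute_new_entries(app_dir / entry_file)

        # ... yet the local sequence number is advanced anyway, so the new
        # entries are silently skipped when the entry file finally syncs.
        local_seq[entry_file] = remote_number
```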

To my understanding, this race condition cannot be detected with the current protocol version, because there is no link between the entry sequence and the entry contents. An extension of the protocol could make this race condition detectable by storing a content hash of the entry file together with the sequence number in the app's sequence file. In this case the processing can be postponed until the entry file's content hash matches the hash stored in the current sequence file.
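
A minimal sketch of that extension, assuming a SHA-256 hash and the same illustrative names as above; the (sequence number, hash) pairing in the sequence file is also an assumption:

```python
import hashlib
from pathlib import Path


def execute_new_entries(entry_path: Path) -> None:
    """Placeholder: apply whatever new entries are found in the entry file."""


def try_process_entry(entry_path: Path, expected_hash: str) -> bool:
    """Process the entry file only if its contents match the hash recorded in
    the sequence file; return False to signal 'postpone until the next sync'."""
    actual_hash = hashlib.sha256(entry_path.read_bytes()).hexdigest()
    if actual_hash != expected_hash:
        # The entry file has not yet reached the state the sequence file
        # refers to: leave the local sequence number untouched and retry later.
        return False
    execute_new_entries(entry_path)
    return True
```

Only if the check succeeds would the local sequence number be advanced; otherwise the entry is retried on a later sync pass.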

@39aldo39
Owner

Yes, good find! It can lead to entries being ignored until the same file gets new entries added. It should be unlikely, as the time window in which the syncing of the files has started but not yet finished should be small. But it should be addressed in the next version of the protocol.

@java-py-c-cpp-js

I think you do not need to make an entirely new protocol version (with breaking changes) to fix this issue.
You could make a "soft fork" that is backward-compatible.
For this,

  1. The newer version of the protocol adds the hashes in a way that is ignored by older versions (backward compatible).
  2. If a newer version finds these hashes, it will use them for additional checking, so it knows when to wait for a missing sync.

One example for this would be an extra anti-race-condition-hashes.csv file that contains this additional data.
It would link a specific version of the sequences file to the hashes of the corresponding entry files.

If the newer version detects that an older version has done the updating, it will skip checking the extra hash data, to ensure backward compatibility and easy migration; a sketch of this check is below.
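
A minimal sketch of such a check, assuming the anti-race-condition-hashes.csv name from above and a hypothetical entry_file,sha256 column layout:

```python
import csv
import hashlib
from pathlib import Path


def entry_is_ready(app_dir: Path, entry_file: str) -> bool:
    """True if the entry file may be processed now, False if we should wait."""
    hashes_path = app_dir / "anti-race-condition-hashes.csv"
    if not hashes_path.exists():
        # Written by an older client that knows nothing about the hashes:
        # fall back to the current behaviour and process immediately.
        return True

    with hashes_path.open(newline="") as f:
        expected = {row["entry_file"]: row["sha256"] for row in csv.DictReader(f)}

    if entry_file not in expected:
        return True  # no hash recorded for this entry file, nothing to wait for

    actual = hashlib.sha256((app_dir / entry_file).read_bytes()).hexdigest()
    return actual == expected[entry_file]
```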

@39aldo39 Do you think that this or something similar would be possible?
