Etags and file ids

Christian Kamm edited this page Jul 17, 2019 · 5 revisions

This page clarifies the sync-relevant file metadata that is shared between the server and the clients. Everyone developing on either the server or a client should understand it.

An ETag is a piece of file metadata that the server maintains for every file and every directory. ETags are exposed as WebDAV properties, which clients query with WebDAV PROPFIND.
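As an illustration, the body of such a PROPFIND request could be built as follows. This is a minimal sketch, not the client's actual code: `getetag` is the standard WebDAV property, while the `http://owncloud.org/ns` namespace and the `id` property name used for the file ID are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
OC_NS = "http://owncloud.org/ns"  # assumed namespace for the file ID property


def build_propfind_body() -> bytes:
    """Return a PROPFIND request body asking for the ETag and the file ID."""
    ET.register_namespace("d", DAV_NS)
    ET.register_namespace("oc", OC_NS)
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    ET.SubElement(prop, f"{{{DAV_NS}}}getetag")  # standard WebDAV property
    ET.SubElement(prop, f"{{{OC_NS}}}id")  # assumed property name for the file ID
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    print(build_propfind_body().decode("utf-8"))
```

Sent with a `Depth: 1` header, a request like this returns the requested properties for a directory and its direct children.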

ETag Uniqueness

The ETag is unique for every file or directory. ETag generation should have enough entropy that collisions do not happen in practice.

As a result, a user normally never has two files with the same ETag. The exception is two files with the same contents and mtime, which may have the same ETag.

ETag Change

The ETag of a file changes only if the content or metadata of that file changes. The ETag of a directory changes if a file, or a file's metadata, changes anywhere underneath it. Every change to a file therefore propagates up the tree and changes the ETag of every parent directory, including the root directory.

The ETag of the file does not change if the file is moved on the server. However, the directories it is moved from and to (and all their parent directories) change their ETag.

ETag changes must propagate recursively to the parent directories because this is how a syncing client detects in which directory something has changed on the server. Otherwise the client would need to look into each and every directory.
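The propagation rule and the way a client benefits from it can be sketched with a small in-memory model. Everything here is hypothetical and only illustrates the idea: `Node`, `touch`, `snapshot` and `changed_paths` are invented names, and a random UUID stands in for whatever ETag generation the server really uses.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def new_etag() -> str:
    # Random value as a stand-in for the server's real ETag generation.
    return uuid.uuid4().hex


@dataclass(eq=False)
class Node:
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: Dict[str, "Node"] = field(default_factory=dict)
    etag: str = field(default_factory=new_etag)

    def add(self, name: str) -> "Node":
        child = Node(name, parent=self)
        self.children[name] = child
        return child


def touch(node: Node) -> None:
    """Give the node a new ETag and propagate the change up to the root."""
    current: Optional[Node] = node
    while current is not None:
        current.etag = new_etag()
        current = current.parent


def snapshot(node: Node, path: str = "") -> Dict[str, str]:
    """Record path -> ETag for the whole tree (the client's last known state)."""
    known = {path or "/": node.etag}
    for name, child in node.children.items():
        known.update(snapshot(child, f"{path}/{name}"))
    return known


def changed_paths(node: Node, known: Dict[str, str], path: str = "") -> List[str]:
    """Descend only into subtrees whose ETag differs from the known one."""
    key = path or "/"
    if known.get(key) == node.etag:
        return []  # nothing below this node changed, so skip the whole subtree
    result = [key]
    for name, child in node.children.items():
        result.extend(changed_paths(child, known, f"{path}/{name}"))
    return result


if __name__ == "__main__":
    root = Node("")
    report = root.add("docs").add("report.txt")
    root.add("photos").add("a.jpg")
    known = snapshot(root)
    touch(report)  # simulate a server-side change to one file
    print(changed_paths(root, known))
```

After the change to `report.txt`, the client visits only the root, `docs` and the file itself. The `photos` subtree is skipped because its ETag still matches the known one.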

It is extremely important that ETags do not change during administrative tasks such as file cache rescanning.

ETag and Sharing

The ETag of a shared file is equal to the ETag of the file of the original owner.

This implies that the server has to prevent users from loop sharing. For example, user Tom shares a directory with two users, Betty and John. Both then decide to share the folder with Carl, so the same shared folder would appear twice in Carl's data set. That must not happen.
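One way a server could guard against this is to reject a new share when the recipient already receives the same item through another share. The sketch below is only an assumption about how such a check might look; `Share` and `would_duplicate` are hypothetical names, and the item is identified by a numeric ID chosen for the example.

```python
from typing import List, NamedTuple


class Share(NamedTuple):
    sharer: str
    recipient: str
    file_id: int  # hypothetical identifier of the shared file or directory


def would_duplicate(shares: List[Share], new_share: Share) -> bool:
    """True if the recipient already receives this item through another share."""
    return any(
        s.recipient == new_share.recipient and s.file_id == new_share.file_id
        for s in shares
    )


if __name__ == "__main__":
    shares = [Share("Tom", "Betty", 42), Share("Tom", "John", 42)]
    first = Share("Betty", "Carl", 42)
    print(would_duplicate(shares, first))  # Carl does not have the folder yet
    shares.append(first)
    print(would_duplicate(shares, Share("John", "Carl", 42)))  # second route to Carl
```

This check only covers the exact same item. Detecting a reshare of a subdirectory inside an already shared folder would need an additional look at the item's parents.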

File IDs

The purpose of the file ID is to let clients detect server-side moves. A file ID is a piece of metadata for each file and directory, maintained by the server and exposed through WebDAV properties.

File ID Uniqueness

File IDs are unique for every file or directory. File ID generation should have enough entropy that collisions do not happen in practice. It also takes the instance ID of the server into account so that file IDs are unique worldwide.

The file ID is a 64-bit unsigned integer.

File ID Change

File IDs never change for the lifetime of a file. A file ID is assigned when the file is created and stays the same until the file is removed. File IDs must not be recycled.

As with ETags, it is important that the file ID does not change during administrative tasks.

File ID and Sharing

The file ID of a shared file is the same as the one of the original file.

ETags vs File IDs

An ETag identifies the content of a file. A File ID identifies the file itself.
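The distinction can be made concrete with a sketch of how a client might compare its last known state with a fresh server listing. This is an illustrative assumption, not the client's real discovery code: `Entry` and `classify` are hypothetical names, and both listings are assumed to be dictionaries keyed by file ID.

```python
from typing import Dict, NamedTuple


class Entry(NamedTuple):
    path: str
    etag: str


def classify(local: Dict[int, Entry], remote: Dict[int, Entry]) -> Dict[int, str]:
    """Compare two listings keyed by file ID and label every entry."""
    result: Dict[int, str] = {}
    for file_id, entry in remote.items():
        old = local.get(file_id)
        if old is None:
            result[file_id] = "new"
        elif old.path != entry.path and old.etag != entry.etag:
            result[file_id] = "moved and changed"
        elif old.path != entry.path:
            result[file_id] = "moved"  # same file ID and ETag, different path
        elif old.etag != entry.etag:
            result[file_id] = "changed"  # same file ID and path, different ETag
        else:
            result[file_id] = "unchanged"
    for file_id in local.keys() - remote.keys():
        result[file_id] = "deleted"  # known locally but gone from the server
    return result


if __name__ == "__main__":
    local = {1: Entry("/a.txt", "e1"), 2: Entry("/b.txt", "e2"), 3: Entry("/c.txt", "e3")}
    remote = {1: Entry("/docs/a.txt", "e1"), 2: Entry("/b.txt", "e9"), 4: Entry("/d.txt", "e4")}
    print(classify(local, remote))
```

Without the file ID, the move of `/a.txt` to `/docs/a.txt` would look like a deletion plus a new file, and the client would have to download the content again.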