Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Normalized VCARD contents? #157

Open
mlawren opened this issue Mar 4, 2022 · 5 comments
Open

[Feature] Normalized VCARD contents? #157

mlawren opened this issue Mar 4, 2022 · 5 comments

Comments

@mlawren
Copy link

mlawren commented Mar 4, 2022

Each CardDAV client seems to have its own idea about how vcard files are formatted, for example:

  • upper and lowercase TYPE= values
  • ordering of fields
  • CRNL at end of file or not

This makes the Git diff history rather noisy and therefore hard to debug when investigating what clients are actually changing.

It would be ideal if xandikos (or some underlying library) would parse and format VCARD files before commit in a standard way. I.e:

  • one case for TYPE= values
  • alphabetical field ordering (after the mandatory BEGIN, VERSION and N)
  • newline after last field

This is perhaps a bit of a naive request as I don't know if modifying what the client provides affects synchronization....

@jelmer
Copy link
Owner

jelmer commented Mar 4, 2022

I'd be open to this in general, but last time I tried a naive approach that didn't work.

Some clients have a specific idea about the formatting of files, and will do normalisation themselves. If their formatting differs from what the server uses you can end up in this endless loop where the server and client both keep reformatting. Of course this issue also exists when there are two clients that keep reformatting files.

Downside of normalizing is also in terms of overhead - a client that does an upload will have to then also pull down the newly reformatted file - i.e. it looks like somebody has changed the file after they uploaded.

This is why I've taken the approach of rejecting invalid files but not modifying files that are uploaded. This is also consistent with the CalDAV/CardDAV/WebDAV specs.

I'm open to the idea of optionally reformatting, if we can come up with a good way of doing so.

@mlawren
Copy link
Author

mlawren commented Mar 7, 2022

Some clients have a specific idea about the formatting of files, and will do normalisation themselves. If their formatting differs from what the server uses you can end up in this endless loop where the server and client both keep reformatting. Of course this issue also exists when there are two clients that keep reformatting files.

Downside of normalizing is also in terms of overhead - a client that does an upload will have to then also pull down the newly reformatted file - i.e. it looks like somebody has changed the file after they uploaded.

I'm sorry that I don't know anything about the CardDav protocol, but what mechanism does a client base its decision to pull on? How does it know that what got stored is different to what it uploaded? Presumably some kind of hash or last-modification-date?

What about normalizing what's stored in Git, but reporting to clients the last received or uploaded hash/date value?

@jelmer
Copy link
Owner

jelmer commented Mar 7, 2022

The clients regularly poll to see whether files have changed. There are different ways of doing this. Right now, in the most efficient mode the client provides a sync token. Xandikos simply relies on the Git tree and blob shas to determine whether things have changed since that token, without keeping any additional data - that's part of its power. Whenever those SHAs change, the contents have changed and that triggers a fetch from the clients. Keeping two sets of shas (and potentially two sets of files), one normalized and one not, will open another can of worms. It also will do the opposite of making it easier to debug what's going on, because now clients have a different version of the file for the same ETag (at least in terms of formatting) than the server.

I think there is something to be said for (optionally) normalizing the files on the server, but that will have to involve also synchronising those changes to the clients. It's extra network overhead at the cost of more consistent files at the server side, essentially - and that may be a reasonable tradeoff in some cases. Let's see whether using the branch lock works per #156

Another option would be to normalize the output you're getting out of git repository before diffing, when analysing the history?

@mlawren
Copy link
Author

mlawren commented Mar 10, 2022

Well I thought I'd worked out an alternative way to normalize using Git filters instead of hooks. In .git/config:

[filter "vcard"]
        clean = vcardtidy
        smudge = cat

And in .gitattributes:

*.vcf   filter=vcard

The above works fine for manual commits, but it seems xandikos (dulwich?) doesn't take the above configuration changes into account. So I guess I'm going to have to investigate GIT_EXTERNAL_DIFF...

@jelmer
Copy link
Owner

jelmer commented Mar 10, 2022

Filters could potentially work here (though not currently supported), although they'd have the same effect as hooks in terms of interactions with clients when #516 is fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants