borg diff: add json output #3765

Closed
phdoerfler opened this issue Apr 11, 2018 · 7 comments

@phdoerfler

Right now borg diff shows something like this:

+27.4 kB  -27.4 kB var/vmail/example.com/foobar/Auto/Steam und Co/dovecot-uidlist
+490.4 kB -489.3 kB var/vmail/example.com/foobar/Auto/Steam und Co/dovecot.index.cache
 +29.6 kB  -29.3 kB var/vmail/example.com/foobar/Auto/Steam und Co/dovecot.index.log
added      62.77 kB var/vmail/example.com/foobar/Auto/Steam und Co/new/1520542202.M174402P24533.turtle,S=62768,W=63599
  +4.5 kB   -4.5 kB var/vmail/example.com/foobar/maildirsize

And that's great. Except this human-readable size information is a bit tough for a machine to parse. Ideally there would be an option to output machine-readable sizes (i.e. just the bytes, without any prefix).
Speaking of prefixes: are these kB as in 1000 bytes or KiB as in 1024 bytes? While we're at it, maybe this could be clarified, too. Thanks!
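
To illustrate why the current output is awkward for scripts, here is a minimal sketch of what a consumer has to do today. It assumes decimal prefixes (1 kB = 1000 bytes), which is exactly the point the question above asks to have confirmed; `parse_size` and the unit table are hypothetical names, not part of Borg.

```python
import re

# Assumption: decimal (SI) prefixes, i.e. 1 kB = 1000 bytes.
_UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
_SIZE_RE = re.compile(r"([+-]?)(\d+(?:\.\d+)?)\s*(B|kB|MB|GB|TB)")


def parse_size(text):
    """Convert a human-readable size such as '+27.4 kB' to a signed byte count."""
    match = _SIZE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    sign, number, unit = match.groups()
    # The printed value is already rounded, so this is only an approximation.
    value = round(float(number) * _UNITS[unit])
    return -value if sign == "-" else value
```

Even with such a parser the result is lossy: the added file above is printed as 62.77 kB, which can only be recovered as 62770 bytes, while its name carries S=62768 (if that is the message size in bytes, as the Maildir naming convention suggests).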

@ThomasWaldmann
Member

Did you check whether we have json output support for that?

@phdoerfler
Author

I did indeed check the borg help diff for that. The only thing related to JSON I found is this:

  --log-json            Output one JSON object per log line instead of
                        formatted text.

So no luck there I am afraid.

ThomasWaldmann changed the title from "Feature Request: borg diff: Add machine readable information about how many bytes were changed" to "borg diff: add json output" on Apr 14, 2018
@ThomasWaldmann
Member

OK, if we do not have that yet, adding json output seems to be a good idea.

ThomasWaldmann added this to the 1.1.x milestone on Apr 14, 2018
@JonasKvarnstrom

In that case I would also like to request including both the uncompressed and compressed size of each change. As far as I can understand the Borg source code, the current output refers to the uncompressed size.

@Ashmodei
Contributor

I propose to add an option --json-output for diff.
The diff output should look something like this:

{
  "added": [
    {"path": "/path/to/file", "change": "27.4 kB"}
  ],
  "modified": [
    {"path": "/path/to/file", "change": "+490.4 kB -489.3 kB"}
  ],
  "deleted": [
    {"path": "/path/to/file", "change": "directory"}
  ]
}

This way, users get an easy way to parse the output, because they will have separate groups and fields.
So I'm going to add an inner function print_output_json to do_diff, which will produce output like the above from the diffs generator.

Also, for a non-human-readable format we can add an option --bytes and show file sizes without units. I think it would be nice to have this option for the standard output too.
Right now the diff function uses the overridden ItemDiff.__repr__(), which in turn uses ItemDiff._content_string() to get the difference representation. As we can't add arguments to __repr__, in my opinion it would be good to add a separate function and move the current __repr__ content there. __repr__ will invoke that function to get the standard output. The function will take an argument (e.g. in_bytes) and pass it to Item.get_size(). Frankly, I haven't dived into Item.get_size() yet, but I think I can get the item size in bytes.
Any thoughts, suggestions?
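
The refactoring described above could look roughly like the following sketch. It is not Borg's actual code: `ItemDiffSketch`, `describe` and `_format_size` are hypothetical names, the class only stores two byte counts, and the decimal unit formatting is an assumption.

```python
class ItemDiffSketch:
    """Hypothetical stand-in for ItemDiff; it only tracks added/removed byte counts."""

    def __init__(self, path, added, removed):
        self.path = path
        self.added = added
        self.removed = removed

    @staticmethod
    def _format_size(size, in_bytes):
        # Assumption: decimal units with one decimal place for the readable form.
        if in_bytes:
            return str(size)
        for unit, factor in (("GB", 10**9), ("MB", 10**6), ("kB", 10**3)):
            if size >= factor:
                return f"{size / factor:.1f} {unit}"
        return f"{size} B"

    def describe(self, in_bytes=False):
        """Build the diff line; in_bytes=True prints raw byte counts without units."""
        added = self._format_size(self.added, in_bytes)
        removed = self._format_size(self.removed, in_bytes)
        return f"+{added} -{removed} {self.path}"

    def __repr__(self):
        # __repr__ cannot take arguments, so it delegates with the default setting.
        return self.describe()
```

With this shape, `repr(diff)` keeps the existing human-readable line, while a caller handling a `--bytes` style option would call `diff.describe(in_bytes=True)` instead.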

@elho
Contributor

elho commented Jun 22, 2019

I propose to add an option --json-output for diff.

That should either be --json or --json-lines (depending on which it outputs) for consistency with other commands.

The diff output should look something like this:
This way, users get an easy way to parse the output

No, that looks rather awkward to parse and recombine into something useful. The whole change part is no better than parsing the existing non-JSON output, I'm afraid.
I'd rather suggest one entry per path, with a type field for file/directory/hardlink/softlink etc. and a change list that lists change types, e.g. added, deleted, modified (or content 🤔 ), owner, mode, etc. Sizes should be given in bytes of course, and probably rather like sizes { old: 12345, new: 12346}
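
A per-path layout along those lines could be sketched as follows. The field names (`path`, `type`, `changes`, `sizes`) follow the suggestion above but are otherwise hypothetical, as are the helper functions; this is not the format Borg ended up with.

```python
import json


def diff_entry(path, item_type, changes, old_size=None, new_size=None):
    """Build one entry per path; sizes are plain byte counts (hypothetical schema)."""
    entry = {"path": path, "type": item_type, "changes": list(changes)}
    if old_size is not None or new_size is not None:
        entry["sizes"] = {"old": old_size, "new": new_size}
    return entry


def to_json_lines(entries):
    """Serialize entries as JSON lines: one JSON object per line."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in entries)
```

A consumer can then handle each line on its own with `json.loads` and filter on `changes` without any string parsing of sizes.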

@ThomasWaldmann
Member

Hmm, looks like this can be closed, see merged PRs.

borg diff --json-lines repo::archive1 archive2
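
Output in this style can be consumed line by line. The sketch below only assumes that each non-empty line is one JSON object; the `path`, `changes` and `type` keys used for grouping are an assumption for illustration, so check the actual output of your Borg version before relying on them. Both function names are hypothetical.

```python
import json


def parse_json_lines(text):
    """Yield one decoded object per non-empty line of JSON-lines text."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            yield json.loads(line)


def paths_by_change_type(records):
    """Group paths by change type.

    Assumption: each record has a "path" and a "changes" list whose
    elements carry a "type" key; missing keys are tolerated.
    """
    grouped = {}
    for record in records:
        for change in record.get("changes", []):
            change_type = change.get("type", "unknown")
            grouped.setdefault(change_type, []).append(record.get("path"))
    return grouped
```

In practice the text would come from the captured standard output of the command above, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.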
