Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mismatched binary detection behaviour on diffs #6637

Open
FintanH opened this issue Sep 22, 2023 · 4 comments
Open

Mismatched binary detection behaviour on diffs #6637

FintanH opened this issue Sep 22, 2023 · 4 comments

Comments

@FintanH
Copy link

FintanH commented Sep 22, 2023

Hey 👋

First off, apologies that this is via the git2 crate, I'm better at Rust than I am C 😬 I also reported the issue on the git2 repository rust-lang/git2-rs#991.

Reproduction steps

If you clone this repository https://github.com/FintanH/git2-binary-check and do cargo run

Expected behavior

Path: "bin/cat"
Is binary: true
Path: "bin/ls"
Is binary: true
Path: "bin/test"
Is binary: true

I'll note here that git diff 373b3404371e359404392aff715d504293112c8b abdedcec89e3c48ac520e4a96c7ea39364316b9e will output:

diff --git a/bin/cat b/bin/cat
new file mode 100755
index 0000000..0bbb14b
Binary files /dev/null and b/bin/cat differ
diff --git a/bin/ls b/bin/ls
new file mode 100755
index 0000000..87c2d51
Binary files /dev/null and b/bin/ls differ
diff --git a/bin/test b/bin/test
new file mode 100755
index 0000000..0cced79
Binary files /dev/null and b/bin/test differ

Actual behavior

Path: "bin/cat"
Is binary: false
Path: "bin/ls"
Is binary: false
Path: "bin/test"
Is binary: false

Version of libgit2 (release number or SHA1)

I believe it's v0.16.1 based off of this output:

cargo tree
git2-binary-check v0.1.0 
└── git2 v0.18.1
    ├── bitflags v2.4.0
    ├── libc v0.2.148
    ├── libgit2-sys v0.16.1+1.7.1

Operating system(s) tested

NixOS 23.05 (Stoat)

@ethomson
Copy link
Member

I haven't dug in here yet, but I suspect that this is an optimization around not loading the content when there's no content callbacks configured.

See the documentation for git_diff_delta:

Under some circumstances, in the name of efficiency, not all fields will
be filled in, but we generally try to fill in as much as possible.  One
example is that the "flags" field may not have either the `BINARY` or the
`NOT_BINARY` flag set to avoid examining file contents if you do not pass
in hunk and/or line callbacks to the diff foreach iteration function.  It
will just use the git attributes for those files.

This is not the first time that I've been bitten by this. Probably this behavior should be optional.

But in the meantime, if you wire up a callback do you get the expected behavior?

@FintanH
Copy link
Author

FintanH commented Sep 25, 2023

Ya, you're right. When I added the call to foreach:

diff.foreach(&mut |_, _| true, None, None, None).unwrap();

just before iterating over the deltas, I do get back that the binaries are true.

@ethomson
Copy link
Member

OK, thanks for validating that. I'm going to keep this open because I want to change this behavior somehow.

@FintanH
Copy link
Author

FintanH commented Sep 25, 2023

OK, thanks for validating that. I'm going to keep this open because I want to change this behavior somehow.

Appreciate that!

FintanH pushed a commit to radicle-dev/radicle-git that referenced this issue Apr 18, 2024
Iterates over the git2::Diff entries to force libgit2 to do the binary detection.
Also toggles the `skip_binary_check` in the `DiffOptions`.

All this to workaround the libgit2 lazy flag setting[^0]

[^0]: libgit2/libgit2#6637
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants