Feature Request: Compare Subdirectories, ignore directory name difference #710

Open
miketheman opened this issue May 6, 2024 · 0 comments

Hello! Thanks for making this tool - it's quite cool and fast.

I've been using difft as well as diffoscope and was curious if you had a solution to a problem diffoscope solves, that I don't think difft handles today.

Here's my current use case, let me know if it doesn't make sense:

I often expand multiple zip files containing a variety of files, and want to establish a few things:

a. How "same" are these zip file contents?
b. What are the distinct differences?

For a, I'd love a way to emit only a percentage as output, somewhat similar to --check-only - that way I could run a lot of diffs and only stop when something is either too similar or too divergent.
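To make the "percentage" idea concrete, here is a rough sketch of one way such a score could be computed. It is not an existing difft feature or flag: the function names `relative_files` and `similarity_percent` are made up for illustration, and it assumes a line-based comparison with Python's `difflib`, averaged over every relative path found on either side.

```python
import difflib
from pathlib import Path


def relative_files(root):
    """Map each file's path (relative to root) to its full path."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): p for p in root.rglob("*") if p.is_file()}


def read_lines(path):
    """Read a file as a list of lines, replacing undecodable bytes."""
    return path.read_text(encoding="utf-8", errors="replace").splitlines()


def similarity_percent(left_root, right_root):
    """Average per-file similarity (0-100) over the union of relative paths.

    Hypothetical scoring: a file present on only one side counts as 0.
    """
    left = relative_files(left_root)
    right = relative_files(right_root)
    all_paths = sorted(set(left) | set(right))
    if not all_paths:
        return 100.0  # two empty trees are treated as identical
    total = 0.0
    for rel in all_paths:
        if rel in left and rel in right:
            matcher = difflib.SequenceMatcher(
                None, read_lines(left[rel]), read_lines(right[rel]), autojunk=False
            )
            total += matcher.ratio()
    return 100.0 * total / len(all_paths)
```

A wrapper script could call `similarity_percent("analyticsclient-6502", "brotli-bin-0.0.1")` for each pair and only stop when the result crosses a chosen threshold. Note that this sketch still matches files by identical relative path, so the renamed `*.egg-info` directories would count as missing on both sides.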

I think difft solves most of b, but doesn't handle path name differences yet, unless I don't have the right flags.

Here's an example expanded layout of two almost-identical zip files:

analyticsclient-6502
├── LICENSE.txt
├── MANIFEST.in
├── PKG-INFO
├── README.md
├── analyticsclient.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── requires.txt
│   └── top_level.txt
├── data
│   └── data_file
├── pyproject.toml
├── setup.cfg
├── setup.py
└── tests
    ├── __init__.py
    └── test_simple.py

4 directories, 15 files

brotli-bin-0.0.1
├── LICENSE.txt
├── MANIFEST.in
├── PKG-INFO
├── README.md
├── brotli_bin.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── requires.txt
│   └── top_level.txt
├── data
│   └── data_file
├── pyproject.toml
├── setup.cfg
├── setup.py
└── tests
    ├── __init__.py
    └── test_simple.py

4 directories, 15 files

Both contain the same files, but under slightly different paths.

Using diffoscope I execute: diffoscope --exclude-directory-metadata=yes analyticsclient-6502 brotli-bin-0.0.1 and get an output of the differences for the PKG-INFO files in the *-info subdirectories side by side. difft can't compare those yet.

(--exclude-directory-metadata=yes is to remove comparing the times and dates on the files, but I think that's diffoscope-specific and difft doesn't care about that yet.)
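For illustration, one possible heuristic for lining up subdirectories whose names differ is to pair them by how many child names they share. This is only a sketch under that assumption: it is not how diffoscope or difft work internally, and `child_names`, `pair_renamed_dirs`, and the `threshold` parameter are hypothetical names.

```python
from pathlib import Path


def child_names(directory):
    """Names of the immediate entries inside a directory."""
    return {p.name for p in Path(directory).iterdir()}


def pair_renamed_dirs(left_root, right_root, threshold=0.5):
    """Pair subdirectories with different names but overlapping contents.

    Returns a list of (left_name, right_name) tuples. The score is the
    Jaccard overlap of child names; the threshold is an arbitrary choice.
    """
    left_root, right_root = Path(left_root), Path(right_root)
    left_dirs = {p.name for p in left_root.iterdir() if p.is_dir()}
    right_dirs = {p.name for p in right_root.iterdir() if p.is_dir()}
    # Directories with identical names already line up; only pair the rest.
    left_only = sorted(left_dirs - right_dirs)
    right_only = sorted(right_dirs - left_dirs)

    pairs = []
    for lname in left_only:
        lkids = child_names(left_root / lname)
        best, best_score = None, 0.0
        for rname in right_only:
            rkids = child_names(right_root / rname)
            union = lkids | rkids
            score = len(lkids & rkids) / len(union) if union else 1.0
            if score > best_score:
                best, best_score = rname, score
        if best is not None and best_score >= threshold:
            pairs.append((lname, best))
            right_only.remove(best)  # each right-hand directory is used once
    return pairs
```

With the two layouts above, this would be expected to pair analyticsclient.egg-info with brotli_bin.egg-info, since both hold the same five file names, after which the files inside could be handed to the diff tool pair by pair.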
