Find duplicate files under a set of directories (matching name, _fuzzy name_ or md5 checksums)

lonetwin/finddup

finddup

finddup is a command that looks up duplicate files under a set of directories, matching by name, fuzzy name or md5 checksum. It reports either the duplicate files or the files that have no duplicates.

Examples:

# find duplicates using md5 checksums of the first 4k bytes of files under /home/foobar
$ python finddup.py -m /home/foobar

# find duplicates using md5 checksums of the first 8k bytes of files under /home/foo
# and /home/bar
$ python finddup.py -m -b 8k /home/foo /home/bar

# compare md5 checksums of the first 8k bytes of files under /home/foo and /home/bar
# and report the files that do not have duplicate copies in either of the
# directories
$ python finddup.py -I -m -b 8k /home/foo /home/bar

# find duplicates by 'fuzzy' matching the names of files under /home/foo, /home/bar
# and /tmp/baz
$ python finddup.py -f /home/foo /home/bar /tmp/baz

# find duplicates by matching the exact names of files under /home/foo, /home/bar
# and /tmp/baz
$ python finddup.py /home/foo /home/bar /tmp/baz
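
To illustrate the checksum mode used in the examples above, here is a minimal sketch of grouping files by the md5 of their first N bytes. It is not taken from finddup.py: the function names (parse_size, prefix_md5, group_by_prefix_md5, split_groups) are hypothetical, and the handling of sizes such as "8k" is an assumption about how a -b style value might be interpreted.

```python
import hashlib
import os
from collections import defaultdict


def parse_size(text):
    """Convert a size such as '8k' or '4096' to a byte count (hypothetical helper)."""
    text = text.strip().lower()
    if text.endswith("k"):
        return int(text[:-1]) * 1024
    return int(text)


def prefix_md5(path, block_size):
    """Return the md5 hex digest of the first block_size bytes of a file."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read(block_size)).hexdigest()


def group_by_prefix_md5(directories, block_size=4096):
    """Map each prefix checksum to the list of files that share it."""
    groups = defaultdict(list)
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                try:
                    groups[prefix_md5(path, block_size)].append(path)
                except OSError:
                    continue  # skip files that cannot be read
    return groups


def split_groups(groups):
    """Return (duplicate groups, files without duplicates), both sorted."""
    duplicates = sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)
    uniques = sorted(paths[0] for paths in groups.values() if len(paths) == 1)
    return duplicates, uniques
```

Calling split_groups(group_by_prefix_md5(["/home/foo", "/home/bar"], parse_size("8k"))) would give both views at once: the first element corresponds to a duplicate report, the second to a report of files without duplicates. Hashing only a prefix is fast, but two files that agree on their first bytes can still differ later, so a larger block size or a full comparison is needed to confirm a match.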

If you find this useful, or have comments/suggestions, please let me know.
