Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree.
Updated Nov 22, 2022 - Python
Advanced Duplicate File Finder for Python
🍰 A library for creating n-grams, skip-grams, bag of words, bag of n-grams, bag of skip-grams.
Program to scan and search for file duplicates. (~300MB/s)
Command Line Interface for deplicate
Takes an input CSV and produces a CSV of duplicate records. Then the input CSV is cleansed to remove duplicates.
Command line utility to remove exact duplicate files.
A no-nonsense .NET Core 2.1 CLI duplicate files remover
A tool that deduplicates the lines of a text file at RAM speed and scales well across all cores concurrently.
File duplicate remover for Synology DSM 213j+
rm-dup is a script to remove duplicate files
Modified Levenshtein distance algorithms that match strings using only deletions and capitalization changes; replacement and insertion of characters are not allowed.
Sort, uniq, reverse, and randomize data
Conducting EDA on Instacart orders
Data Manipulation of Biopic Dataset
Function that removes duplicate items and objects based on a key from an array of objects.
Java program to remove or find duplicates in a string
Searches for duplicates across two separate folders, allowing duplicated files to be removed from one folder while keeping the other intact.