Takes an input CSV and produces a CSV of duplicate records. Then the input CSV is cleansed to remove duplicates.

pattyjula/duplicate-cleansing

duplicate_cleansing

Takes an input CSV and produces a CSV of duplicate records. Then the input CSV is cleansed to remove duplicates.

Run find_duplicates.py first. If the resulting dupes file contains more than the header line, remove_duplicates.py then runs and creates the clean file.
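The two-step flow described above can be sketched in Python roughly as follows. This is a minimal illustration, not the code in find_duplicates.py or remove_duplicates.py: the function names (`find_duplicates`, `has_records`, `remove_duplicates`, `cleanse`) and file paths are hypothetical. It assumes that a "duplicate" is a row matching an earlier row in every column, that the first occurrence is the one kept, and that the input has a header line.

```python
import csv


def find_duplicates(input_path, dupes_path):
    """Write the header plus every repeated row to dupes_path; return the repeat count.

    Assumption: a row is a duplicate when all of its fields match an earlier row.
    """
    seen = set()
    dupes = []
    with open(input_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # assumes the input has a header line
        for row in reader:
            key = tuple(row)
            if key in seen:
                dupes.append(row)
            else:
                seen.add(key)
    with open(dupes_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(dupes)
    return len(dupes)


def has_records(csv_path):
    """Return True if the file holds at least one row after the header line."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header
        return next(reader, None) is not None


def remove_duplicates(input_path, clean_path):
    """Copy the input to clean_path, keeping only the first occurrence of each row."""
    seen = set()
    with open(input_path, newline="") as src, open(clean_path, "w", newline="") as out:
        reader = csv.reader(src)
        writer = csv.writer(out)
        writer.writerow(next(reader))  # copy the header unchanged
        for row in reader:
            key = tuple(row)
            if key not in seen:
                seen.add(key)
                writer.writerow(row)


def cleanse(input_path, dupes_path, clean_path):
    """Run both steps; the clean file is written only when duplicates were found."""
    find_duplicates(input_path, dupes_path)
    if has_records(dupes_path):
        remove_duplicates(input_path, clean_path)
        return True
    return False
```

Calling `cleanse("input.csv", "dupes.csv", "clean.csv")` with placeholder file names would write the dupes file in every case and the clean file only when the dupes file has rows beyond its header. If duplicates should instead be decided on a subset of columns, the `key` tuple is the place to change.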
