How to find missing files among 10,000 of them
The other day, I created 10,000 pieces of NFT art using the HashLips engine.
I tried to upload all of them using the IPFS desktop client. I selected every file for upload, but I noticed that many files were missing from the uploaded list. It seems some of them were not accepted for some reason. Maybe I tried to upload too many files at once.
I wanted to know which ones were missing, but with nearly 10,000 items, checking manually is far too inefficient. The IPFS desktop client does not even have a simple search feature. I started making the list of missing files literally with pen and paper.
I covered only a few hundred items in an hour. There were still more than 9,700 items to go through. It could take a week.
So I downloaded all the files from the IPFS client, printed the downloaded file names in a terminal, copied all the names, and then wrote a script.
const files = `
1.png 2.png ...
5010.png 5011.png 5013.png ...
9987.png 9999.png
`
const filenames = files
  .trim() // removes the leading and trailing line breaks, so no empty entries are left at either end
  .split(/\s+/) // converts the string to an array; any run of spaces or line breaks is a delimiter
const fileNumbersWithoutExtension = filenames
  .map(name => parseInt(name.replace(".png", ""), 10)) // "5010.png" becomes 5010
  .filter(Number.isInteger) // skips anything that did not parse as a number
const sorted = fileNumbersWithoutExtension.sort((a, b) => a - b); // sorts in ascending order
const missing = findMissing(sorted)
console.log(JSON.stringify(missing, null, 2)) // [678, 989, 2321, 2897, 4090, 6112, 9899]
// when the list is long, console.log alone shortens the array in the terminal;
// JSON.stringify turns it into a string, so every item is printed
// a function declaration is hoisted, so it can be called above even though it is defined here
function findMissing(num) {
  const max = Math.max(...num)
  const min = Math.min(...num)
  const present = new Set(num) // Set lookups are much faster than num.includes(i) on a long array
  const missing = []
  for (let i = min; i <= max; i++) {
    if (!present.has(i)) {
      // i (the current value) is not in num (the argument), so it is missing
      missing.push(i)
    }
  }
  return missing;
}
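One thing to keep in mind: findMissing only looks between the smallest and largest number it is given, so a missing 1.png or a missing last file would go unnoticed. The sketch below is one possible variation, not the script I actually ran. It assumes Node.js, assumes the files are numbered from 1 up to a known total, and reads the names straight from a folder instead of a copied terminal listing. The helper names (readNames, numbersFromNames, findMissingInRange) and the "./downloaded" path are made up for illustration.

```javascript
const fs = require("fs");

// Read the file names in a folder (dir is whatever folder holds the downloads).
function readNames(dir) {
  return fs.readdirSync(dir);
}

// Turn names such as "12.png" into numbers; anything else is ignored.
function numbersFromNames(names) {
  return names
    .map((name) => /^(\d+)\.png$/.exec(name))
    .filter((match) => match !== null)
    .map((match) => parseInt(match[1], 10));
}

// Return every number from first to last (inclusive) that is not in numbers.
function findMissingInRange(numbers, first, last) {
  const present = new Set(numbers);
  const missing = [];
  for (let i = first; i <= last; i++) {
    if (!present.has(i)) missing.push(i);
  }
  return missing;
}

// Hypothetical usage with a real folder ("./downloaded" is a placeholder path):
// const numbers = numbersFromNames(readNames("./downloaded"));
// console.log(JSON.stringify(findMissingInRange(numbers, 1, 10000), null, 2));

// Small demo with a hard-coded list: 3 and 5 are reported as missing.
const demo = ["1.png", "2.png", "4.png", "notes.txt"];
console.log(JSON.stringify(findMissingInRange(numbersFromNames(demo), 1, 5)));
```

Passing the expected first and last number explicitly is what catches gaps at either end of the range.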