I have a folder on my Red Hat server with approx. 500k files of various extensions. The naming convention for these files is based on a number, for example:
- a123456.csv
- z123456.jpg
- 123456.exe
- a234.jpg
- 234.exe
I designed a query that produces a list of all the numbers that should be deleted. Assuming I export this list daily/weekly into a txt file, what would be the most efficient way to delete every file in the folder whose number appears in the list?
Running a for loop over the folder for every number would take too long since there are too many files. I managed to produce a list of the numbers to delete that actually have files in this folder using:
join <(sort list.txt) <(ls /folder/with/0.5Mfiles | grep -v 'html$' | sed 's/[a-zA-Z.]*//g' | sort)
but that way I lose the original file names (e.g. z123456.jpg), so I have nothing to pass to rm.
What could be the most efficient way to do it?
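One direction I'm considering, to keep the original names: tag each filename with its numeric key, join on that key, and feed the surviving second column to rm in batches via xargs. A minimal sketch, assuming bash, GNU xargs, and filenames without whitespace; the directory and the sample files here are stand-ins for the real folder and list.txt:

```shell
# Demo in a throwaway directory; in the real setup $dir would be
# /folder/with/0.5Mfiles and list.txt the exported number list.
dir=$(mktemp -d)
touch "$dir"/a123456.csv "$dir"/z123456.jpg "$dir"/123456.exe \
      "$dir"/a234.jpg "$dir"/234.exe "$dir"/a999.jpg
printf '%s\n' 123456 234 > list.txt

# 1. Pair every filename with its numeric key: "123456 z123456.jpg".
ls "$dir" | grep -v 'html$' |
  awk '{ key = $0; gsub(/[^0-9]/, "", key); print key, $0 }' |
  sort -k1,1 > keyed_names.txt

# 2. Join on the key, keep only the second field (the original filename),
#    and delete in large batches instead of one rm per file.
join <(sort list.txt) keyed_names.txt | cut -d' ' -f2 |
  (cd "$dir" && xargs -r -d '\n' rm --)

ls "$dir"   # only a999.jpg should remain
```

The batching through xargs is the part that matters for 500k files: it avoids forking one rm per file, which is where a per-file loop loses most of its time.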