My question is somewhat related to "How to improve searching with os.walk and fnmatch", but I want to expand on it a little.
Let's assume we have a file collection on a hard drive that is about 10-50 TB in size. I want to find all files with a specific extension on a regular basis. The collection changes daily as new files are added. On the first run, I want to store the information I gather, so that later runs only have to search the parts that have changed. I understand this as a kind of index of the file system, and I hope it will greatly speed up every subsequent search.
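To make the idea concrete, here is a rough sketch of the kind of index I have in mind, not a tested solution. It assumes that a directory's modification time changes whenever an entry is added to, removed from, or renamed in that directory, which I believe holds on common local filesystems but may not hold on every network share. The function names (`update_index`, `matching_files`, `load_index`, `save_index`) and the JSON layout are made up for illustration.

```python
import json
import os


def update_index(root, suffix, index):
    """Build a new index from an old one, re-listing only changed directories.

    The index maps each directory path to
    {"mtime": int, "files": [...], "subdirs": [...]}.
    Pass an empty dict as `index` on the first run.
    """
    new_index = {}
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            # Read the mtime before listing, so a change that happens
            # during the listing forces a re-scan on the next run.
            mtime = os.stat(path).st_mtime_ns
        except OSError:
            continue  # directory vanished or is unreadable
        cached = index.get(path)
        if cached is not None and cached["mtime"] == mtime:
            entry = cached  # unchanged: reuse the cached listing
        else:
            files, subdirs = [], []
            try:
                with os.scandir(path) as it:
                    for e in it:
                        if e.is_dir(follow_symlinks=False):
                            subdirs.append(e.path)
                        elif e.name.endswith(suffix) and e.is_file():
                            files.append(e.path)
            except OSError:
                continue
            entry = {"mtime": mtime,
                     "files": sorted(files),
                     "subdirs": sorted(subdirs)}
        new_index[path] = entry
        # Subdirectories are always visited, because a change deeper in
        # the tree does not update the mtime of the ancestors.
        stack.extend(entry["subdirs"])
    return new_index


def matching_files(index):
    """Return all indexed file paths, sorted."""
    return sorted(f for entry in index.values() for f in entry["files"])


def load_index(index_path):
    """Load a previously saved index, or return {} if there is none."""
    try:
        with open(index_path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_index(index, index_path):
    with open(index_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh)


# Intended use (both paths are placeholders):
#   old = load_index("index.json")
#   new = update_index("/data/collection", ".dat", old)
#   save_index(new, "index.json")
#   print(matching_files(new))
```

This still has to stat every directory on each run, so it only saves the cost of listing directories that did not change. It would also miss a change made within the timestamp granularity of the filesystem right after a scan.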
I prefer working in Python, but pointers to ready-made software solutions or open-source projects in other languages would be greatly appreciated as well.