So the gist is: after extracting the OCR/Tesseract data from a pool of images, I run re.findall(r'example') on the extracted text.
How would I fetch the source file that contains the word "Mountain"?
This part is still a bit vague to me. Can you help out? Thanks!
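import re

for index, row in df.iterrows():
    result = row['text']  # text from the OCR
    # re.search scans the whole string; re.match only matches at the very start
    file_1 = re.search(r'Mountain', result)
    file_2 = re.search(r'Lake', result)
    if file_1:
        print(index)  # placeholder: how do I fetch/get the original file that has the matching word for file_1?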
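
For reference, here is a minimal sketch of one way this could be done. It assumes the DataFrame was built with an extra column recording which image each row of OCR text came from. The column name `file` and the sample rows are made up for illustration; the real DataFrame may store the path under a different name, or not at all, in which case it would need to be added when the OCR results are collected.

```python
import re
import pandas as pd

# Hypothetical sample data: in practice 'text' would come from the OCR step,
# and 'file' (an assumed column name) would hold the image path for that row.
df = pd.DataFrame({
    "file": ["img_001.png", "img_002.png", "img_003.png"],
    "text": ["Blue Mountain trail map", "Lake house at dawn", "City skyline"],
})

pattern = re.compile(r"Mountain")

# Collect the source file of every row whose OCR text contains the word.
matching_files = [
    row["file"]
    for _, row in df.iterrows()
    if pattern.search(row["text"])
]
print(matching_files)
```

The key point is that the file name has to travel with the text: once both live in the same row, any match on `row['text']` can be traced back through `row['file']`. A vectorized filter such as `df.loc[df["text"].str.contains("Mountain"), "file"]` should give the same rows without an explicit loop.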