We have a large collection of PDF, Word, and PowerPoint files, each more than 16 MB in size, stored in MongoDB GridFS. We need to run content searches over those documents and list the documents that contain the search text. We have tried retrieving each document, extracting its text, and then searching it, but that is very slow and not feasible, since the number of documents will keep growing over time.
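To make the bottleneck concrete, the retrieve-and-extract approach described above amounts to roughly the following sketch. The names `scan_for_term`, `StoredFile`, and `extract_text` are hypothetical and only illustrate the shape of the problem; the file objects are assumed to expose `.filename` and `.read()`, as PyMongo's `GridOut` does.

```python
from typing import Callable, Iterable, List, Protocol


class StoredFile(Protocol):
    """Minimal shape assumed for a stored file (PyMongo's GridOut has both)."""

    filename: str

    def read(self) -> bytes: ...


def scan_for_term(
    files: Iterable[StoredFile],
    extract_text: Callable[[bytes], str],
    term: str,
) -> List[str]:
    """Return the names of files whose extracted text contains `term`."""
    needle = term.lower()
    matches = []
    for f in files:
        # Every search re-reads and re-parses every file, so the cost of a
        # single query grows with the total size of the collection.
        text = extract_text(f.read())
        if needle in text.lower():
            matches.append(f.filename)
    return matches
```

With GridFS, `files` would presumably be something like the cursor returned by `gridfs.GridFS(db).find()`, and `extract_text` would wrap whichever PDF/Word/PowerPoint parser is in use. Nothing is indexed or cached here, which is why each query takes time proportional to the whole collection.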
Is there any other way to achieve this? We have already searched SO for similar topics, including the one below, but nothing has helped so far:
Full-text search on MongoDB GridFS?
We have also looked at alternatives such as Elasticsearch, but couldn't find any current references or examples; most of the available information is out of date. Any pointers would really help.