How do I insert a PDF or Word file into Elasticsearch as a document? Will Elasticsearch store files? If so, please point me to the relevant documentation and share some information about it.
Viewed 381 times
1 Answer
1
You can use the Mapper Attachments plugin to extract and index the text of PDF and Word documents. However, I would advise doing the text extraction outside of Elasticsearch and sending only the extracted text to Elasticsearch for indexing. Text extraction is a tricky process: outside of Elasticsearch you have more choice of extraction mechanisms, and bugs in the extraction library will not affect the stability of Elasticsearch.
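As a rough illustration of extracting text outside of Elasticsearch, here is a minimal Python sketch that uses only the standard library. It reads the paragraphs of a .docx file (which is a ZIP archive containing `word/document.xml`) and builds an HTTP request that would index the extracted text. The function names, the `filename` and `content` field names, the index name, and the `/{index}/_doc/{id}` URL layout are all assumptions made for this example. The URL layout in particular depends on your Elasticsearch version. PDF files need a dedicated extraction library (the plugin itself relies on Apache Tika), which is not shown here.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML elements inside a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(path_or_file):
    """Return the plain text of a .docx file, one line per paragraph.

    Hypothetical helper: it ignores headers, footers, and footnotes,
    which live in other parts of the archive.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def build_index_request(base_url, index, doc_id, filename, text):
    """Build (but do not send) a PUT request that indexes the text.

    The field names and the URL layout are assumptions for this sketch.
    """
    body = json.dumps({"filename": filename, "content": text}).encode("utf-8")
    url = "{}/{}/_doc/{}".format(base_url.rstrip("/"), index, doc_id)
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

To send the document, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_index_request("http://localhost:9200", "docs", "1", "report.docx", extract_docx_text("report.docx")))`. Only the extracted text travels to Elasticsearch, so no base64 encoding of the original file is involved.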

imotov
-
I want to store documents in Elasticsearch. Can that be done? – Pooja Sep 21 '15 at 17:52
-
Am I supposed to encode the file to base64? I have a large number of PDF files, so doing that would be highly inconvenient. – Pooja Sep 21 '15 at 19:32
-
That's why I said that I advise performing text extraction outside of Elasticsearch and not storing PDF/Word documents in Elasticsearch. – imotov Sep 21 '15 at 19:49