We're using Solr and Tika to search external data such as PDF and Doc files. However, this gives us only the raw text, without the formatting. We would also like to get the formatting and metadata, such as captions and bullets. Is there any way to get them?

Thank you, Moshe

Moshe
  • Hi Moshe, it would be great if you could highlight what you have already tried. – Patrick Geyer Apr 22 '14 at 08:43
  • We tried using EmbeddedResourceHandler to get the internal data within the document, but without success. If you have an example, that would be great. – Moshe May 21 '14 at 06:50

0 Answers