
I am planning to use Apache Tika Server 2.5 from .NET 6. How can we set it up and call it from a .NET component?

BChe
  • Install Java, start the server, make REST calls to it? – Gagravarr Oct 10 '22 at 12:48
  • I downloaded Apache Tika Server Standard 2.5.0, started the server with the command java -jar tika-server-standard-2.5.0.jar, and started browsing the endpoints, but a few are not accessible, namely tika/form and rmeta/form. – BChe Oct 10 '22 at 13:20
  • @Gagravarr I want to send files in multiple formats and parse them all in a single API call, but I am unable to do it. – BChe Oct 10 '22 at 13:39
  • See https://cwiki.apache.org/confluence/display/TIKA/TikaServer#TikaServer-TikaServerServices for the list of supported endpoints, and if they are GET / POST / PUT – Gagravarr Oct 10 '22 at 14:34
  • @Gagravarr Thanks for the reply. Yes, the tika/form and rmeta/form endpoints are listed on Confluence, but when we start the server, browse to http://localhost:9998/ and click on these endpoints, the browser shows "this page isn't working right now" and the error we get is HTTP 405 Method Not Allowed. – BChe Oct 10 '22 at 15:04
  • As documented on the Wiki and in the App, those methods require a POST, so a GET won't work – Gagravarr Oct 10 '22 at 21:54
  • @Gagravarr I am making a POST call but getting this error: Problem with writing the data, class org.apache.tika.server.core.resource, ContentType: text/xml – BChe Oct 11 '22 at 05:46
  • @Gagravarr If you have a reference document or snippet for calling Tika Server from .NET using multipart, that would help. – BChe Oct 11 '22 at 05:50
  • I don't code in .net, but you can find a bunch of Java ones which should be pretty similar in the Apache Tika test suite, eg https://github.com/apache/tika/blob/main/tika-server/tika-server-standard/src/test/java/org/apache/tika/server/standard/TikaResourceTest.java#L533 – Gagravarr Oct 11 '22 at 10:00
  • @Gagravarr Does Tika keep the parent-child relationship between a document and its attachments? – BChe Oct 12 '22 at 04:24
  • Depends on what API you call - some will skip attachments, some will inline, some will return you a bundle with references. Up to you which one works best in your situation – Gagravarr Oct 12 '22 at 09:17
  • @Gagravarr For multipart support, when I call the POST endpoint tika/form or rmeta/form with 2 documents attached, it only accepts and parses the 1st document. – BChe Oct 13 '22 at 09:20
  • @Gagravarr I think we cannot pass multiple files in a single request to the tika/form or rmeta/form endpoints. – BChe Oct 14 '22 at 11:58
  • You'd be better off asking Tika usage questions on the user mailing list, lots more people monitor that - https://tika.apache.org/mail-lists.html – Gagravarr Oct 14 '22 at 13:09
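
For anyone landing here with the same 405: clicking an endpoint on the server's welcome page makes the browser issue a GET, while the parsing endpoints expect a PUT or POST with the document in the body. Below is a minimal sketch of a PUT to the tika endpoint from .NET 6, assuming the server runs locally on port 9998 as in the comments above. The helper names (BuildTikaPut, ExtractTextAsync) are made up for illustration and are not part of any Tika or .NET API.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Assumption: Tika Server was started with `java -jar tika-server-standard-2.5.0.jar`
// and listens on localhost:9998.
var tikaBase = new Uri("http://localhost:9998/");

// Building the request needs no running server, so it can be inspected offline.
var demo = BuildTikaPut(tikaBase, new byte[] { 1, 2, 3 });
Console.WriteLine($"{demo.Method} {demo.RequestUri}");

// Hypothetical helper: PUT the raw file bytes to the tika endpoint.
static HttpRequestMessage BuildTikaPut(Uri baseUri, byte[] fileBytes)
{
    var request = new HttpRequestMessage(HttpMethod.Put, new Uri(baseUri, "tika"))
    {
        Content = new ByteArrayContent(fileBytes)
    };
    // Ask for plain text rather than the default XHTML output.
    request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("text/plain"));
    return request;
}

// Hypothetical helper: send the request and return the extracted text.
// Not called above because it needs a running server.
static async Task<string> ExtractTextAsync(HttpClient client, Uri baseUri, string path)
{
    using var request = BuildTikaPut(baseUri, await File.ReadAllBytesAsync(path));
    using var response = await client.SendAsync(request);
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();
}
```

With the server running, something like `await ExtractTextAsync(new HttpClient(), tikaBase, "sample.docx")` should return the document text; the file name here is only a placeholder.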

1 Answer


This is what worked for me:

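var client = new HttpClient();
string uriString = "Your API endpoint address here";
using var stream = File.OpenRead(filename);
// Wrap the file stream in a multipart/form-data body.
using var content = new MultipartFormDataContent { new StreamContent(stream) };
// .Result blocks the calling thread; inside an async method, use await instead.
HttpResponseMessage response = client.PostAsync(uriString, content).Result;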
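
Since the comments report that tika/form and rmeta/form only parsed the first of two attached documents, one workaround is to send each file in its own request. The sketch below does that against rmeta/form and then reads the JSON array that comes back, where embedded documents carry an X-TIKA:embedded_resource_path entry that ties them to their container. Treat it as a rough outline under assumptions: the helper names are hypothetical, the part name "file" is my own choice rather than something Tika documents, and the sample JSON is hand-written for illustration, not real server output.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hand-written sample shaped like an rmeta response (assumption, not real output):
// the first entry is the container, later entries are embedded documents.
var sample = "[{\"X-TIKA:content\":\"parent\"}," +
             "{\"X-TIKA:embedded_resource_path\":\"/attachment.pdf\"}]";
Console.WriteLine(string.Join(", ", EmbeddedPaths(sample)));

// Hypothetical helper: POST one file as multipart/form-data and return the raw JSON.
static async Task<string> PostToRmetaFormAsync(HttpClient client, Uri endpoint, string path)
{
    using var stream = File.OpenRead(path);
    using var content = new MultipartFormDataContent
    {
        // The part name "file" is an assumption made for this sketch.
        { new StreamContent(stream), "file", Path.GetFileName(path) }
    };
    using var response = await client.PostAsync(endpoint, content);
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();
}

// Hypothetical helper: one request per file, keyed by file path.
static async Task<Dictionary<string, string>> ParseEachAsync(
    HttpClient client, Uri endpoint, IEnumerable<string> paths)
{
    var results = new Dictionary<string, string>();
    foreach (var path in paths)
    {
        results[path] = await PostToRmetaFormAsync(client, endpoint, path);
    }
    return results;
}

// Hypothetical helper: list the embedded path of every entry in an rmeta JSON array.
static List<string> EmbeddedPaths(string rmetaJson)
{
    var paths = new List<string>();
    using var doc = JsonDocument.Parse(rmetaJson);
    foreach (var entry in doc.RootElement.EnumerateArray())
    {
        // Entries without the key are treated as the container document.
        paths.Add(entry.TryGetProperty("X-TIKA:embedded_resource_path", out var p)
            ? p.ToString()
            : "(container)");
    }
    return paths;
}
```

Calling `ParseEachAsync(client, new Uri("http://localhost:9998/rmeta/form"), files)` costs one round trip per file, which is slower than a single batch request but keeps each response clearly tied to the file that produced it.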
Jeremy Caney
BChe