I have a remote Tika server set up and I'm trying to use it from within a RoR application. I need to pull a file from a remote location and send it on to the Tika server. The wiki for TikaJAXRS gives an example using curl, but I have not been able to get that to work. What does work is this:
curl https://mydomain.s3.amazonaws.com/uploads/testdocument.docx | curl -v -i -X PUT -T - ec2...154.uswest2.compute.amazonaws.com:9998/tika
How do I write the equivalent of this in my Rails app using Net::HTTP? I've successfully made a GET request with Net::HTTP to the Tika server from the Rails app and gotten back the expected result, but the documentation on PUT is a bit sparse. (The server does require a PUT rather than a POST.)
BTW, if anyone knows how to make the last example in that wiki work and how to write it with Net::HTTP, that would be even better!
Addendum:
Here's what I have in the RoR app that doesn't work:
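require 'net/http'
require 'open-uri' # provides #read on the URI object below

ENDPOINT = "http://ec2...154.us-west-2.compute.amazonaws.com:9998"
file = "https://mydomain.s3.amazonaws.com/uploads/testdocument.docx"
uri = URI.parse(ENDPOINT)
@http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Put.new("/tika")
# Download the remote file and use its contents as the PUT body.
# Note that no Content-Type header is set explicitly here.
request.body = URI.parse(file).read
@response = @http.request(request)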
This returns a 415 (Unsupported Media Type) response.
I need to know how to change this code so that it does what the curl commands (curl of the remote file piped into a curl PUT) do successfully.
Update
After a couple of days of fruitless attempts on this, I have a workaround:
# Gemfile
gem 'curb'

# In the app
@response = Curl.put("http://ec2...154.us-west-2.compute.amazonaws.com:9998/tika",
                     Curl.get("https://mydomain.s3.amazonaws.com/uploads/testdocument.docx").body_str)
While this does provide a solution to my immediate problem, I still want to know how to implement this same functionality more directly by using Net::HTTP.
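For reference, here is a minimal sketch of what I would expect the Net::HTTP version to look like. It rests on an assumption I have not confirmed against the server: that the 415 comes from the Content-Type header. Net::HTTP (at least in older versions) falls back to application/x-www-form-urlencoded when a request has a body and no Content-Type, whereas curl -T sends none. The helper names build_tika_put and extract_text are hypothetical, and application/octet-stream is a guess at a type that leaves detection to Tika; the exact docx MIME type could be passed instead.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds the PUT request without sending it.
# The default content type is an assumption, not something the Tika
# wiki prescribes.
def build_tika_put(path, body, content_type = 'application/octet-stream')
  request = Net::HTTP::Put.new(path)
  request.body = body
  # Set the type explicitly so Net::HTTP does not fall back to its
  # form-encoded default.
  request['Content-Type'] = content_type
  request['Accept'] = 'text/plain'
  request
end

# Hypothetical helper: downloads the document, then PUTs it to Tika.
# Net::HTTP.get does not follow redirects, so file_url must point
# directly at the file.
def extract_text(tika_url, file_url)
  document = Net::HTTP.get(URI.parse(file_url)) # response body as a String
  tika = URI.parse(tika_url)
  Net::HTTP.start(tika.host, tika.port) do |http|
    http.request(build_tika_put(tika.request_uri, document))
  end
end
```

With the URLs above, the call would be extract_text(tika_url, file_url), where tika_url ends in /tika and file_url is the S3 address; the return value is the Net::HTTPResponse from the Tika server.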