I'm trying to send a PDF to a Tika server for content extraction, but I always get the error: "Cannot convert text from stream using the source encoding".
This is how Tika expects files to be sent:
"All services that take files use HTTP "PUT" requests. When "PUT" is used, the original file must be sent in request body without any additional encoding (do not use multipart/form-data or other containers)." Source https://wiki.apache.org/tika/TikaJAXRS#Services
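As far as I understand that requirement, the request should look roughly like the sketch below: a PUT whose body is the unmodified file content, with no multipart wrapper. The helper name putRawFile and its parameters are hypothetical (they are not part of Tika or of any library), and setting Content-Type is an assumption on my part rather than something the quoted documentation demands.

```javascript
// Minimal sketch, not a confirmed fix: PUT the raw file content as the request body.
// `putRawFile`, `XhrCtor`, `body` and `contentType` are hypothetical names for illustration.
function putRawFile(XhrCtor, url, body, contentType) {
  var xhr = new XhrCtor();
  xhr.open("PUT", url);
  // Describe the payload type (assumption: this helps the server; no multipart/form-data is used).
  xhr.setRequestHeader("Content-Type", contentType);
  // Ask for the extracted content as plain text.
  xhr.setRequestHeader("Accept", "text/plain");
  // The body is passed through unchanged, without any additional container or encoding.
  xhr.send(body);
  return xhr;
}
```

In my case the call would be putRawFile(XMLHttpRequest, "http://localhost:9998/tika", blob, "application/pdf"), where blob holds the PDF content.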
What is the correct way of sending the file with XMLHttpRequest()?
Code:
var response, error, file, blob, url, xhr;
file = new File("/PROJECT/web/dateien/ai/pdf.pdf");
blob = file.toBuffer().toBlob("application/pdf");
url = "http://localhost:9998/tika";
// send data
try {
    xhr = new XMLHttpRequest();
    xhr.open("PUT", url);
    xhr.setRequestHeader("Accept", "text/plain");
    xhr.send(blob);
} catch (e) {
    error = e;
}
({
    response: xhr.responseText,
    status: xhr.statusText,
    error: error,
    type: xhr.responseType,
    blob: blob
});
Error: