
In my script, I have the following:

response = requests.get(list_url[0], allow_redirects=True)
s = io.BytesIO()
s.write(response.content)
s.seek(0)
mimetype="application/octet-stream"
document = {'file': s.read(), 'mime': mimetype}
request = {"name": name, "document": document}

However, when I send a request to the server:

result = client.process_document(request=request)

I get ValueError: Protocol message Document has no "file" field.

Is this because Google Document AI doesn't accept application/octet-stream?

An old man in the sea.

1 Answer


I checked the latest version of the Document AI Python client, DocumentProcessorServiceClient, and found that process_document expects its request argument to be a ProcessRequest object. You can check the details of that function on the process_document GitHub code page.

ProcessRequest accepts either an inline_document or a raw_document (the two are mutually exclusive). Based on your code, it looks like you are passing a raw document, which only accepts the fields content and mime_type. Use those instead of file and mime, and pass the dictionary under the raw_document key rather than document.

If you check the sample for the Document AI Python client library, you will find these lines, which show how it should be implemented:

...
    document = {"content": image_content, "mime_type": "application/pdf"}

    # Configure the process request
    request = {"name": name, "raw_document": document}

    result = client.process_document(request=request)
...

For additional details, you can check the official GitHub project for Document AI and the official Google page for the Python client library.
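Applied to the code in the question, the change could look like the sketch below. The helper name build_process_request and the placeholder processor name and bytes are assumptions made for illustration, not part of the client library; only the dictionary keys (name, raw_document, content, mime_type) come from the sample above. The bytes returned by requests can be passed directly, so the io.BytesIO round trip is not needed.

```python
def build_process_request(name: str, content: bytes, mime_type: str) -> dict:
    """Hypothetical helper: build a dict shaped like a ProcessRequest
    that carries the file as a raw_document."""
    if not isinstance(content, (bytes, bytearray)):
        raise TypeError("content must be raw bytes, e.g. response.content")
    return {
        "name": name,
        # raw_document takes "content" and "mime_type", not "file" and "mime"
        "raw_document": {"content": bytes(content), "mime_type": mime_type},
    }


# Placeholder values for illustration only; in the question these would be
# the processor name and the bytes from requests.get(...).content.
request = build_process_request(
    "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID",
    b"%PDF-1.4 placeholder bytes",
    "application/pdf",
)

# With a configured client this would then be sent as in the sample:
# result = client.process_document(request=request)
```

The MIME type in this sketch follows the sample (application/pdf); it should describe the actual file being sent.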

Betjens