I am trying to import many thousands of records into Arango. I am attempting to use the batch/bulk import feature of Arango described at: https://docs.arangodb.com/3.0/HTTP/BatchRequest/index.html to do a combination of PUT and POST requests to either insert new records, or update existing records if they already exist. My end solution needs to run from a Python script, presumably using pyArango. I have created a sample HTTP request

POST http://<arango_server>:8529/_db/myDB/_api/batch

that looks something like the following:

Content-Type: multipart/form-data; boundary=P1X7QNCB
Content-Length: <calculated by python or REST Client>
Authorization: Basic <calculated by python requests session or REST Client>

--P1X7QNCB
Content-type: application/x-arango-batchpart
Content-Id: 1

POST /_api/document/model/foo HTTP/1.1


{"data": "bar"}
--P1X7QNCB

I have not been able to get this to process successfully in Arango. I have tried using Python similar to the following (which generates the above request, even if my approximation of the code below has typos):

url = "/_api/document/" + collection + "/" + nodeKey + " HTTP/1.1"
postString = ("--P1X7QNCB\r\n"
              "Content-type: application/x-arango-batchpart\r\n"
              "Content-Id: " + str(counter) +  "\r\n"
              "\r\n"
              "\r\n"
              "PUT " + url+ "\r\n\r\n\r\n" + json.dumps(nodeData) + "\r\n")
batchHeaders = {"Content-Type": "multipart/form-data; boundary=P1X7QNCB"}
response = self.db.connection.session.post(self.db.URL + "/batch", data=postString, headers=batchHeaders)

and using a REST client where I manually post the content. In both cases I get the following response back:

{"error":true,"errorMessage":"invalid multipart message received","code":400,"errorNum":400}

And the following is logged in the arango log file:

WARNING received a corrupted multipart message

Is it obvious to anyone what I am doing wrong, or where I can look for more details on why ArangoDB is rejecting the requests?

Thanks!

1 Answer

ArangoDB will throw this error when it tries to extract the next part of a multipart mime container and fails to.

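You should inspect your boundary strings and check that the last one properly terminates the container with two trailing dashes (--). With the boundary used in the question, the final line of the body must be --P1X7QNCB-- rather than --P1X7QNCB.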

ngrep or Wireshark tend to be very useful for inspecting what is really sent by programs (it may not be what you think), or for capturing samples of how other programs do it.
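As a rough illustration of the boundary handling described above, here is a minimal sketch of a helper that assembles a batch body in Python. The function name build_batch_body and the example paths are hypothetical and not part of pyArango or ArangoDB. It assumes a single blank line between the part headers and the embedded request line, which the follow-up comments report the parser requires, and it always appends the closing boundary with the two trailing dashes.

```python
import json

CRLF = "\r\n"


def build_batch_body(operations, boundary="P1X7QNCB"):
    """Assemble a multipart body for POST /_api/batch (hypothetical helper).

    operations is an iterable of (method, path, document) tuples;
    document may be None for requests without a body.
    """
    parts = []
    for content_id, (method, path, document) in enumerate(operations, start=1):
        payload = "" if document is None else json.dumps(document)
        parts.append(
            "--" + boundary + CRLF
            + "Content-Type: application/x-arango-batchpart" + CRLF
            + "Content-Id: " + str(content_id) + CRLF
            + CRLF  # exactly one blank line after the part headers
            + method + " " + path + " HTTP/1.1" + CRLF
            + CRLF  # blank line separating the request line from its body
            + payload + CRLF
        )
    # The closing delimiter is the boundary followed by two extra dashes.
    return "".join(parts) + "--" + boundary + "--" + CRLF


if __name__ == "__main__":
    # Collection "model" and key "foo" are taken from the question;
    # the exact document paths are an assumption for illustration.
    body = build_batch_body([
        ("POST", "/_api/document/model", {"data": "bar"}),
        ("PUT", "/_api/document/model/foo", {"data": "baz"}),
    ])
    # repr() makes the CRLF sequences and the final "--" visible.
    print(repr(body))
```

The resulting string would be sent as the request body with the header Content-Type: multipart/form-data; boundary=P1X7QNCB, as in the question. Printing the body with repr() before sending is a cheap first check of the line endings and the terminator, although a packet capture remains the more reliable way to see what actually goes over the wire.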

dothebart
  • Thanks. I missed the terminating "--" on the final boundary string. However, with that fixed Arango now accepts the multipart request with a 200 response, but the individual multipart responses are all: {"error":true,"errorMessage":"'METHOD' not implemented","code":501,"errorNum":9}. Is there something that needs to be done to enable "batch" requests? I am running ArangoDB 3.0.6 on Linux. – muddlednbefuddled Sep 30 '16 at 17:03
  • Update: I got the batch post working. My second problem was the double blank line between the multipart header and the POST/PUT commands. The arangod multipart parser appears to require a single blank line and expects to find the 'METHOD' on the second line after the header. – muddlednbefuddled Sep 30 '16 at 18:02