I have an AWS Elasticsearch cluster and I have created an index on it. I want to upload 1 million documents into that index. I am using the Python package elasticsearch, version 6.0.0, to do so.
My payload structure is similar to this -
{
    "a": 1,
    "b": 2,
    "a_info": {
        "id": 1,
        "name": "Test_a"
    },
    "b_info": {
        "id": 1,
        "name": "Test_b"
    }
}
After a discussion in the comment section, I realised that the total number of fields in a document also includes its subfields. So in my case, the total number of fields in each document comes to 60.
I have tried the following methods -
- Using the bulk() interface as described in the documentation (https://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.bulk); a minimal sketch of my call is included after this list.
The error that I received using this method was -
- A timeout response after waiting for ~10-20 minutes.
With this method, I have also tried uploading documents in batches of 100, but I still get timeouts.
- I have also tried adding documents one by one as per the documentation (https://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.create); this attempt is also sketched below. This method takes a lot of time to upload even one document, and I am getting this error for a few of the documents -
TransportError(500, u'timeout_exception', u'Failed to acknowledge mapping update within [30s]')
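For reference, this is roughly what my bulk() attempt looks like. It is a simplified sketch: the endpoint is a placeholder, and documents here is just the sample payload repeated, standing in for my real ~1 million documents with ~60 fields each.

from elasticsearch import Elasticsearch

# Placeholder endpoint for my AWS ES domain
es = Elasticsearch(["https://my-domain.us-east-1.es.amazonaws.com:443"], timeout=120)

# Placeholder data; my real documents each have ~60 fields
documents = [
    {"a": 1, "b": 2,
     "a_info": {"id": 1, "name": "Test_a"},
     "b_info": {"id": 1, "name": "Test_b"}}
] * 1000

batch_size = 100
for start in range(0, len(documents), batch_size):
    body = []
    for doc in documents[start:start + batch_size]:
        body.append({"index": {}})  # action/metadata line for the bulk API
        body.append(doc)            # the document source
    es.bulk(body=body, index="Test", doc_type="_doc")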
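The one-document-at-a-time attempt is essentially this, reusing the same es client and documents list from the sketch above (the id scheme is a placeholder; my real documents have their own ids):

# create() requires an explicit id; here the loop index is used as a placeholder
for i, doc in enumerate(documents):
    es.create(index="Test", doc_type="_doc", id=i, body=doc)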
My index settings are these -
{"Test":{"settings":{"index":{"mapping":{"total_fields":{"limit":"200000000"}},"number_of_shards":"5","provided_name":"Test","creation_date":"1557835068058","number_of_replicas":"1","uuid":"LiaKPAAoRFO6zWu5pc7WDQ","version":{"created":"6050499"}}}}}
I am new to the Elasticsearch domain. How can I upload my documents to the AWS ES cluster quickly?