
I am trying to read 9 GB of JSON data (spread across multiple files) and load it into Elasticsearch using the Spark Elasticsearch connector.

It took more time than expected: the job ran 288 tasks, each writing about 32 MB and taking around 19 s to complete. One of the documents suggested writing smaller chunks of data to ES, so I added these settings to my Spark config:

    conf.set("es.batch.size.bytes","2000000");
    conf.set("es.batch.size.entries","1500");

But I don't see these settings reflected when the tasks run: it is still 288 tasks and the same 32 MB per task. Can someone please help me understand how to use these configs? Thanks in advance.
