I have an Elasticsearch Service on AWS that I would like to access from Spark with elasticsearch-spark in a node-to-node configuration, so that Spark workers can connect to the Elasticsearch nodes in parallel. However, Amazon provides only one endpoint for accessing the cluster.

So far, the only way I have managed to connect to the service from Spark is by setting

es.nodes.wan.only = true

This disables node discovery and uses the only address I have, so Spark connects to just one node in the Elasticsearch cluster, which is the exact opposite of what I want.
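
As an illustration, the single-endpoint setup described above might look roughly like this from PySpark. This is only a sketch: the helper name `es_wan_only_options`, the endpoint hostname, and the index name are placeholders, and it assumes the service is reached over HTTPS on port 443 with the elasticsearch-spark jar on the classpath.

```python
def es_wan_only_options(endpoint: str, port: int = 443) -> dict:
    """Build elasticsearch-hadoop options for a single-endpoint (WAN-only) connection.

    Hypothetical helper for illustration; the keys are standard
    elasticsearch-hadoop settings, the values are assumptions.
    """
    return {
        "es.nodes": endpoint,          # the one address the service exposes
        "es.port": str(port),
        "es.nodes.wan.only": "true",   # disables node discovery
        "es.net.ssl": "true" if port == 443 else "false",
    }


# Placeholder endpoint, not a real domain.
options = es_wan_only_options("search-example.us-east-1.es.amazonaws.com")

# With a SparkSession named `spark`, the options could then be passed as:
# df = (spark.read.format("org.elasticsearch.spark.sql")
#       .options(**options)
#       .load("my-index"))
```

With these settings every Spark task goes through the same address, which is the limitation I am trying to get around.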

Is there a way to allow connecting to multiple nodes using Amazon's Elasticsearch Service?

ami232
  • I don't think this is possible out of the box for the time being. Other solutions may require a lot of expertise: sharing resources between Spark and Elasticsearch on the same nodes 1-to-1, then snapshotting into S3 and loading the snapshot on the service. That's a lot of work, though... – eliasah Jun 19 '17 at 12:29

0 Answers