
In the Elasticsearch Spark/Hadoop documentation, I read about the following option:

es.nodes.wan.only (default : false)

Whether the connector is used against an Elasticsearch instance in a cloud/restricted environment over the WAN, such as Amazon Web Services. In this mode, the connector disables discovery and only connects through the declared es.nodes during all operations, including reads and writes. Note that in this mode, performance is highly affected.

My cloud provider has put HAProxy in front of Elasticsearch, so I have to set this option to true.
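For reference, here is a minimal sketch of what that configuration might look like from PySpark. The host name `es-proxy.example.com`, the port, the index name, and the helper `es_proxy_options` are hypothetical placeholders; only the `es.nodes`, `es.port`, and `es.nodes.wan.only` keys come from the connector's settings.

```python
def es_proxy_options(proxy_host, proxy_port=9200):
    """Build connector options for a cluster reachable only through a proxy.

    Hypothetical helper: it just assembles a dict of string settings.
    """
    return {
        # The single endpoint exposed by the proxy (placeholder value).
        "es.nodes": proxy_host,
        "es.port": str(proxy_port),
        # Disable node discovery; connect only through the declared es.nodes.
        "es.nodes.wan.only": "true",
    }


options = es_proxy_options("es-proxy.example.com", 443)

# Assuming PySpark and the elasticsearch-hadoop jar are available, the options
# could then be passed to a read such as:
#   spark.read.format("org.elasticsearch.spark.sql").options(**options).load("my-index")
```

With these settings every read and write goes through the proxy endpoint rather than directly to the individual nodes, which is the situation the documentation's performance note refers to.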

So my understanding of this kind of architecture is that I have only a single URL endpoint to connect to ES, with some high availability (and load balancing) thanks to HAProxy, but on the other hand it hurts performance a lot?

Could you please clarify, from your experience, whether putting HAProxy in front of Elasticsearch is a good practice or not?

Thank you

Klun

1 Answer


So my understanding of this kind of architecture is that I have only a single URL endpoint to connect to ES, with some high availability (and load balancing) thanks to HAProxy

Yes, a lot of projects I have worked on place HAProxy in front of Elasticsearch and then send requests only to that single endpoint, which takes care of the load balancing.

but on the other hand it hurts performance a lot?

No, it's the other way around.
It's similar to the ELK stack, where Logstash's monitoring pipeline throws errors because it is unable to recover killed connections behind a load balancer. The fix there is to disable discovery (`sniffing => false`), which is what happens with your cloud provider when you set the option to `true`.

Could you please clarify, from your experience, whether putting HAProxy in front of Elasticsearch is a good practice or not?

Yes, having a load balancer in front of ES is definitely a good idea.

LdiCO
  • Hi. I don't work with Logstash or the ELK stack, but with the Apache Spark Elasticsearch Hadoop connector. Could you please update your answer, especially regarding `es.nodes.wan.only` having to be set to `true` when HAProxy is used (and the documentation saying that `performance is highly affected`)? Thank you – Klun Oct 03 '21 at 13:04