
I need to join two RDDs from two different ES clusters, but it seems I can only create one SparkConf and SparkContext, tied to a single ES cluster. For example:

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

val sparkConf = new SparkConf().set("es.nodes", "192.168.0.22:9200")
val sc = new SparkContext(sparkConf)
val rdd1 = sc.esRDD("userIndex1/type1")

So how can I create two RDDs from different ES clusters?

Jack

1 Answer


There is a cfg parameter for esRDD. You can use val rdd1 = sc.esRDD("userIndex1/type1", Map("es.nodes" -> "192.168.0.22:9200")) to set the configuration per RDD.
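For example, a minimal sketch of reading from two clusters and joining the results (the second cluster's address and index name are made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._

val sc = new SparkContext(new SparkConf().setAppName("two-es-clusters"))

// The per-call cfg map overrides the global SparkConf settings, so each
// RDD can point at a different cluster.
val rdd1 = sc.esRDD("userIndex1/type1", Map("es.nodes" -> "192.168.0.22:9200"))
val rdd2 = sc.esRDD("userIndex2/type2", Map("es.nodes" -> "192.168.0.23:9200"))

// esRDD yields (documentId, fieldMap) pairs, so a plain join keys on the
// document _id.
val joined = rdd1.join(rdd2)
joined.take(5).foreach(println)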

zsxwing
  • 20,270
  • 4
  • 37
  • 59
  • Great! I didn't find it in the official docs. How did you find out about it, by reading the source code? – Jack May 24 '16 at 21:15
  • Yep, just took a look at `esRDD` here: https://github.com/elastic/elasticsearch-hadoop/blob/master/spark/core/main/scala/org/elasticsearch/spark/rdd/EsSpark.scala#L23 – zsxwing May 24 '16 at 22:12
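For reference, the overload in question in the linked EsSpark source looks roughly like this (sketched from memory of that file; treat the exact types as approximate):

// org.elasticsearch.spark.rdd.EsSpark: the cfg map is merged over the
// SparkConf-level "es.*" settings for this RDD only.
def esRDD(sc: SparkContext, resource: String, cfg: Map[String, String]): RDD[(String, Map[String, AnyRef])]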