Questions tagged [elasticsearch-spark]

37 questions
0
votes
1 answer

How to transform array of JSONs to rows before writeStream to Elasticsearch?

Follow-up to this question: I have JSON streaming data in the same format as below | A | B | |-------|------------------------------------------| | ABC | [{C:1, D:1}, {C:2, D:4}] | | XYZ …
0
votes
2 answers

Exception-"network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance"

I have tried running a Spark application that integrates HBase and ES. I tried creating the index in ES and storing the data from HBase, but received the error "the user is unauthorized or access denied" when connecting to the ES server. I…
pavan
  • 43
  • 1
  • 6
0
votes
1 answer

Retrieve metrics from elasticsearch-spark

At the end of an ETL Cascading job, I extract metrics about the Elasticsearch ingestion using the metrics that elasticsearch-hadoop exposes through Hadoop counters. I want to do the same using Spark, but I can't find documentation related to…
0
votes
2 answers

Merge documents in elasticsearch hadoop, create multi key value pairs using es-sparksql

Currently elasticsearch hadoop converts a dataset/RDD to documents with a 1-to-1 mapping, i.e. one row in the dataset is converted to one doc. In our scenario we are doing something like this for 'uni PUT…
rohit
  • 862
  • 12
  • 26
0
votes
1 answer

Read from Elasticsearch with Spark getting precise fields

I'm very new to Elasticsearch: I am trying to read data from an index using Spark in Java. I have a working piece of code, but it returns the document inside a Dataset whose columns are only the two "root" elements of the doc, while all the…
ercaran
  • 23
  • 1
  • 1
  • 8
0
votes
1 answer

Elasticsearch hadoop configure bulk batch size

I read, possibly on Stack Overflow, that the es-hadoop / es-spark projects use bulk indexing. If so, is the default batch size the same as BulkProcessor's (5 MB)? Is there any configuration to change this? I am using…
rohit
  • 862
  • 12
  • 26
0
votes
1 answer

Elasticsearch 5.0 and Elasticsearch-Spark connector - what is the correct Maven artefact

When writing an application to run on Apache Spark 1.6 using the Elasticsearch-Spark connector, the documentation at (https://www.elastic.co/guide/en/elasticsearch/hadoop/5.0/install.html#_minimalistic_binaries) says to use the Maven artefact
Vladimir Kroz
  • 5,237
  • 6
  • 39
  • 50