Questions tagged [elasticsearch-hadoop]

Elasticsearch real-time search and analytics natively integrated with Hadoop. Supports Map/Reduce, Cascading, Apache Hive, Apache Pig, Apache Spark and Apache Storm.

Requirements

An Elasticsearch cluster (0.9X series, or 1.0.0 or higher, which is highly recommended) accessible through REST. That's it! Significant effort has been invested to create a small, self-contained jar that can be downloaded and put to use without any dependencies. Simply make it available to your job classpath and you're set. For library-specific requirements, see the dedicated chapter.

Documentation

109 questions

vote

2 answers

Spark Runtime Error - ClassDefNotFound: SparkConf

After installing and building Apache Spark (albeit with quite a few warnings), the compilation of our Spark application (using "sbt package") completes successfully. However, when trying to run our application using the spark-submit script, a…

asked Jul 01 '15 at 21:15

kgrimes2

vote

0 answers

Jackson error in ElasticSearch Hadoop while loading data to ElasticSearch

I am trying to load data from HDFS to ElasticSearch using elasticsearch-hadoop version elasticsearch-hadoop-2.1.0.Beta3.jar. There was a bug on Mapr: https://github.com/elastic/elasticsearch-hadoop/issues/215 which was supposed to fix the jackson…

jackson hive apache-pig elasticsearch-hadoop

asked Apr 17 '15 at 12:56

Sibish

vote

1 answer

Elasticsearch-Hadoop get Non-indexed data

I have an Elasticsearch cluster that holds a large amount of data. I want to extract all of it from Elasticsearch into Hadoop (Hive). I used the Elasticsearch-Hadoop driver to extract data from Elasticsearch through a Hive external table, but it is too…

hadoop elasticsearch hadoop-streaming elastic-map-reduce elasticsearch-hadoop

asked Mar 13 '15 at 15:45

Yusuf Can Gürkan

votes

0 answers

ElasticSearchHadoop throwing unauthorised exception

We are upgrading from Elastic 6.3 to 7.8. We are using elasticsearch-hadoop to upload data to an Elastic index using Scala Spark. We are getting an unauthorized exception while uploading the data. The same code works fine with version 6.3. The…

elasticsearch-hadoop

asked May 29 '23 at 13:52

Sunil Sharma

votes

0 answers

An error occurred when using hive to query the es

I created a Hive external table to query existing data in ES, as shown below: CREATE EXTERNAL TABLE ods_es_data_inc (`agent_id` STRING, `dt_server_time` TIMESTAMP ) COMMENT 'bb_i_app' STORED BY…

elasticsearch hadoop hive elasticsearch-hadoop

asked Feb 14 '23 at 08:59

Sam

votes

0 answers

How to save pyspark DataFrame to Elasticsearch (Running on Docker) using elasticsearch-hadoop

I am trying to write a pyspark DataFrame to an Elasticsearch instance running on Docker. I am unable to successfully connect to the Elasticsearch instance using elasticsearch-hadoop. When I try to save the DataFrame, I get an error that…

elasticsearch pyspark elasticsearch-hadoop elasticsearch-spark

asked Dec 22 '22 at 00:22

mondal.alex

votes

0 answers

elasticsearch hadoop cannot parse value [] for field

There is a double field in the index I use that is empty. When I read the index with elasticsearch-spark-30_2.12-7.17.2.jar, the exception EsHadoopParsingException: Cannot parse value [] for field [X] is thrown, but when I replace the…

apache-spark elasticsearch-hadoop

asked Nov 30 '22 at 07:35

lucien

votes

1 answer

ElasticSearch hive SerializationError handler

Using Elastic search version 6.8.0 hive> select * from…

elasticsearch hadoop hive elasticsearch-hadoop

asked Nov 01 '22 at 14:38

Syed Rafi

votes

1 answer

Hive to Elastic search ingestion issues

Using Elasticsearch version 6.8.0. The complete Hive job fails because of a single malformed JSON record. I tried changing 'es.write.rest.error.handler.es.return.default'='PASS/HANDLED', but no luck. Refer:…

elasticsearch hive elasticsearch-hadoop

asked Oct 28 '22 at 08:16

Syed Rafi

votes

1 answer

Reading an Elasticsearch Index from PySpark

Could anyone tell me why this test script for PySpark errors out? (python 3.6.8, hadoop 3.3.1, spark 3.2.1, elasticsearch-hadoop 7.14) from pyspark.sql import SparkSession, SQLContext myspark = SparkSession.builder \ .appName("My test.") \ …

apache-spark elasticsearch pyspark elasticsearch-hadoop

asked May 18 '22 at 16:26

Antinomial

votes

1 answer

EsHadoopIllegalArgumentException: invalid map received dynamic=strict errors on elasticsearch-hadoop

Trying with both the DataFrame API and the RDD API: val map = collection.mutable.Map[String, String]() map("es.nodes.wan.only") = "true" map("es.port") = "redacted" map("es.net.http.auth.user") = "redacted" map("es.net.http.auth.pass") =…

apache-spark elasticsearch elasticsearch-hadoop

asked Dec 29 '21 at 14:22

alonisser

votes

1 answer

Spark 3.0 scala.None$ is not a valid external type for schema of string

While using the elasticsearch-hadoop library to read an Elasticsearch index with an empty attribute, I get the exception Caused by: java.lang.RuntimeException: scala.None$ is not a valid external type for schema of string. There is an open defect on GitHub…

scala apache-spark elasticsearch elasticsearch-hadoop

asked Apr 30 '21 at 05:45

Shivaji Mutkule

votes

1 answer

Invalid timestamp when reading Elasticsearch records with Spark

I'm getting an invalid timestamp when reading Elasticsearch records using Spark with the elasticsearch-hadoop library. I'm using the following Spark code to read the records: val sc = spark.sqlContext val elasticFields = Seq( "start_time", "action", …

apache-spark elasticsearch elasticsearch-hadoop

asked Jan 23 '21 at 11:22

Jacfal

votes

3 answers

Elasticsearch pyspark connection in insecure mode

My end goal is to insert data from HDFS into Elasticsearch, but the issue I am facing is connectivity. I am able to connect to my Elasticsearch node using the curl command below: curl -u username -X GET https://xx.xxx.xx.xxx:9200/_cat/indices?v'…

apache-spark elasticsearch curl pyspark elasticsearch-hadoop

asked Aug 10 '20 at 11:57

Ayush Goyal

votes

1 answer

How to write dataframe with struct column into Elasticsearch via PySpark

I'm trying to write a DataFrame containing a struct column into Elasticsearch: df1 = spark.createDataFrame([{"date": "2020.04.10","approach": "test", "outlier_score": 1, "a":"1","b":2}, {"date": "2020.04.10","approach": "test",…

elasticsearch pyspark elasticsearch-hadoop elasticsearch-spark

asked May 11 '20 at 13:09

Andrey Sapegin
