Questions tagged [elasticsearch-hadoop]

Elasticsearch real-time search and analytics natively integrated with Hadoop. Supports Map/Reduce, Cascading, Apache Hive, Apache Pig, Apache Spark and Apache Storm.

Requirements

An Elasticsearch cluster (0.9X series, or 1.0.0 or higher, which is highly recommended) accessible through REST. That's it! Significant effort has been invested to create a small, self-contained jar that can be downloaded and put to use without any dependencies. Simply make it available to your job classpath and you're set. For requirements specific to a given library, see the dedicated chapter.
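As a rough illustration of "accessible through REST", the sketch below builds the connection settings a job would hand to the connector. The helper name `es_options` and the host `localhost` are hypothetical, made up for this example; `es.nodes`, `es.port` and `es.nodes.wan.only` are connector settings, and the commented PySpark call assumes the connector jar is already on the job classpath.

```python
def es_options(nodes, port=9200, wan_only=False):
    """Build connector settings as a string-to-string map.

    Hypothetical helper for illustration only; it is not part of
    elasticsearch-hadoop itself.
    """
    return {
        "es.nodes": ",".join(nodes),                 # REST endpoints of the cluster
        "es.port": str(port),                        # REST port
        "es.nodes.wan.only": str(wan_only).lower(),  # "true" for cloud/WAN setups
    }


opts = es_options(["localhost"])
print(opts)

# With PySpark available and the connector jar on the classpath (not run here):
# df.write.format("org.elasticsearch.spark.sql").options(**opts).save("db/test")
```

The settings are kept as plain strings because Spark passes data source options as string key/value pairs.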

Documentation

109 questions

votes

0 answers

Elasticsearch-Hadoop connector for Spark Dataframe

I am trying to write a spark dataframe to Elasticsearch as follows: df.write.format("es").save("db/test") Unfortunately, I receive the following error: Py4JJavaError: An error occurred while calling o50.save.: org.apache.spark.SparkException: Job…

apache-spark-sql elasticsearch-hadoop

asked Jul 17 '17 at 20:41

Stijn

votes

2 answers

Indexing tuples from storm to elasticsearch with elasticsearch-hadoop library does not work

I want to index documents into Elasticsearch from Storm, but I couldn't get any document to be indexed into Elasticsearch. In my topology I have a KafkaSpout that emits a JSON document like this { "tweetId": 1, "text": "hello" } to an EsBolt that is a native…

elasticsearch apache-storm elasticsearch-hadoop

asked Apr 16 '16 at 09:03

Emanuel Barac

votes

1 answer

What is the Best way to insert Entries into ElasticSearch?

I am new to ElasticSearch and I have a file of 180 fields and 12 million lines. I have created an index and type in ElasticSearch and a Java program, but it takes 1.5 hrs. Is there a better way to load data into ElasticSearch with reduced…

java elasticsearch elasticsearch-hadoop

asked Jan 11 '16 at 14:00

Jerin J

votes

1 answer

Build failure while building a project using ElasticSearch-Hadoop

I am unable to build a Java project which uses ElasticSearch-Hadoop. This is the error that I am seeing when I try to build my project: Scanning for projects... ------------------------------------------------------------------------ Building…

java maven hadoop elasticsearch elasticsearch-hadoop

asked Sep 08 '14 at 06:19

user1882391

votes

1 answer

HiveServer2 unable to load EsStorageHandler class from elasticsearch-hadoop

I have this configuration in hive-site.xml: hive.aux.jars.path /path/to/elasticsearch-hadoop-2.0.1.jar. When I map data to Elasticsearch in HiveCli, it works correctly with this code: CREATE…

java hadoop elasticsearch hive elasticsearch-hadoop

asked Aug 27 '14 at 09:38

thanhtien

votes

1 answer

Elasticsearch best practices: is it a good idea to implement HAProxy in front of Elasticsearch 7?

In the Elasticsearch Spark/Hadoop documentation, I can read the following option: es.nodes.wan.only (default: false) Whether the connector is used against an Elasticsearch instance in a cloud/restricted environment over the WAN, such as Amazon…

elasticsearch elasticsearch-hadoop

asked Sep 24 '21 at 20:13

Klun

votes

0 answers

Spark UI stuck while attempting to create Dynamic Dataframes

I am using Spark (2.2.0) with Elasticsearch Hadoop (7.6.0). The purpose of my Spark job is to process records from a parquet file, and append it by unique to documents already present in Elasticsearch. Since Elasticsearch doesn't support updates, the…

scala apache-spark elasticsearch apache-spark-sql elasticsearch-hadoop

asked May 21 '20 at 20:38

gunj_desai

votes

1 answer

Data load from HDFS to ES taking very long time

I have created an external table in Hive and need to move the data to ES (2 nodes, each with 1 TB). The regular query below is taking a very long time (more than 6 hours) for a source table with 9 GB of data. INSERT INTO TABLE…

elasticsearch hive query-optimization elasticsearch-hadoop

asked Mar 12 '19 at 17:16

RAVITEJA SATYAVADA

votes

1 answer

How to reindex data from one Elasticsearch cluster to another with elasticsearch-hadoop in Spark

I have two separate Elasticsearch clusters and want to reindex the data from the first cluster to the second, but I found that I can only set up one Elasticsearch cluster inside the SparkContext configuration, such as: var sparkConf : SparkConf =…

scala elasticsearch apache-spark apache-spark-sql elasticsearch-hadoop

asked Oct 29 '16 at 02:36

Jack

votes

1 answer

Apache Spark: JOINing RDDs (data sets) using custom criteria/fuzzy matching

Is it possible to join two (Pair)RDDs (or Datasets/DataFrames) (on multiple fields) using some "custom criteria"/fuzzy matching, e.g. range/interval for numbers or dates and various "distance methods", e.g. Levenshtein, for strings? For "grouping"…

java apache-spark levenshtein-distance fuzzy-comparison elasticsearch-hadoop

asked Sep 01 '16 at 12:18

Morten Garbøl Franck

votes

1 answer

Spark (Java) to Elasticsearch

I am testing loading data from a CSV into Spark and then saving it to Elasticsearch, but I am having some trouble saving my RDD collection to Elasticsearch using Spark. This error is raised when submitting the job: Exception in thread "main"…

java maven elasticsearch apache-spark elasticsearch-hadoop

asked Jun 30 '16 at 09:08

kulssaka

votes

2 answers

Spark-Cassandra Vs Spark-Elasticsearch

I have been using Elasticsearch for quite some time now and have a little experience using Cassandra. Now, I have a project where we want to use Spark to process the data, but I need to decide if we should use Cassandra or Elasticsearch as the datastore to load my…

apache-spark elasticsearch cassandra spark-cassandra-connector elasticsearch-hadoop

asked Aug 28 '15 at 20:53

Philip K. Adetiloye

votes

2 answers

Is a JOIN operation possible in ElasticSearch using any ES connector for Presto or Hive (ElasticSearch-Hadoop)?

Since a JOIN operation across indices is not possible in ElasticSearch, can it be achieved using Presto or Hive, i.e. can we do a JOIN operation using any ElasticSearch connector for Presto or Hive? Can we do JOIN in ElasticSearch using…

join elasticsearch hive presto elasticsearch-hadoop

asked May 31 '15 at 11:41

sumanth232

votes

1 answer

Writing Hadoop reduce output to Elasticsearch

I'm having a bit of trouble understanding how to write the output of a simple Hadoop job back into Elasticsearch. Job is configured…

java hadoop elasticsearch elasticsearch-hadoop

asked Aug 15 '14 at 15:11

Eddy

vote

0 answers

es.read.source.filter vs. es.read.field.include when reading data with elasticsearch-hadoop

When reading data from Elasticsearch with elasticsearch-hadoop, there are two options to specify how to read a subset of fields from the source, according to the official documentation, i.e., es.read.field.include: Fields/properties that are parsed…

apache-spark elasticsearch pyspark elasticsearch-hadoop

asked Dec 28 '21 at 09:46

Gary Wang
