Questions tagged [elasticsearch-hadoop]

Elasticsearch real-time search and analytics natively integrated with Hadoop. Supports Map/Reduce, Cascading, Apache Hive, Apache Pig, Apache Spark and Apache Storm.

Requirements

An Elasticsearch cluster (0.9X series, or 1.0.0 or higher, which is highly recommended) accessible through REST. That's it! Significant effort has been invested to create a small, self-contained jar that can be downloaded and put to use without any dependencies. Simply make it available to your job classpath and you're set. For requirements specific to a particular library, see that library's dedicated chapter.

109 questions

0 answers

Unable to create external table in elasticsearch using es-hadoop

I am running a simple spark-submit job, e.g.: spark-submit --class com.x.y.z.logan /home/test/spark/sample.jar table in jar file hiveContext.sql("CREATE TABLE IF NOT EXISTS databasename.tablename(es_column_name STRING)…

asked Jan 25 '17 at 07:42

Ganga Dhar

1 answer

Pyspark converting rdd to dataframe with nulls

I am using pyspark (1.6) and elasticsearch-hadoop (5.1.1). I am reading my data from Elasticsearch into an RDD via: es_rdd = sc.newAPIHadoopRDD( …

python pyspark elasticsearch-hadoop

asked Jan 13 '17 at 12:02

wrdeman

1 answer

How to fix an error when an empty string is being written to Elasticsearch from an Apache Spark job?

An exception is thrown when I execute my Scala app, which calls myRDD.saveToEs (I also tried saveToEs from a DataFrame). My ES version is 2.3.5. I am using Spark 1.5.0 so maybe there is a way to configure this in the…

scala elasticsearch apache-spark elasticsearch-hadoop

asked Aug 23 '16 at 02:43

ZeroGraviti

1 answer

Spark Web UI "take at SerDeUtil.scala:201" interpretation

I am creating a Spark RDD by loading data from Elasticsearch using the elasticsearch-hadoop connector in python (importing pyspark) as: es_cluster_read_conf = { "es.nodes" : "XXX", "es.port" : "XXX", "es.resource" :…

apache-spark pyspark elasticsearch-hadoop

asked Jul 29 '16 at 18:10

Manav Garg

1 answer

How to search multiple indices using elasticsearch hadoop

Suppose the following scenario: we have the indices index-1, index-2 and index-4. For some reason 'index-3' is missing, but I don't know that at search time, so I'd like to search an index pattern like "index-1,index-2,index-3,index-4", in…

hadoop elasticsearch elasticsearch-hadoop

asked Jun 04 '16 at 10:59

Qichu Gong

1 answer

How to set the es.nodes parameter to multiple Elasticsearch nodes for Spark?

I want to read data from multiple Elasticsearch nodes into Spark. I prefer to use the "es.nodes" parameter and set "es.nodes.discovery" to false. The configuration parameters are described here. I tried to find an example of how to set…

elasticsearch apache-spark elasticsearch-hadoop

asked May 23 '16 at 08:34

ZianyD
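
For the es.nodes question above, a minimal sketch of how such a configuration might be assembled in Python. It assumes es.nodes takes a comma-separated list of hosts, and that setting es.nodes.discovery to "false" is meant to stop the connector from discovering further nodes at start-up. The host names, the index/type used for es.resource, and the helper build_es_read_conf are hypothetical and exist only for illustration.

```python
def build_es_read_conf(hosts, port, resource):
    """Build an elasticsearch-hadoop style configuration dict for a fixed node list.

    hosts, port and resource are illustrative inputs; all values are kept as
    strings because Hadoop-style configurations are string key/value pairs.
    """
    if not hosts:
        raise ValueError("at least one Elasticsearch host is required")
    return {
        "es.nodes": ",".join(hosts),    # assumption: comma-separated host list
        "es.port": str(port),
        "es.nodes.discovery": "false",  # rely on the listed nodes at start-up
        "es.resource": resource,
    }


# Hypothetical host names and index/type, for illustration only.
conf = build_es_read_conf(["es-node-1", "es-node-2"], 9200, "my-index/my-type")
print(conf["es.nodes"])  # prints: es-node-1,es-node-2
```

A dict of this shape resembles the es_cluster_read_conf seen in another question on this page, which is handed to the connector as its read configuration. Whether discovery should be disabled depends on the cluster's network layout.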

1 answer

Mapping field names of the output from Spark Streaming to Elasticsearch

I am using the following code to store the output of Spark Streaming in Elasticsearch. I want to map the output of Spark Streaming to the proper names, i.e. (Key, OsName, PlatFormName, Mobile, BrowserName, Count). But as you can see currently it is…

apache-spark spark-streaming elasticsearch-hadoop

asked May 19 '16 at 07:33

Naresh

0 answers

Truncate Elasticsearch Hive tables

I am using Elasticsearch Hive integration, so that I can query from Hadoop tables, sending alerts when data is bad (with ElastAlert), as well as display on Kibana. This is how I created the Elastic table: CREATE EXTERNAL TABLE my_elastic_table ( …

elasticsearch hive elasticsearch-hadoop

asked Apr 27 '16 at 21:01

yuan0122

1 answer

Issue when writing to elasticsearch using es-hadoop

I am getting this exception when trying to write to Elasticsearch using a MapReduce program with es-hadoop. I am trying to write to index=employee and type=basic, which already exist in my Elasticsearch cluster. My stack trace: Exception in thread…

hadoop elasticsearch elasticsearch-hadoop

asked Apr 13 '16 at 07:29

Sachin

1 answer

Is there a way to apply multiple groupings in Storm?

I want to apply "Fields grouping" as well as "Local or shuffle grouping" to my topology such that each spout sends data to local bolts only but also uses a field in my document to decide which local bolts it should go to. So if there were two worker…

apache-storm elasticsearch-hadoop

asked Apr 02 '16 at 01:33

user2250246

0 answers

Bind Elasticsearch to localhost as well as an IP address

The modules-network page in the Elasticsearch documentation says that it can bind to more than one network address by specifying an array of IP addresses in network.bind_host. I put the following in my config/elasticsearch.yaml: # Used a real IP address in the…

elasticsearch apache-storm elasticsearch-hadoop

asked Mar 19 '16 at 02:39

user2250246

1 answer

FAILED: SemanticException Cannot find class 'org.elasticsearch.hadoop.hive.ESStorageHandler'

I am following the example at https://gist.github.com/costin/8025827 and am not sure why I am getting this error. Any response is highly appreciated. hive> ADD JAR hdfs:///auxlib/elasticsearch-hadoop-2.2.0.jar ; converting to…

elasticsearch hive hadoop2 elasticsearch-hadoop

asked Mar 09 '16 at 01:15

Fastdata Bigdata

1 answer

Writing JSON from HDFS to Elasticsearch using elasticsearch-hadoop map-reduce

We have some JSON data stored in HDFS and we are trying to use elasticsearch-hadoop map-reduce to ingest the data into Elasticsearch. The code we used is very simple (below): public class TestOneFileJob extends Configured implements Tool { public…

java hadoop elasticsearch mapreduce elasticsearch-hadoop

asked Dec 01 '15 at 11:46

Fanooos

1 answer

How to index JSON into Elasticsearch using Hadoop map-reduce and es-hadoop?

I have a huge set of data stored in HDFS which we want to index into Elasticsearch. The obvious approach is to use the elasticsearch-hadoop library. I followed the concept in this video and here is the code I wrote for this job. public class…

json hadoop elasticsearch mapreduce elasticsearch-hadoop

asked Nov 23 '15 at 08:05

Fanooos

1 answer

Spark machine learning and Elasticsearch analyzed tokens/text in Python

I'm trying to build an application that indexes a bunch of documents in Elasticsearch and retrieves the documents through Boolean queries into Spark for machine learning. I'm trying to do all of this in Python, using PySpark and…

elasticsearch apache-spark elasticsearch-hadoop elasticsearch-py

asked Aug 24 '15 at 23:04

plam
