Questions tagged [elastic-map-reduce]

Amazon Elastic MapReduce is a web service that enables the processing of large amounts of data.

Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).

http://aws.amazon.com/elasticmapreduce/

Synonymous tag: emr

452 questions

1 answer

How to enable AWS EMR CloudTrail logging?

We have a team-shared AWS account, and sometimes things are hard to debug. In particular, throttling of the EMR APIs happens regularly, so it would be nice to have CloudTrail logs show who is overusing EMR. I think our CloudTrail…

amazon-web-services elastic-map-reduce amazon-cloudtrail

asked Jun 06 '18 at 01:46

Pen2

1 answer

NullPointerException in ObjectMapper in Spark Cluster Mode on AWS EMR

I am getting a NullPointerException on this line when running Spark in cluster mode (YARN) on AWS EMR, but it runs fine in client mode (with master as local): Map json = (Map) mapper.readValue(line, Map.class); This is the…

java apache-spark jackson hadoop-yarn elastic-map-reduce

asked Apr 17 '18 at 12:44

blancVector

3 answers

DUMP command in PIG not working

I wrote a simple Pig program as follows to analyze a small, modified version of the Google n-grams dataset on AWS. The data looks something like this: I am 1936 942 90 I am 1945 811 5 I am 1951 47 12 very cool 1923 118 10 very cool 1980 320…

amazon-web-services hadoop apache-pig elastic-map-reduce

asked Mar 28 '18 at 08:12

thegreatcoder

2,173 reputation, badges: 3 gold, 19 silver, 28 bronze

1 answer

AWS EMR script-runner access error

I'm running emr-5.12.0, with Amazon 2.8.3, Hive 2.3.2, Hue 4.1.0, Livy 0.4.0, Spark 2.2.1 and Zeppelin 0.7.3 on 1 m4.large as my master node and 1 m4.large as core node. I am trying to execute a bootstrap action that configures some parts of the…

amazon-web-services apache-spark emr elastic-map-reduce

asked Mar 15 '18 at 14:45

EspenThaem

1 answer

AWS EMR - Hive creating new table in S3 results in AmazonS3Exception: Bad Request

I have a Hive script I'm running in EMR that is creating a partitioned Parquet table in S3 from a ~40GB gzipped CSV file also stored in S3. The script runs fine for about 4 hours but reaches a point (pretty sure when it is just about done creating…

amazon-web-services amazon-s3 hive amazon-emr elastic-map-reduce

asked Feb 26 '18 at 19:14

Marty

2,104 reputation, badges: 2 gold, 23 silver, 42 bronze

1 answer

Getting list of EMR Release labels via Amazon API

I need to retrieve the list of available EMR release labels in order to run my Java application, which starts an EC2 instance and executes a Hadoop job. The main problem here is that EMR release labels are specific to each region and I need to get this…

hadoop amazon-ec2 amazon-emr elastic-map-reduce

asked Jan 19 '18 at 16:56

John Doe

1 answer

Parquet Data Ingestion in Druid Error in Timestamp parsing using Joda

Context: I am able to submit a MapReduce job from the Druid overlord to EMR. My data source is in S3 in Parquet format. The timestamp field value is in the format "2017-09-01 21:14:11:552 IST". The error occurs while parsing the timestamp. Issue stack trace is:…

java jodatime elastic-map-reduce druid

asked Jan 19 '18 at 07:35

Shiva Achari

1 answer

How to subtract in Map Reduce paradigm

I have the following dataset:

s1, s2, count
1, 2, x1
1, 3, x2
1, 4, x3
2, 1, y1
2, 3, y2
2, 4, y3
3, 1, z1
3, 2, z2

I want to get the following output:

s1, s2, count
1, 2, x1-y1
1, 3, x2-z1
1, 4, x3
2, 3, y2-z2
2, 4, y3

The idea is that s1 is an…

hadoop mapreduce distributed-computing emr elastic-map-reduce

asked Oct 25 '17 at 15:40

Vikram Garg

1,329 reputation, badges: 1 gold, 8 silver, 8 bronze

1 answer

Query DynamoDB Data with EMR

I am looking for a way to query AWS DynamoDB data with SQL syntax using Amazon EMR. I have my DynamoDB table set up and ready. How can I import/query the data using Hue? The table in DynamoDB has a size of around 8GB.

amazon-web-services amazon-dynamodb elastic-map-reduce

asked Oct 12 '17 at 16:15

Hendrik

4,849 reputation, badges: 7 gold, 46 silver, 51 bronze

1 answer

Multiple Filtering in PySpark

I have imported a data set into a Jupyter notebook / PySpark to process through EMR, for example: data sample. I want to clean up the data with the filter function before using it. This includes: Removing rows that are blank or '0' or NA cost or…

python pyspark elastic-map-reduce

asked Oct 05 '17 at 13:42

lseactuary

1 answer

Lower case response from Elasticsearch whereas upper case is expected

I am trying to fetch data from Elasticsearch with Java using the method .addAggregation(terms(term)). The JSON response that I am expecting is { "key" : "TEST" }, but I am getting the response as { "key" : "test" }, which is in lower case, I…

elasticsearch elasticsearch-plugin elastic-map-reduce

asked Sep 27 '17 at 05:39

user2681668

0 answers

Identical code works in pyspark shell but not via spark-submit

So I have a PySpark project with the following structure:
main.py: doing the real stuff (imports PySpark UDFs from utils.py and stuff from common.py)
utils.py: some utility functions (imports from common.py)
common.py: some params
Inside a Pyspark…

apache-spark pyspark elastic-map-reduce

asked Mar 09 '17 at 03:49

Rex911

2 answers

Elasticsearch to query across multiple indices and multiple types

I am a newbie to Elasticsearch. I am using an AWS Elasticsearch instance, version 5.1.1. I have a requirement where I need to specify multiple indices and types in the request body of an Elasticsearch search operation. Is it possible? What is the simplest way to…

elasticsearch elastic-map-reduce spring-data-elasticsearch

asked Mar 06 '17 at 18:41

SSG

1,265 reputation, badges: 2 gold, 17 silver, 29 bronze

1 answer

AWS Elasticsearch: URL encoding for search across multiple indices and types

I am using AWS Elasticsearch with AWS Signature V4 to communicate with the instance. Simple queries to create/search indexes are working fine. But I want functionality that lets me search across multiple indices and…

amazon-web-services elasticsearch elastic-map-reduce

asked Mar 06 '17 at 13:25

MMT

0 answers

I want to perform partial match or exact match in Elasticsearch

Suppose we have two entries for the index "Phones":
1] iphone 6
2] iphone 7
1] If I search for "iphone 6", an exact match will return one record.
2] If I search for "iphone", a partial match will return both records.
So I want to toggle between the above methods based on…

elasticsearch elastic-map-reduce

asked Mar 01 '17 at 08:57

MMT
