Questions tagged [hadoop-partitioning]

Hadoop partitioning deals with questions about how Hadoop decides which key/value pairs are sent to which reducer (partition).

339 questions
0
votes
1 answer

Hadoop Map task/Map object

In theory, the following properties define the number of map/reduce task slots on a data node: mapred.tasktracker.map.tasks.maximum | mapred.map.tasks. Also, the number of mapper objects is decided by the number of input splits in the MapReduce job. We…
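A note on this entry: mapred.tasktracker.map.tasks.maximum is a per-node slot limit read by the TaskTracker, while mapred.map.tasks is only a hint to the framework; the real mapper count is the number of input splits. A minimal old-API driver sketch (class name and path arguments are hypothetical):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class MapTaskHint {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(MapTaskHint.class);
            conf.setJobName("map-task-hint");
            // Equivalent to setting mapred.map.tasks: a hint only. The
            // framework derives the actual number of map tasks from the
            // number of input splits computed at submission time.
            conf.setNumMapTasks(10);
            // mapred.tasktracker.map.tasks.maximum, by contrast, lives in the
            // TaskTracker's mapred-site.xml and cannot be set per job.
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));
            JobClient.runJob(conf); // identity map/reduce by default
        }
    }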
0
votes
1 answer

HDInsight Azure Blob Storage Data Update

I am considering HDInsight with Hive and data loaded on Azure Blob Storage. There is a combination of both historic and changing data. Does the solution mentioned in Update, SET option in Hive work with blob storage too? The Hive statement below…
0
votes
1 answer

Hadoop MapReduce partitioner not invoked

I need help with a MapReduce job: my custom partitioner is never invoked. I have checked everything a million times, but no result. It used to work a while ago; I have no idea why it doesn't now. Any help would be much appreciated. I am adding the code (It…
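For context on this entry, the two classic causes: the partitioner was never registered on the job, or the job runs with fewer than two reducers (with 0 or 1 reduce tasks the partitioner is skipped entirely). A minimal sketch, with hypothetical key/value types and routing rule:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            // Keys starting with 'a'..'m' go to partition 0, the rest to 1.
            String s = key.toString();
            char c = s.isEmpty() ? 'z' : Character.toLowerCase(s.charAt(0));
            return (c <= 'm' ? 0 : 1) % numPartitions;
        }
    }

    // In the driver, both lines matter; with numReduceTasks <= 1 the
    // partitioner is never invoked:
    //   job.setPartitionerClass(FirstLetterPartitioner.class);
    //   job.setNumReduceTasks(2);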
0
votes
3 answers

Provide map splits with splits of the same file

How can I provide each line of a file fed to the mapper with splits of the same file? Basically, what I want to do is: for each line in file-split { for each line in file { //process } } Can I do this using MapReduce in…
Nitin J
  • 78
  • 1
  • 2
  • 9
0
votes
1 answer

Using Hive to select data within large range partitions

I've run into a problem using Hive to select data within large range partitions. Here's the HQL I want to execute: INSERT OVERWRITE TABLE summary_T partition(DateRange='20131222-20131228') select col1, col2, col3 From RAW_TABLE where cdate…
Dennis Shen
  • 61
  • 1
  • 6
0
votes
1 answer

When do two different keys go to the same reducer under the default hash partitioner in Hadoop?

As we know, Hadoop guarantees that identical keys coming from different mappers will be sent to the same reducer. But if two different keys have the same hash value, they will definitely go to the same reducer; so will they be sent to the… (see the sketch after this entry)
Judking
  • 6,111
  • 11
  • 55
  • 84
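The short answer to the entry above: keys that collide modulo the reducer count land on the same reducer, but each distinct key still gets its own reduce() call, because grouping compares the keys themselves, not their hashes. Hadoop's default partitioner is essentially this:

    import org.apache.hadoop.mapreduce.Partitioner;

    // What org.apache.hadoop.mapreduce.lib.partition.HashPartitioner does:
    public class HashPartitioner<K, V> extends Partitioner<K, V> {
        public int getPartition(K key, V value, int numReduceTasks) {
            // Mask the sign bit so the result is non-negative, then take the
            // remainder; different keys with equal remainders share a reducer.
            return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
        }
    }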
0
votes
2 answers

Partitioner or MultipleOutputs

I would like to have your opinion regarding Partitioner vs. MultipleOutputs. Suppose I have a file which contains keys such as 0:aaa 1:bbb 0:ccc 0:ddd ... 1:zzz I would like to have 2 files: one file containing keys starting with 0: and the…
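One way to get the two files asked about in this entry is MultipleOutputs in the reducer (the alternative is a two-reducer Partitioner, which yields exactly one part file per prefix). A minimal sketch; the named outputs "zeros" and "ones" are hypothetical and must be registered in the driver with MultipleOutputs.addNamedOutput(...):

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class SplitByPrefixReducer
            extends Reducer<Text, Text, NullWritable, Text> {
        private MultipleOutputs<NullWritable, Text> mos;

        @Override
        protected void setup(Context ctx) {
            mos = new MultipleOutputs<>(ctx);
        }

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context ctx)
                throws IOException, InterruptedException {
            // Route records by key prefix into the two named outputs.
            String name = key.toString().startsWith("0:") ? "zeros" : "ones";
            for (Text v : values) {
                mos.write(name, NullWritable.get(), v);
            }
        }

        @Override
        protected void cleanup(Context ctx)
                throws IOException, InterruptedException {
            mos.close(); // flush the extra output files
        }
    }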
0
votes
3 answers

How to group by data from Hive with a specific partition?

I have the following: hive> show partitions TABLENAME pt=2012.07.28.08 pt=2012.07.28.09 …
user2935539
  • 73
  • 2
  • 6
0
votes
0 answers

Hadoop disk usage (intermediate reduce)

I'm new to Hadoop; I'm using a cluster and I have a disk quota of 15GB. If I try to execute the wordcount sample on a big dataset (about 25GB), I always receive the exception "The DiskSpace quota of xxxx is exceeded: ". I checked my disk usage after…
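On the quota problem in this entry: both the shuffled intermediate data and the job output (multiplied by the HDFS replication factor) consume space, and compression is the usual first remedy. A minimal sketch using MR1-era property names (newer releases spell them mapreduce.map.output.compress*):

    import org.apache.hadoop.conf.Configuration;

    public class CompressMapOutput {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Compress the intermediate map output that is spilled locally
            // and shuffled to the reducers.
            conf.setBoolean("mapred.compress.map.output", true);
            conf.set("mapred.map.output.compression.codec",
                     "org.apache.hadoop.io.compress.GzipCodec");
            // ... build and submit the wordcount Job with this Configuration ...
        }
    }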
0
votes
1 answer

Sending data from all mappers to all reducers

Before this question is flagged as a duplicate, please read through. This has been asked many times with no clear answer. Let's say my task is to compute the unigram probability of every word in millions of files. I can emit word counts from… (see the sketch after this entry)
abhinavkulkarni
  • 2,284
  • 4
  • 36
  • 54
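One pattern that fits this entry (a sketch, not the only answer): emit any record every reducer needs once per reducer, tagging the key with an explicit target partition, and route on that tag in a custom partitioner. The key format and names below are hypothetical, and ordinary word keys are assumed not to contain '#':

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // In the mapper, broadcast the corpus-total record to every reducer:
    //   for (int p = 0; p < context.getNumReduceTasks(); p++) {
    //       context.write(new Text(p + "#*TOTAL*"), new IntWritable(total));
    //   }

    public class TaggedPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            String s = key.toString();
            int sep = s.indexOf('#');
            if (sep > 0) {
                // Broadcast record: honor the explicit partition tag.
                return Integer.parseInt(s.substring(0, sep)) % numPartitions;
            }
            // Ordinary word record: default hash routing.
            return (s.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }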
0
votes
2 answers

How does Hadoop decide to distribute among buckets/nodes?

I am new to Map/Reduce and the Hadoop framework. I am running a Hadoop program on a single machine (to try it out). I have n input files and I want a summary of the words from those files. I know the map function returns key/value pairs, but how map is…
0
votes
0 answers

How to best decide mapper output/reducer input for a huge string

I need to improve my MR job, which uses HBase as both source and sink. Basically, I'm reading data from 3 HBase tables in the mapper, writing it out as one huge string for the reducer to do some computation on and dump into an HBase table…
Pavan
  • 658
  • 2
  • 7
  • 28
0
votes
1 answer

How is input of small size read by a mapper in map-reduce?

I have a map-reduce job whose input is a big data set (let's say of size 100GB). What this map-reduce job does is split the big data into chunks and write separate files, one per data chunk. That is, the output of the job is multiple…
HHH
  • 6,085
  • 20
  • 92
  • 164
0
votes
1 answer

How does the map-reduce framework split the input file into chunks?

I have an iterative mapreduce job in which, when a chunk, let's say Chunk i, is read by a mapper, some information regarding the records within this chunk is stored in an auxiliary file called F_i. In the next iteration (job), a different mapper… (see the sketch after this entry)
HHH
  • 6,085
  • 20
  • 92
  • 164
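For the chunking mechanics behind this entry: with FileInputFormat each mapper's chunk is an input split, sized as max(minSplitSize, min(maxSplitSize, blockSize)), and both bounds can be set per job. A minimal new-API sketch (assumes Hadoop 2.x; the 64MB cap is illustrative and the driver is truncated):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class SplitSizeDemo {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "split-size-demo");
            FileInputFormat.addInputPath(job, new Path(args[0]));
            // Cap each split at 64 MB so no mapper reads a larger chunk.
            FileInputFormat.setMaxInputSplitSize(job, 64L * 1024 * 1024);
            FileInputFormat.setMinInputSplitSize(job, 1L);
            // ... mapper/reducer setup and job.waitForCompletion(true) ...
        }
    }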
0
votes
2 answers

Can a reduce task accept compressed data in Hadoop?

We see that map can accept and output both compressed and uncompressed data. I was going through Cloudera training and the teacher mentioned that reduce task input has to be in the form of key/value pairs and thus can't work on compressed data. Is that right? If that's… (see the sketch after this entry)
bruceparker
  • 1,235
  • 1
  • 17
  • 33
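On this last entry: a reduce task always sees deserialized key/value pairs, but the bytes shuffled to it may be compressed; the framework inflates them transparently during the merge, so compression does not conflict with the key/value contract. A minimal sketch with Hadoop 2.x property names (codec choice is illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.GzipCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CompressedShuffleDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Compress the map -> reduce shuffle; reducers still receive
            // ordinary key/value pairs after decompression.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setClass("mapreduce.map.output.compress.codec",
                          GzipCodec.class, CompressionCodec.class);
            Job job = Job.getInstance(conf, "compressed-shuffle-demo");
            // Compressing the final job output is a separate, independent knob.
            FileOutputFormat.setCompressOutput(job, true);
            FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
            // ... input/output paths, mapper/reducer, submit ...
        }
    }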