Hadoop partitioning covers questions about how Hadoop decides which key/value pairs are sent to which reducer (partition).
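As a rough illustration of that decision, the sketch below reproduces the arithmetic used by Hadoop's default `HashPartitioner` in plain Java, without any Hadoop dependency. The class name `HashPartitionSketch`, the sample keys, and the reducer count are hypothetical and chosen only for the example; a real job would extend `org.apache.hadoop.mapreduce.Partitioner` instead.

```java
public class HashPartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner: clear the sign bit
    // so the result is never negative, then take the remainder by the number
    // of reduce tasks.
    public static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3;               // hypothetical reducer count
        String[] keys = {"a", "b", "c", "a"}; // hypothetical map output keys
        for (String key : keys) {
            System.out.println(key + " -> partition " + getPartition(key, numReduceTasks));
        }
        // Prints:
        // a -> partition 1
        // b -> partition 2
        // c -> partition 0
        // a -> partition 1
    }
}
```

Equal keys always map to the same partition, which is why all values for one key reach a single reducer. Under this scheme, a partitioner that returns the same number for every key would send all records to one reducer regardless of how many reducers are configured.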
Questions tagged [hadoop-partitioning]
339 questions
0
votes
1 answer
partitioning not working in hadoop
In my code I partition the data into three parts, but the output contains only what the 0th partition returns, even when I set the number of reducers to 3.
My code:
public static class customPartitioner extends Partitioner{
…
user2580745
0
votes
1 answer
custom partitioner to send single key to multiple reducers?
If I have only one key, can I avoid it being sent to only one reducer (and distribute it across multiple reducers instead)?
I understand that I might then need a second MapReduce program to combine the reducer outputs.
Is this a good approach? Or…

Gadam
0
votes
1 answer
hadoop partitioner not working
public class Partitioner_2 implements Partitioner{
@Override
public int getPartition(Text key, Text value, int numPartitions) {
int hashValue=0;
for(char c:…

Nikhil
0
votes
1 answer
Composite key getting changed, Hadoop Map-Reduce?
I have just started learning Hadoop and am running a Hadoop MapReduce program with a custom partitioner and comparator. The problem I am facing is that the primary and secondary sorts are not being applied to the composite key; moreover, the part of one…

Bruce_Wayne
0
votes
1 answer
Difference between launched reduce tasks and the number of times the reduce function is called?
I have just started learning Hadoop and am running a Hadoop MapReduce program with a custom partitioner and comparator (trying it in a single-node environment first, and deploying on a cluster later). The strange behavior (as I don't know what actually is going…

Bruce_Wayne
0
votes
1 answer
Issue while running Hadoop Pipes in hadoop-1.2.1
Hello everybody,
Earlier I was getting an issue while running C++ binaries in Hadoop:
syscon@syscon-OptiPlex-3020:~/uday/hadoop-1.2.1$ bin/hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true…

user3532122
0
votes
1 answer
How will data be stored to the primary after a crash in HBase?
I'm new to HBase. Assume that we have master and secondary regions.
Suppose our primary region goes down for a few hours due to some external factor, and the primary server then returns to normal status.
It might have missed some amount…

Rocky
0
votes
1 answer
Is having a customPartitioner helpful if I already implement hashCode for keys in MapReduce jobs?
I am writing a custom key class, without hashCode implementation.
I run a MapReduce job, and during the job configuration I set the partitioner class, such as:
Job job = Job.getInstance(config);
…

brain storm
0
votes
1 answer
Control intermediate results in Hadoop
I want to take control of the intermediate results between Map and Reduce in Hadoop.
I want to specify where these results are copied after the Map phase.
I would also like to choose which data will be reduced.
In summary, I want the map's results before process…

user3783064
0
votes
0 answers
Question regarding hadoop-env.sh
I am facing "Error: Java heap space" and "Error: GC overhead limit exceeded",
so I started looking into hadoop-env.sh.
This is what I understand so far; please correct me if I am wrong.
If HADOOP_HEAPSIZE=7168 in hadoop-env.sh,
this will invoke…

user2950086
0
votes
0 answers
Hadoop partitioning Map Tasks
I have a Hadoop Map Reduce job where I have split the input using a line reader.
Map input records=10.
Is it possible to partition the map tasks' output based on the LongWritable key that identifies the line reader split?
If not, is there another…

Chris MacKenzie
0
votes
1 answer
Hadoop's Distributed Cache File program generates no output
We are trying to design a simple program whose goal is to read patent data from a file and check whether other countries have cited that patent. This is from the textbook 'Hadoop in Action' by Chuck Lam, where we are trying to learn…

vamosrafa
0
votes
2 answers
Update MySQL record from Hadoop
I completed a process that reads an iTunes EPF file and inserts those records into a MySQL database table.
Before inserting a record, I need to check whether the given record exists in the database. If the record does not exist, then I shall insert…

gangatharan
0
votes
3 answers
Hadoop command to find the NameNode on a node
I tried the following.
Steps
- Log in to the particular node
- Execute the command jps
Result
5144 JobTracker
4953 NameNode
5079 SecondaryNameNode
5216 Jps
This works fine, but I want to know whether there is any other command to find the NameNode on a node.

AlexAnand
0
votes
1 answer
Can I get the partition number in Hadoop?
I am a Hadoop newbie.
I want to get the partition number on the output file.
First, I made a custom partitioner:
public static class MyPartitioner extends Partitioner {
public int getPartition(Text key, LongWritable…

user3527158