Questions tagged [hadoop-partitioning]

The hadoop-partitioning tag covers questions about how Hadoop decides which key/value pairs are sent to which reducer (partition).
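
As a rough illustration of that decision, the sketch below reproduces the arithmetic commonly attributed to Hadoop's default hash partitioning: the key's hash code, with the sign bit masked off, taken modulo the number of reduce tasks. It is plain Java with no Hadoop dependency. The class name `HashPartitionSketch` and the method `partitionFor` are hypothetical names chosen for this example, not part of any Hadoop API.

```java
// Illustrative sketch only: mimics hash-based partitioning arithmetic
// without depending on the Hadoop libraries. Names here are hypothetical.
final class HashPartitionSketch {

    // Maps a key to a partition index in [0, numPartitions).
    // Masking with Integer.MAX_VALUE clears the sign bit, so a negative
    // hashCode() cannot yield a negative partition number.
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numReducers = 3; // assumed number of reduce tasks
        for (String key : new String[] {"a", "b", "c"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numReducers));
        }
    }
}
```

Every occurrence of the same key maps to the same index, so a job with a single distinct key sends all of its records to one reducer no matter how many reducers are configured.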

339 questions
0 votes, 1 answer

Partitioning not working in Hadoop

In my code I have partitioned the data into three parts, but the output contains only what the 0th partition returns, even though I set the number of reducers to 3. My code: public static class customPartitioner extends Partitioner{ …
user2580745
0 votes, 1 answer

Custom partitioner to send single key to multiple reducers?

If I have only one key, can I avoid it being sent to only one reducer (and distribute it across multiple reducers)? I understand that I might then need a second MapReduce program to combine the reducer outputs. Is this a good approach? Or…
Gadam
0 votes, 1 answer

Hadoop partitioner not working

public class Partitioner_2 implements Partitioner{ @Override public int getPartition(Text key, Text value, int numPartitions) { int hashValue=0; for(char c:…
Nikhil
0 votes, 1 answer

Composite key getting changed, Hadoop Map-Reduce?

I have just started learning Hadoop and am running a MapReduce program with a custom partitioner and comparator. The problem I am facing is that the primary and secondary sorts are not being applied to the composite key; moreover, the part of one…
Bruce_Wayne
0 votes, 1 answer

Difference between Launched reduce tasks and the number of times the reduce function is called?

I have just started learning Hadoop and am running a MapReduce program with a custom partitioner and comparator (trying it on a single-node environment first; I will deploy it on a cluster later). The strange behavior (as I don't know what is actually going…
Bruce_Wayne
0 votes, 1 answer

Issue while running Hadoop Pipes in hadoop-1.2.1

Hello everybody, earlier I was getting an issue while running the C++ binaries in Hadoop: syscon@syscon-OptiPlex-3020:~/uday/hadoop-1.2.1$ bin/hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true…
0 votes, 1 answer

How will data be stored to the primary after a crash in HBase?

I'm a newbie to HBase. Assume that we have master and secondary regions, and that our primary region goes down for a few hours due to some external factors. If the primary server then returns to normal status, it might have missed some amount…
Rocky
0 votes, 1 answer

Is having a custom Partitioner helpful if I already implement hashCode for keys in MapReduce jobs?

I am writing a custom key class without a hashCode implementation. I run a MapReduce job, but during job configuration I set the partitioner class, such as: Job job = Job.getInstance(config); …
brain storm
0 votes, 1 answer

Control intermediate results in Hadoop

I want to take control of the intermediate results between Map and Reduce in Hadoop. I would like to specify where these results are copied after the Map, and to choose which data will be reduced. In summary, I want the map's results before process…
0 votes, 0 answers

Question regarding hadoop-env.sh

I am facing Error: Java heap space and Error: GC overhead limit exceeded, so I started looking into hadoop-env.sh. This is what I understand so far; please correct me if I am wrong. If HADOOP_HEAPSIZE=7168 in hadoop-env.sh, this will invoke…
user2950086
0 votes, 0 answers

Hadoop partitioning Map Tasks

I have a Hadoop MapReduce job where I have split the input using a line reader. Map input records=10. Is it possible to partition the map tasks' output based on the LongWritable key that identifies the line reader split? If not, is there another…
0 votes, 1 answer

Hadoop's Distributed Cache File program generates no output

We are trying to design a simple program whose goal is to read patent data from a file and check whether other countries have cited that patent. This is from the textbook 'Hadoop in Action' by Chuck Lam, where we are trying to learn…
vamosrafa
0 votes, 2 answers

Update MySQL record from Hadoop

I completed a process that reads an iTunes EPF file and inserts those records into a MySQL database table. Before inserting a record, I need to check whether it already exists in the database. If the record does not exist, then I shall insert…
0 votes, 3 answers

Hadoop command to find NameNode in a node

I tried these steps: log in to the particular node and execute the command jps. Result: 5144 JobTracker 4953 NameNode 5079 SecondaryNameNode 5216 Jps. This works fine, but I want to know whether there is any other command to find the NameNode in a node.
AlexAnand
0 votes, 1 answer

Can I get a Partition number of Hadoop?

I am a Hadoop newbie. I want to get the partition number in the output file. At first, I made a customized partitioner. public static class MyPartitioner extends Partitioner { public int getPartition(Text key, LongWritable…