Highest Voted 'partitioner' Questions

2

votes

1 answer

Springbatch dynamic multiple xml File writer

I have to write a batch job that reads some data from the DB (each row is an item, which is fine), then does some processing to add more data (more data is always better ;) ). Here is my problem: I have to write each item to an XML file whose name depends…

asked Jan 19 '15 at 10:45

bodtx

590
9
29

2

votes

2 answers

Why is the Partitioner invoked even with a single reducer

If we have an MR job configured to run with only a single reducer, it seems logical that a Partitioner need not be invoked. However, I just gave this a shot and it looks like the Partitioner is invoked even if the job is configured with a single…

hadoop mapreduce partitioner

asked Apr 15 '14 at 11:56

Sudarshan

8,574
11
52
74

2

votes

2 answers

Custom Counter inside the Hadoop Partitioner

I would like to capture some information on keys and their values inside a custom Partitioner (or even the default HashPartitioner). I can use custom counters inside both mappers and reducers by accessing the "context" variable. However, inside the…

hadoop mapreduce partitioner

asked Apr 16 '13 at 18:17

user2262938

81
4

1

vote

2 answers

Hadoop order of operations

According to the attached image, found in Yahoo's Hadoop tutorial, the order of operations is map > combine > partition, which should be followed by reduce. Here is an example key emitted by the map operation: LongValueSum:geo_US|1311722400|E …

hadoop partitioner combiners

asked Aug 05 '11 at 20:58

Premal Shah

181
4
13

1

vote

1 answer

Kafka RoundRobin partitioner not distributing messages to all the partitions

I am trying to use Kafka's RoundRobinPartitioner class for distributing messages evenly across all the partitions. My Kafka topic configuration is as follows: name: multischemakafkatopicodd number of partitions: 16 replication factor: 2 Say, if I am…

apache-kafka kafka-producer-api round-robin partitioner

asked Dec 18 '20 at 20:25

Swapnil Gupta

25
1
6

1

vote

0 answers

Is Optaplanner Strength Comparator compatible with Partitioning?

Has anyone tried Optaplanner's Partitioned Search feature at the same time as the strength comparator class? Firstly, I created a custom partitioner that splits the planning entities and assigns the planning values (it does not split the planning…

sorting optaplanner partitioner

asked Apr 28 '20 at 12:07

pineapplw

71
3

1

vote

2 answers

Technique for joining with spark dataframe w/ custom partitioner works w/ python, but not scala?

I recently read an article that described how to custom partition a dataframe [ https://dataninjago.com/2019/06/01/create-custom-partitioner-for-spark-dataframe/ ] in which the author illustrated the technique in Python. I use Scala, and the…

apache-spark join apache-spark-sql rdd partitioner

asked Aug 10 '19 at 07:37

Chris Bedford

2,560
3
28
60

1

vote

2 answers

How to properly apply HashPartitioner before a join in Spark?

To reduce shuffling during the joining of two RDDs, I decided to partition them using HashPartitioner first. Here is how I do it. Am I doing it correctly, or is there a better way to do this? val rddA = ... val rddB = ... val numOfPartitions =…

scala apache-spark rdd partitioner

asked Mar 21 '19 at 11:54

MetallicPriest

29,191
52
200
356

1

vote

0 answers

Spark even data distribution

I am trying to solve a skewed-data problem in a dataframe. I have introduced a new column based on a bin-packing algorithm, which should evenly distribute data among the bins (partitions in my case). My count for the bin is 500,000 rows. I have…

scala apache-spark apache-spark-sql partitioner

asked Nov 30 '18 at 13:41

Waqar Ahmed

5,005
2
23
45

1

vote

1 answer

Customize Partitioner to balance inputs to reducers

Suppose my mappers output N keys (all distinct), and I have K reducers. How do I write a custom Partitioner so that each reducer receives approximately N/K keys? Which keys go to which reducer is not important. Example: Suppose my mappers…

hadoop mapreduce reducers partitioner

asked Jun 22 '18 at 03:07

cdt

85
10
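For the question above about giving each of K reducers roughly N/K keys, one possible approach is sketched below in plain Java, with no Hadoop dependency. It assumes the set of distinct keys is known before the job runs, which the excerpt does not confirm. The class and method names (`BalancedAssignmentDemo`, `assign`, `load`) are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BalancedAssignmentDemo {

    // Assumption: the caller passes a list of distinct keys known up front.
    // Keys are handed to reducers in round-robin order, so the per-reducer
    // key counts differ by at most one.
    static Map<String, Integer> assign(List<String> distinctKeys, int numReducers) {
        Map<String, Integer> assignment = new HashMap<>();
        int next = 0;
        for (String key : distinctKeys) {
            assignment.put(key, next);
            next = (next + 1) % numReducers;
        }
        return assignment;
    }

    // Count how many keys each reducer index received.
    static int[] load(Map<String, Integer> assignment, int numReducers) {
        int[] counts = new int[numReducers];
        for (int reducer : assignment.values()) {
            counts[reducer]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "b", "c", "d", "e", "f", "g");
        int[] counts = load(assign(keys, 3), 3);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("reducer " + i + " -> " + counts[i] + " keys");
        }
    }
}
```

In an actual job, a custom partitioner would presumably look a key up in a table like this one and return the stored index. How that table reaches each task is outside the scope of this sketch.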

1

vote

1 answer

type HashPartitioner is not a member of org.apache.spark.sql.SparkSession

I was using spark-shell to experiment with Spark's HashPartitioner. The error is shown as follows: scala> val data = sc.parallelize(List((1, 3), (2, 4), (3, 6), (3, 7))) data: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[0] at…

apache-spark partitioner

asked May 24 '17 at 19:16

Xiangyu

824
9
34

1

vote

0 answers

Why does hadoop partitioner do a binary AND?

I'm completely new to Hadoop and fairly new to Map/Reduce, so bear with me if this is a very simple question. In Hadoop's hash partitioner, why does it do a hash(key) & Integer.MAX_VALUE before doing a modulo with the number of reducers? What is the…

hadoop mapreduce partitioner

asked Oct 12 '16 at 07:14

Kevin

3,209
9
39
53
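For the question above about `hash(key) & Integer.MAX_VALUE`, a small plain-Java sketch can show what the mask changes. It does not use Hadoop classes. The names `MaskedHashDemo`, `maskedPartition`, and `naivePartition` are hypothetical. The point it illustrates is that Java's `%` keeps the sign of the dividend, so a negative hash code would otherwise yield a negative partition index.

```java
public class MaskedHashDemo {

    // The arithmetic the question describes: clear the sign bit first,
    // then take the remainder by the number of reducers.
    static int maskedPartition(int hash, int numReducers) {
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    // Same computation without the mask, shown only for comparison.
    static int naivePartition(int hash, int numReducers) {
        return hash % numReducers;
    }

    public static void main(String[] args) {
        int hash = -7;      // hash codes may be negative
        int reducers = 4;

        // Without the mask the remainder is negative, which is not a valid index.
        System.out.println("naive:  " + naivePartition(hash, reducers));   // -3

        // With the mask the result always lies in [0, reducers).
        System.out.println("masked: " + maskedPartition(hash, reducers));  // 1

        // The mask also handles Integer.MIN_VALUE, for which Math.abs
        // would still return a negative number.
        System.out.println("min:    " + maskedPartition(Integer.MIN_VALUE, reducers)); // 0
    }
}
```

Masking is one way to force a non-negative value. It is preferred over `Math.abs` in this sketch because `Math.abs(Integer.MIN_VALUE)` overflows and stays negative.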

1

vote

1 answer

How to use Distributed cache in partitioner hadoop?

I am new to Hadoop and the MapReduce partitioner. I want to write my own partitioner, and I need to read a file in the partitioner. I have searched many times and found that I should use the distributed cache. My question is: how can I use the distributed…

hadoop mapreduce partitioner

asked Sep 20 '16 at 07:00

Saeed Nasehi

940
1
11
27

1

vote

2 answers

Does the default hash partitioner still work if a custom partitioner is defined in Hadoop Map Reduce?

As I am new to Hadoop, I tried out the sample code from http://www.tutorialspoint.com/map_reduce/map_reduce_partitioner.htm and found that the program uses 3 different partitions based on age group, and 3 reducers are also used, which is expected. But…

hadoop mapreduce partitioner

asked Nov 24 '15 at 20:09

Chandan Kumar Bala

27
3

1

vote

2 answers

What if a custom partitioner is made to select different partitions for records having the same key?

While learning Hadoop MapReduce, I came across how to create a custom Partitioner class. I understand that we need to define the abstract getPartition method in our class. This method is supposed to return the Partition number (an integer) for the…

java hadoop mapreduce partitioner

asked Sep 02 '15 at 10:14

Ankit Khettry

997
1
13
33
