Highest Voted 'hadoop-partitioning' Questions

0

votes

1 answer

Hadoop partitioner getting incorrect reduce count

I'm working on a partitioner today. It's the basic program for Hadoop custom partitioners. Below is my partitioner code snippet: public class VowelConsPartitioner extends Partitioner { @Override public int getPartition(Text letterType, IntWritable…

hadoop-partitioning

asked Aug 25 '13 at 05:30

Santosh Batta

1
1

0

votes

2 answers

Get the name of the employee with the max salary using Hadoop MapReduce

I am very new to M/R programs. I have a file in HDFS with data in this structure: EmpId,EmpName,Dept,Salary, 1231,userName1,Dept1,5000 1232,userName2,Dept2,6000 1233,userName3,Dept3,7000 . . …

hadoop mapreduce hadoop-partitioning

asked Aug 16 '13 at 07:45

user1585111

1,019
6
19
35
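
A minimal sketch for the max-salary question above, in plain Java rather than as a Hadoop job. It only illustrates the comparison a single reducer would have to make over the sample records quoted in the excerpt. The class name `MaxSalarySketch` and the method `maxSalaryEmployee` are made up for illustration, and the handling of the header row and of ties (first record wins) is an assumption, not something the question specifies.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; no Hadoop classes are involved.
class MaxSalarySketch {

    // Returns the EmpName of the record with the highest Salary,
    // or null when no line contains a parseable salary.
    static String maxSalaryEmployee(List<String> lines) {
        String bestName = null;
        long bestSalary = Long.MIN_VALUE;
        for (String line : lines) {
            String[] fields = line.split(",");
            if (fields.length < 4) {
                continue; // not an EmpId,EmpName,Dept,Salary record
            }
            long salary;
            try {
                salary = Long.parseLong(fields[3].trim());
            } catch (NumberFormatException e) {
                continue; // skips the header row (assumed to be present)
            }
            // Strict comparison: on a tie the earlier record is kept.
            if (salary > bestSalary) {
                bestSalary = salary;
                bestName = fields[1].trim();
            }
        }
        return bestName;
    }

    public static void main(String[] args) {
        // The sample rows quoted in the question excerpt.
        List<String> lines = Arrays.asList(
                "EmpId,EmpName,Dept,Salary,",
                "1231,userName1,Dept1,5000",
                "1232,userName2,Dept2,6000",
                "1233,userName3,Dept3,7000");
        System.out.println(maxSalaryEmployee(lines));
    }
}
```

In an actual job the same comparison would only see every record if all map output reaches one reducer, for example by emitting a single constant key; that design choice is a common approach, not something taken from the question.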

0

votes

1 answer

DiskErrorException on slave machine - Hadoop multinode

I am trying to process XML files with Hadoop and got the following error when invoking a word-count job on XML files: 13/07/25 12:39:57 INFO mapred.JobClient: Task Id : attempt_201307251234_0001_m_000008_0, Status : FAILED Too many fetch-failures 13/07/25…

hadoop mapreduce hadoop-streaming hadoop-plugins hadoop-partitioning

asked Jul 25 '13 at 07:19

Surya

3,408
5
27
35

0

votes

1 answer

Error on starting HDFS daemons on Hadoop multi-node cluster

Issue during Hadoop multi-node setup. As soon as I start my HDFS daemon on the master (bin/start-dfs.sh), I get the logs below on the master: starting namenode, logging to…

hadoop hadoop-streaming hadoop-plugins hadoop-partitioning

asked Jul 24 '13 at 07:03

Surya

3,408
5
27
35

0

votes

1 answer

Hadoop command line explanation

Can someone explain this syntax to me: bin/hadoop jar hadoop*examples*.jar wordcount /user/hpuser/testHadoop /user/hpuser/testHadoop-output Why are we using jar right after bin/hadoop? What does hadoop*examples*.jar mean? Is wordcount the name of…

hadoop hadoop-partitioning

asked Jul 23 '13 at 06:07

Surya

3,408
5
27
35

0

votes

2 answers

Creating more partitions than reducers

When developing locally on my single machine, I believe the default number of reducers is 6. In a particular MR step, I actually divide up the data into n partitions where n can be greater than 6. From what I have observed, it looks like only 6 of…

hadoop hadoop-streaming hadoop-partitioning

asked Jun 27 '13 at 01:38

syker

10,912
16
56
68
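
A minimal sketch for the question above about creating more partitions than reducers. It assumes the job relies on hash-style partitioning and reuses the arithmetic of Hadoop's default `HashPartitioner`, `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`, in plain Java. The class name `HashPartitionSketch`, the `group-N` keys, and the use of `String` in place of Hadoop's `Text` (whose `hashCode` differs) are assumptions made for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Plain-Java illustration only; no Hadoop classes are involved.
class HashPartitionSketch {

    // Clear the sign bit of the hash, then take the remainder by the
    // reducer count, so the result always lies in [0, numReduceTasks).
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Collects the distinct partition indices produced for a set of keys.
    static Set<Integer> partitionsUsed(String[] keys, int numReduceTasks) {
        Set<Integer> used = new TreeSet<>();
        for (String key : keys) {
            used.add(partitionFor(key, numReduceTasks));
        }
        return used;
    }

    public static void main(String[] args) {
        // Hypothetical keys: ten logical groups, more than the six reducers.
        String[] keys = new String[10];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = "group-" + i;
        }
        int numReduceTasks = 6; // the reducer count mentioned in the question
        for (String key : keys) {
            System.out.println(key + " -> partition " + partitionFor(key, numReduceTasks));
        }
        System.out.println("distinct partitions: "
                + partitionsUsed(keys, numReduceTasks).size());
    }
}
```

However many logical groups the keys form, the number of distinct partition indices can never exceed `numReduceTasks`, which may be why only as many output partitions as reducers are observed.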

0

votes

1 answer

Generating multiple equally sized output files in Hadoop

What are some methods for finding X data ranges in Hadoop so that one can use these ranges as partitions in the reducer step?

hadoop data-partitioning hadoop-partitioning

asked Jun 19 '13 at 19:27

syker

10,912
16
56
68

0

votes

1 answer

Is the Hadoop file system a physical file system or a virtual file system?

hadoop hdfs hadoop-streaming hadoop-partitioning hdfstore

asked May 04 '13 at 07:03

user2183044

33
2
6

0

votes

2 answers

Hadoop: distribute partitions to reducers

For load balancing reasons, I want to create more partitions than reducers in a Hadoop environment. Is there a way to assign partitions to specific reducers and, if so, where can I define them? I wrote an individual Partitioner and now want to…

hadoop hadoop-partitioning

asked Apr 26 '13 at 09:06

beto8888

45
1
4

0

votes

3 answers

How to work on a specific part of a CSV file uploaded into HDFS?

How to work on a specific part of a CSV file uploaded into HDFS? I'm new to Hadoop and I have a question: if I export a relational database into a CSV file and then upload it into HDFS, how do I work on a specific part (table) of the file using…

hadoop hadoop-streaming hadoop-partitioning

asked Apr 17 '13 at 15:27

Samy Louize Hanna

821
2
8
15

0

votes

2 answers

How to use the Hadoop MapReduce framework for an OpenCL application?

I am developing an application in OpenCL whose basic objective is to implement a data mining algorithm on a GPU platform. I want to use the Hadoop Distributed File System and execute the application on multiple nodes. I am using MapReduce…

hadoop mapreduce opencl gpu hadoop-partitioning

asked Mar 19 '13 at 09:30

sandeep.ganage

1,409
2
21
47

0

votes

1 answer

How to increase Hadoop map tasks by implementing getSplits

I want to process multiline CSV files, and for that I wrote a custom CSVInputFormat. I would like to have about 40 threads processing CSV lines on each Hadoop node. However, when I create a cluster on Amazon EMR with 5 machines (1 master and 4…

csv hadoop amazon-emr hadoop-partitioning

asked Jan 28 '13 at 23:17

mvallebr

2,388
21
36

0

votes

1 answer

How to load key-value data into HBase tables?

Thanks for taking interest in my question. Before I begin, I'd like to let you know that I'm very new to Hadoop & HBase. So far, I find Hadoop very interesting and would like to contribute more in the future. I'm primarily interested in improving…

hadoop hbase apache-pig key-value-store hadoop-partitioning

asked Dec 21 '12 at 09:35

MapReddy Usthili

288
1
7
23

0

votes

2 answers

Apache Hive: how to identify which column is the partition

I have a set of log files and have created a Hive table. Now I want to partition the table based on a column. What I don't understand, and have not seen examples of, is how to specify the column (col/field) for the partition. Ex. here is a line from the log…

hadoop hive hadoop-partitioning

asked Apr 20 '12 at 17:33

Integration

337
1
4
15

-1

votes

1 answer

Hive Managed vs External tables maintainability

Which one is better (performance-wise and operationally in the long run) for maintaining loaded data: managed or external? By maintaining, I mean that these tables will frequently have the following operations on a daily basis: Select using partitions…

hadoop hive hiveql hadoop2 hadoop-partitioning

asked Nov 03 '19 at 18:25

amr007

29
1
8
