Highest Voted 'hadoop-partitioning' Questions

1

vote

1 answer

hive explain plan not showing partition

I have a table that contains 251M records and is 2.5 GB in size. I partitioned it on the two columns I filter on in the predicate, but the explain plan does not show that it is reading a partition even though the table is partitioned. With…

hive hadoop-partitioning

asked Jun 23 '17 at 03:01

D Mishra

35
4

1

vote

0 answers

How to arrange multi partitions in hive?

Say I have an order table that contains multiple time columns (spend_time, expire_time, withdraw_time). Usually I query the table on each of these columns independently, so how do I create the partitions? order_no | spend_time | expire_time |…

hadoop hive partitioning hadoop-partitioning

asked Jun 02 '17 at 06:13

lei yu

58
6

1

vote

1 answer

_temporary directory is not getting deleted from output location when mapreduce job is completed

I am parsing data through a MapReduce job in order to make sense of it. The parsed data arrives in batches and is then loaded into a Hive external table through a Spark Streaming job. This is a real-time process. Now an unusual event was…

hive mapreduce hadoop2 hadoop-partitioning bigdata

asked May 05 '17 at 08:57

Mohit Sudhera

341
1
4
16

1

vote

1 answer

MAX(Count) function apache pig latin

I am trying to do the following in Apache Pig on unstructured data: i) I have a dataset that contains street name, city and state; ii) group by state; iii) take COUNT(*) of states in the dataset. Now my o/p will be like…

hadoop apache-pig hadoop-streaming hadoop-partitioning

asked Mar 01 '17 at 01:12

sivaraj

49
1
5

1

vote

0 answers

Why and What changes should be done to Driver class in mapreduce program when using stringtokenizer instead of split()

I am new to Java and Hadoop. I was practicing the MapReduce word count example, where I came across two ways of splitting the line in the mapper class. 1st one public class WordCountMapper extends Mapper…

java hadoop mapreduce hadoop2 hadoop-partitioning

asked Feb 23 '17 at 06:41

Vidya

154
1
17

1

vote

1 answer

How to rename all partition columns in hive

When I try to rename all partition columns in an existing table, which is partitioned over a date range of one year, this is what I get. hive> ALTER TABLE test.usage PARTITION ('date') RENAME TO PARTITION (partition_date); FAILED:…

hadoop hiveql hadoop-partitioning apache-hive

asked Feb 14 '17 at 18:11

hadoop

45
2
5

1

vote

1 answer

How to overwrite columns value by selecting another columns in partition table in hive

How can I overwrite a column's values by selecting from the same partitioned table in Hive? I created the table by executing the query below: CREATE TABLE user (fname string,lname string) partitioned By (day int); Then I inserted the data. After inserting data into…

sql hadoop hive hiveql hadoop-partitioning

asked Feb 09 '17 at 09:42

Sai

1,075
5
31
58

1

vote

1 answer

How to merge small files from existing partitions in hive?

How can I merge the small files of an existing partition into one large file? For example, I have a table user1 with columns fname and lname, and the partition column is day. I created the table using the script below: CREATE TABLE…

sql hadoop hive hiveql hadoop-partitioning

asked Feb 07 '17 at 13:27

Sai

1,075
5
31
58

1

vote

1 answer

who will create the block ids for blocks in hadoop?

I want to know who creates the block IDs for blocks in Hadoop: the HDFS client or the NameNode. Please let me know.

hadoop hadoop2 hadoop-streaming hadoop-partitioning

asked Jan 26 '17 at 05:59

sidhartha pani

623
2
12
23

1

vote

1 answer

Who will update metadata in Name node in Hadoop?

In the case of HDFS writes, how is the metadata updated in the NameNode? Once the client has written the data to the DataNodes, do the DataNodes or the HDFS client update the metadata in the NameNode?

hadoop hadoop2 hadoop-streaming hadoop-partitioning

asked Jan 25 '17 at 12:07

sidhartha pani

623
2
12
23

1

vote

1 answer

Hadoop INFO ipc.Client: Retrying connect to server localhost/127.0.0.1:9000

I read other posts about HDFS configuration problems with Hadoop, but none of them was helpful, so I am posting my question. I followed this tutorial for Hadoop v1.2.1. When I run the hadoop fs -ls command, I get this error: 16/08/29…

linux hadoop hdfs hadoop-partitioning

asked Aug 29 '16 at 19:33

Hamid_UMB

317
4
16

1

vote

3 answers

How to reduce number of mappers, when I am running hive query?

I am using Hive. I have 24 JSON files with a total size of 300 MB (in one folder), so I created one external table (i.e. table1) and loaded the data (i.e. the 24 files) into it. When I run a select query on top of that external…

hadoop mapreduce hive cloudera hadoop-partitioning

asked Jul 19 '16 at 12:53

Sai

1,075
5
31
58

1

vote

1 answer

how to properly import csv data set using kite-dataset partitioned schema?

I'm working with the publicly available CSV dataset from MovieLens. I have created a partitioned dataset for ratings.csv: kite-dataset create ratings --schema rating.avsc --partition-by year-month.json --format parquet Here is my…

hadoop hdfs cloudera-cdh hadoop-partitioning kite-dataset

asked Jun 12 '16 at 19:18

Eugene Goldberg

14,286
20
94
167

1

vote

0 answers

how to distribute java pair RDD data based on key to different partitions of RDD

JavaRDD input = xyz.sc.textFile("/home/spark/Documents/XYZ"); JavaRDD infoRDD = input.mapToPair(new PairFunction(){ public Tuple2 call(String x) { return new…

java apache-spark rdd hadoop-partitioning

asked May 08 '16 at 17:17

gaurav

46
6

1

vote

2 answers

DELETE FROM table_name Cloudera Impala

I'm new to Impala, and I'm trying to understand how to delete records from a table. I've tried looking for delete commands, but didn't find understandable instructions. This is my table structure: create table Installs (BrandID INT,…

hadoop impala hadoop-partitioning

asked Apr 12 '16 at 09:45

Bramat

979
4
24
40
