Questions tagged [hadoop-partitioning]

The hadoop-partitioning tag covers questions about how Hadoop decides which key/value pairs are sent to which reducer (partition).
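As a rough illustration of the tag description above: Hadoop's default HashPartitioner assigns a pair to a reducer by masking off the sign bit of the key's hash code and taking it modulo the number of reduce tasks. The sketch below mimics that rule in Python. The helper names are hypothetical, and `java_string_hash` is only a stand-in for whatever `hashCode()` the real key type defines (Hadoop's `Text`, for example, hashes its bytes differently).

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode() as a signed 32-bit int (BMP characters only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap around like 32-bit int arithmetic
    return h - 0x100000000 if h >= 0x80000000 else h


def hash_partition(key_hash: int, num_reducers: int) -> int:
    """Mirror (hashCode & Integer.MAX_VALUE) % numReduceTasks."""
    return (key_hash & 0x7FFFFFFF) % num_reducers


if __name__ == "__main__":
    for key in ["hello", "world", "hello"]:
        # Equal keys always map to the same partition number.
        print(key, hash_partition(java_string_hash(key), 4))
```

A custom partitioner replaces only the second function: it may look at part of a composite key (as in secondary sort) instead of the whole key's hash.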

339 questions
0
votes
1 answer

Custom Partitioner not working in Oozie MapReduce action

I have implemented a secondary sort in MapReduce and am trying to execute it using Oozie (from Hue). Though I have set the partitioner class in the properties, the partitioner is not being executed, so I'm not getting the expected output. The same…
0
votes
0 answers

Issue inserting data into a Hive partitioned table with over 100k partitions

I created a staging table with 20 million records and only two fields, viewerid and viewedid. From that I am trying to create a dynamically partitioned ORC table on the "viewerid" column, but the map job is not completing, as shown in the attached…
0
votes
1 answer

Convert string into date format

I have a column with the string partition=201707070800. I need to convert this to 2017-07-08. How can we achieve this in Hive? Thanks
Ganesh
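One way to reason about the conversion asked for here is as two steps: parse the compact `yyyyMMddHHmm` string into a timestamp, then format only the date part. The Python sketch below shows that logic outside Hive; the function name is hypothetical. In Hive itself the same idea is commonly expressed with `unix_timestamp(col, 'yyyyMMddHHmm')` followed by `from_unixtime(..., 'yyyy-MM-dd')`, though whether that fits the asker's table is an assumption.

```python
from datetime import datetime


def partition_to_date(value: str) -> str:
    """Turn '201707070800' (or 'partition=201707070800') into an ISO date string."""
    raw = value.split("=", 1)[-1]  # drop an optional 'partition=' prefix
    parsed = datetime.strptime(raw, "%Y%m%d%H%M")  # year, month, day, hour, minute
    return parsed.date().isoformat()


if __name__ == "__main__":
    print(partition_to_date("partition=201707070800"))
```

Note that the sample string as written parses to 2017-07-07, not 2017-07-08.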
0
votes
1 answer

How to partition unequally distributed events on a timeline?

I'm working on an event processing system where I have to read my event data from an HBase table. The events I read are stored based on their timestamp. When I read in a whole day (24 hours), I find periods of the day where I have 1 million events per…
Matthias Mueller
0
votes
1 answer

How to drop Hive partitions with dynamic values

I'm looking for a way to drop partitions relative to the current day: alter table table_name drop partition(rep_date < from_unixtime(unix_timestamp(),'yyyy-MM-dd')); This returns an error: cannot recognise input near from(unix... I can do this…
user2331566
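The error in this question suggests the partition spec is parsed as a literal rather than an evaluated expression. A common workaround, sketched here under that assumption, is to compute the cutoff date outside Hive and substitute it into the statement as a constant. The function name is hypothetical, and the sketch does no escaping, so it is only suitable for trusted table and column names.

```python
from datetime import date


def drop_partitions_before(table: str, column: str, cutoff: date) -> str:
    """Build a DROP PARTITION statement with the cutoff date inlined as a literal."""
    return f"ALTER TABLE {table} DROP PARTITION ({column} < '{cutoff.isoformat()}');"


if __name__ == "__main__":
    # date.today() plays the role of the "current day" from the question.
    print(drop_partitions_before("table_name", "rep_date", date.today()))
```

The generated string could then be passed to the Hive CLI or Beeline; how it is submitted is left open here.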
0
votes
1 answer

Hive: map joins in partitioned tables

Considering a typical data warehouse scenario in Hive with fact and dimension tables, say the fact table is split across multiple data nodes with partitions. While joining fact tables (which are partitioned) with dimensions (which are not…
0
votes
1 answer

Impala - Handle special characters on partition column

I am currently working on a job which copies data from a staging table to the final table. The staging-table column used as the partition column of the final table has multiple records containing single quotes (e.g. supplies'A, demand'A etc). Due to…
POJO
0
votes
0 answers

How to restrict part of a partitioned data set so that users can't query it in Hive?

Restrict some part of the partitioned data so that users can't run queries on that particular partition in Hive.
0
votes
1 answer

Hadoop MapReduce distinct pattern with custom Writable produces duplicate keys

I'm trying to implement the distinct pattern: map(key, record): emit record,null; reduce(key, records): emit key. My key is a complex, custom Writable. If I emit in the reduce the key and its hashcode: context.write(key, new…
mikmak
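The distinct pattern in this question can be sketched outside Hadoop to show what it relies on: equal keys must hash, compare, and group identically, otherwise the same record can surface more than once. The Python sketch below is a simulation, not Hadoop code. `RecordKey` and the phase functions are hypothetical, and the frozen, ordered dataclass stands in for a custom Writable whose `hashCode`, `compareTo`, and serialization agree.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True, order=True)
class RecordKey:
    """Composite key; equality, hashing, and ordering all derive from the same fields."""
    user_id: int
    label: str


def map_phase(records):
    # map(key, record): emit record, null
    for record in records:
        yield record, None


def shuffle(pairs):
    # Sort by key, then group equal keys together, as the framework does before reduce.
    ordered = sorted(pairs, key=lambda kv: kv[0])
    for key, group in groupby(ordered, key=lambda kv: kv[0]):
        yield key, [value for _, value in group]


def reduce_phase(grouped):
    # reduce(key, records): emit key
    for key, _values in grouped:
        yield key


def distinct(records):
    return list(reduce_phase(shuffle(map_phase(records))))


if __name__ == "__main__":
    print(distinct([RecordKey(2, "b"), RecordKey(1, "a"), RecordKey(2, "b")]))
```

If the key's comparison or hash depended on a field that is not serialized, the grouping step here would be the place where duplicates slip through.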
0
votes
1 answer

Create zipped tables in HDFS

I have tried to create a table which is not zipped, like this: CREATE TABLE example_table ( a BIGINT, b BIGINT, v STRING, d TINYINT ) STORED AS TEXTFILE LOCATION /path/to/directory/ It's not a zipped table. I also want to create a new table with…
Beyhan Gul
0
votes
1 answer

Partitioning and Bucketing in Hive

My Hive table will hold call record data. Three columns of the table are field1: CALL_DATE, field2: FROM_PHONE_NUM, field3: TO_PHONE. I would query something like 1) I want to get all call records between particular dates. 2) I want to get all call…
AKC
0
votes
0 answers

Passing all file contents to the map function in MapReduce and appending them to a sequence file

I have to read all the contents of fileA and pass them to the map function. In the map function, the key is fileB and the value is the contents of fileA. In outputFormat recordReader, I am appending all the values (all contents of fileA) to fileB using sequence file…
mahan07
0
votes
1 answer

Sort by chronological order in Hadoop

I'm a beginner with Hadoop. Looking through the Apache Hadoop documentation, I've only found that data can be sorted mainly in numeric or alphabetical order. Here is the link to the API's…
0
votes
0 answers

Buckets are not getting created in Hive

Looking at the scripts, Hive buckets are not created after partitioning the table. Step 1: create table orders_bucket9 (order_id int,order_date string,order_customer_id int,order_status string) partitioned by (order_month string) clustered by…
Kumar
0
votes
0 answers

How do we create partitions in Hive whose names contain spaces?

We want to create partitions in a Hive table, but the partition names contain spaces, so the partitions can't be created. We are currently using Java. We tried to escape the spaces, but every attempt throws an exception. URL…