Questions tagged [hive-partitions]

Use this tag for questions about partitions in Hive.

Partitioning is a way of dividing a table into related parts based on the values of partition columns such as date, city, and department. With partitions, a query that filters on those columns can read only the relevant portion of the data.

Partitions are essentially horizontal slices of data that allow large data sets to be separated into more manageable chunks. In Hive, partitioning is supported for both managed and external tables and is declared in the table definition.
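As a minimal sketch of what such a table definition might look like, the HiveQL below declares partition columns with a PARTITIONED BY clause for both a managed and an external table. The table names (sales_managed, sales_external), the columns, the storage formats, and the path are hypothetical and chosen only for illustration.

```sql
-- Managed table: Hive owns the data; each distinct (dt, city) pair
-- is stored in its own subdirectory under the table location.
CREATE TABLE sales_managed (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (dt STRING, city STRING)
STORED AS ORC;

-- External table: the data lives at a user-supplied location
-- (the path below is a made-up example).
CREATE EXTERNAL TABLE sales_external (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (dt STRING, city STRING)
STORED AS PARQUET
LOCATION '/data/sales_external';

-- Register one partition explicitly, then query only that slice.
ALTER TABLE sales_external
  ADD IF NOT EXISTS PARTITION (dt = '2020-01-01', city = 'Paris');

SELECT order_id, amount
FROM sales_external
WHERE dt = '2020-01-01' AND city = 'Paris';
```

Because the filter in the final query is on the partition columns themselves, Hive can typically restrict the scan to the matching partition directory instead of reading the whole table.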

144 questions
0 votes, 1 answer

Does a Hive partition column still have a partitioning effect after the column is converted?

For example, I have a table partitioned by column ym (202001). Now there is a SQL query that converts ym to another time format: select * from table where from_unixtime(unix_timestamp(`table`.ym , 'yyyyMM')) >= '2020-01-01 00:00:00') AND…
DennisLi
0 votes, 1 answer

hive partition by time

I want to run alter table dos_sourcedata add partition (data = to_date (current_timestamp ())); in Hive at a specific time every day, but it always fails.
0 votes, 0 answers

Drop partition from Hive table using date_sub dynamically

I need to drop a partition from a Hive table dynamically, and this is how I am trying to do it: set hivevar:range=select date_sub(date '2019-10-21',19); hive> ${hivevar:range}; OK 2019-10-02 Time taken: 0.417 seconds, Fetched: 1 row(s) But when…
Alan
0 votes, 0 answers

Hive Partition Table with Date Datatype via Spark

I have a scenario and would like to get an expert opinion on it. I have to load a Hive table in partitions from a relational DB via Spark (Python). I cannot create the Hive table as I am not sure how many columns there are in the source and they…
Saim
0 votes, 1 answer

Add new partition to already partitioned hive table

I have a partitioned table Student which already has one partition column, dept. I need to add a new partition column, gender. Is it possible to add this new partition column to an already partitioned Hive table? The table data does not have gender…
techie
0 votes, 1 answer

Create hive partition based on time zone

I'm trying to materialize a Hive table based on files that are stored as Parquet in GCS, with paths like gs://abc/dt=02-02-2019/hr=02 (physical partitions based on UTC). Now I want to create two Hive tables where the logical partition is based on time zone,…
Sandeep
0 votes, 2 answers

How can we rename multiple partitions in Hive?

If I have two partition columns, e.g. school name and class, how can I rename a specific class partition that is present inside all school partitions? So, /school=ABC/class=1/ /school=PQR/class=1/ . . . . class = 1 should be transformed to class =…
rishabh
0 votes, 1 answer

How to use a UDF value or column value in hive insert partition statement, rather than constant value

I have a data table created as below: CREATE EXTERNAL TABLE `DailyData`( `entity_id` string, `payload` string) PARTITIONED BY (`date_of_data` string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0010' STORED AS INPUTFORMAT …
DeepNightTwo
0 votes, 2 answers

Spark: can I manually specify the number of partitions when calling textFile?

Spark automatically decides the number of partitions based on the size of the input file. I have two questions: Can I specify the number of partitions rather than letting Spark decide how many partitions to use? How bad is the shuffle when doing the…
Brian Z
0 votes, 0 answers

Multi Table Insert into Single table in Hive

I have a Hive table partitioned on column 'part'. The table has two partition values, part='good' and part='bad'. I need to move a record from the 'bad' partition into the 'good' partition and overwrite the 'bad' partition to remove that moved…
Adiga
0 votes, 1 answer

How to create/copy data to partitions in hive manually

I am working on a Hive solution in which I need to append some values to high-volume files. Instead of appending them, I am trying a map-reduce method. The approach is below. Table creation: create external table demo_project_data(data…
Karthi
0 votes, 1 answer

Does dropping a partition from a Hive table drop its subpartitions?

I have an external Hive table which has partitions like year = 2017 and year = 2018, and inside each of them I have partitions for each month. My questions are: If I drop partition year = 2017, will it drop all the month…
Amol.Shaligram
0 votes, 0 answers

Hive’s dynamic partitioning failing to write final files

I’m trying to load data from a table with one partition column into a new table that has two partition columns, the newer partition column being a regular column in the first table. For example, the create table statements (simplified and…
dl8
0 votes, 2 answers

set partition location in 'Insert Overwrite' dynamic partition query in hive

I've created a Hive table with its base location pointing to an AWS S3 location. However, I want to create a partition on the HDFS cluster using an 'Insert Overwrite' query. Steps below: -- Create intermediate table create table test_int_ash ( loc…
Ash
0 votes, 1 answer

Sampling results with conditions in hive sql

I have a table that doesn't have a primary key and is partitioned by date, with columns like this: 1. user_id 2. device 3. region 4. datetime 5. and other columns. It contains user-generated events from a website game, which trigger every second. I want…
Andrew O