Questions tagged [hive-partitions]

To be used for questions regarding partitions in hive.

Partitioning is a way of dividing a table into related parts based on the values of partitioned columns such as date, city, and department. Using partition, it is easy to query a portion of the data.

Partitions are essentially horizontal slices of data which allow larger sets of data to be separated into more manageable chunks. In Hive, partitioning is supported for both managed and external tables in the table definition as seen below.

144 questions

votes

1 answer

How do I partition a table by all values?

I have an external table, now I want to add partitions to it. I have 224 unique city id's and I want to just write alter table my_table add partition (cityid) location /path; but hive complains, saying that I don't provide anything for the city id…

asked Dec 27 '18 at 13:39

pavel_orekhov

1,657
2
15
37

votes

1 answer

Hive - fetch only the latest partition of one or more hive tables

I have three partitioned tables (Yearly Partitions) in Hive and all the tables have multiple partitions. I am joining all these three tables as part of my requirement. Now I want to run this sql only for the latest partition and not for the ones…

hive hiveql hive-partitions

asked Oct 31 '18 at 12:51

JKC

2,498
6
30
56

votes

1 answer

unable to delete hive table partition contains special character Equal sign(=)

inserted data in Hive table with partition column(CL) value as ('CL=18') which stored as /db/tbname/CL=CL%3D18 (invalid partition contains url encoded special character for equal sign). As per hortonworks community , it was mentioned hive stored…

hive special-characters hive-partitions

asked Oct 14 '18 at 17:40

Girish501

votes

2 answers

hive add partition statement ignores leading zero

I've folder on hdfs /user/test/year=2016/month=04/dt=25/000000_0 Need to add this above partition path to a test table. command : ALTER TABLE test ADD IF NOT EXISTS PARTITION (year=2016,month=04,dt=25) But this add partition command is ignoring the…

hadoop hive hadoop-yarn hive-partitions hiveddl

asked Jun 24 '16 at 15:47

pramav

vote

1 answer

Hive insert into partitioned table with colums list from select

I want to insert into a partitioned Hive table tb_1(a, b, c, d, p1) only columns (a, b) from a select statement. Ex: insert into table tb_1 partition (p1) (a, b) select a, b from tb_2; How can I achieve this?

hive hive-partitions

asked Jul 08 '22 at 01:30

david vallet

vote

1 answer

Databricks / Spark storage mechanism for Delta Tables, Delta Logs, Partitions etc

I am trying to understand how data is stored and managed in the DataBricks environment. I have a fairly decent understanding of what is going on under the hood but have seen some conflicting information online, therefore would like to get a detailed…

apache-spark pyspark databricks delta-lake hive-partitions

asked May 20 '22 at 13:44

guacacholay

vote

1 answer

Hive - incomplete rows in select from managed partitioned table

I need to copy data from a CSV file to a managed partitioned table in Hive. CSV file rows are: id,nome,cognome,ruolo 16,Mike,Maignan,Portiere 23,Fikayo,Tomori,Centrale 24,Simon,Kjaer,Centrale 19,Theo,Hernandez,Terzino ... ------- I…

hive missing-data managed hive-partitions hiveddl

asked Dec 13 '21 at 10:49

Moreno

vote

1 answer

How to retain last N partitions for a hive external table?

I need to retain say last 7 partitions and data of a given hive external table. This can be either done via a shell script or a hive hql script. The table is partitioned by intgestion_date=YYYY-MM-DD what would be the best way to find the cutoff…

bash shell hive hive-partitions hiveddl

asked Nov 30 '21 at 15:28

Gaurav Bhatnagar

vote

1 answer

Filtering is supported only on partition keys of type string Hive

We recently started facing issues with spark 2.4.4 with hive 1.2.1 version. when we are trying to read data from a table which is partition by string type columns spark.sql("select count(*) from table where #conditions"); Caused by:…

apache-spark hive apache-spark-sql hive-partitions

asked Jul 13 '21 at 17:35

kavetiraviteja

2,058
1
15
35

vote

1 answer

Data Loaded wrongly into Hive Partitioned table after adding a new column using ALTER

I already have a Hive partitioned table. I needed to add a new column to the table, so i used ALTER to add the column like below. ALTER TABLE TABLE1 ADD COLUMNS(COLUMN6 STRING); I have my final table load query like this: INSERT OVERWRITE table…

sql hive hiveql hive-partitions hiveddl

asked Jun 26 '21 at 14:43

jahan

vote

0 answers

Hive Query is going for full table scan when filtering on the partitions from the results of subquery/joins

Why query1 runs for longer time compared to Query2. Hive Source Table Details Columns - 166 Partitions columns - 2 columns (all are int datatypes) Number of Partitions - 3211 partitions Total Records - 19374461064 File Format - ORC --…

hive hiveql hive-partitions

asked Jun 15 '21 at 07:04

Vijiy

1,187
6
21

vote

1 answer

Hive load multiple partitioned HDFS file to table

I have some twice-partitioned files in HDFS with the following…

hadoop hive hdfs hive-partitions hiveddl

asked Jun 07 '21 at 23:04

hunterm

vote

1 answer

how to make max function in hive query to ignore _HIVE_DEFAULT_PARTITION__

I have a view which uses max to show the latest partition (which is of format 2021-01, 2021-02, 2021-03, 2021-04). The hive table has _HIVE_DEFAULT_PARTITION__ too. When we run the query in Impala, max on partitions gives the correct value of…

hadoop hive max impala hive-partitions

asked May 13 '21 at 02:19

Neal Pressley

vote

1 answer

Will HIVE do a full table query with both partition conditions and not partition conditions?

I have a hive table partitioned by one date column name datetime If I do a query like select * from table where datetime = "2021-05-01" and id in (1,2) with extra and id in (1,2) condition, will hive do a full table search? Is it possible to…

hadoop hive hiveql explain hive-partitions

asked May 10 '21 at 06:50

theotheo

vote

1 answer

Hive Update partition vs MSCK Repair

I have a table with thousands of partition. I want to change all the partition location to diff cluster. Ex: for table test_table and partition day=2021041600 Old location: hdfs://cluster1/dir1/dir2/day=2021041600/\ New location:…

hive bigdata partition hive-partitions hiveddl

asked Apr 16 '21 at 06:42

Sumit Singh

15,743
6
59
89

Prev 1 2 3

…

9 10 Next