Highest Voted 'hadoop-partitioning' Questions

1

vote

0 answers

How to find list of all Hive tables in a database that are missing compute stats?

As part of my current project, we deployed 100+ Hive tables. I am trying to find a list of all Hive tables in a particular database that are missing compute stats. For an individual table, I used SHOW PARTITIONS table_name. Is there any way I can find…

asked Aug 06 '19 at 14:03

vvazza

421
7
21

1

vote

1 answer

How to merge existing hourly partitions to daily partition in hive

My requirement is to merge existing hourly partitions into daily partitions for all days. My partition column looks like: 2019_06_22_00, 2019_06_22_01, 2019_06_22_02, 2019_06_22_03..., 2019_06_22_23 => 2019_06_22; 2019_06_23_00, 2019_06_23_01,…

merge hive partitioning hadoop-partitioning hive-partitions

asked Jun 25 '19 at 03:54

bala chandar

99
6
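
A minimal sketch of the key mapping this question describes, assuming every hourly partition value has the form YYYY_MM_DD_HH as in the excerpt. The function names `daily_key` and `group_by_day` are hypothetical, and the sketch only plans which hourly partitions fold into which daily partition; it does not move any data in Hive.

```python
from collections import defaultdict


def daily_key(hourly_key: str) -> str:
    """Drop the trailing _HH from a key such as 2019_06_22_03."""
    parts = hourly_key.split("_")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not an hourly partition key: {hourly_key!r}")
    return "_".join(parts[:3])


def group_by_day(hourly_keys):
    """Map each daily key to the hourly keys that would be merged into it."""
    groups = defaultdict(list)
    for key in hourly_keys:
        groups[daily_key(key)].append(key)
    return dict(groups)


if __name__ == "__main__":
    plan = group_by_day(["2019_06_22_00", "2019_06_22_01", "2019_06_23_00"])
    for day, hours in plan.items():
        print(day, "<=", hours)
```

Each entry of the resulting plan could then drive one insert into the daily partition, followed by dropping the hourly partitions it replaced.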

1

vote

1 answer

unable to access hive table in impala

I am unable to access a Hive table in Impala that has a partition created on a date column. The data is inserted using the dynamic partition option. The date data type is not supported in Impala. What should I do to access this table in Impala? Is there…

hive impala hadoop-partitioning hive-partitions

asked May 19 '19 at 19:26

Umer

25
5

1

vote

0 answers

Hive Partition By dynamic value in s3 file name

Assuming an S3 location with required data is of the form: s3://stack-overflow-example/v1/ where each file title in v1/ is of the form francesco_{YYY_DD_MM_HH}_totti.csv and each csv file contains a unix timestamp as a column in each row. Is it…

hadoop hive partitioning hadoop-partitioning

asked Mar 01 '19 at 15:47

pippa dupree

155
1
10

1

vote

0 answers

Generate unique id in MapReduce

I'm comparing two files A & B, extracting the columns from A which don't exist in B, and adding them to B. When a new record is added to B, it should be given a unique id. I'm looking for logic where I can get the total count from B, which is the max …

hadoop mapreduce hadoop2 hadoop-partitioning

asked Dec 03 '18 at 21:12

user2316771

111
1
1
11

1

vote

0 answers

No Hash Partitioning when using repartition in spark

The spark doc says that .repartition() returns a new DataFrame, which is by default Hash-Partitioned. But, in the example I am running, as shown below, that's not the case. rdd=sc.parallelize([('a',22),('b',1),('c',4),('b',1),('d',2), …

python apache-spark dataframe rdd hadoop-partitioning

asked Nov 21 '18 at 15:40

cph_sto

7,189
12
42
78

1

vote

1 answer

How do you add partitions to a partitioned table in Presto running in Amazon EMR?

I'm running Presto 0.212 in EMR 5.19.0, because AWS Athena doesn't support the user defined functions that Presto supports. I'm using EMR configured to use the glue schema. I have pre-existing Parquet files that already exist in the correct…

hive amazon-emr parquet presto hadoop-partitioning

asked Nov 13 '18 at 18:36

Eddie

53,828
22
125
145

1

vote

1 answer

How does Hive partition works

Let's assume the table below has the schema ID, NAME, Country, and my partition key is country. If my query is like: select * from table where id between 155555756 and 10000000000; the partition will not work in that case, right? On a simple note…

hadoop hive hadoop-partitioning

asked Oct 28 '18 at 16:51

Varshini

69
10
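
A toy model of the situation in this question, assuming a table stored as one directory per country value. The data and the `scan` helper are hypothetical and are not Hive's implementation; the sketch only illustrates that a filter on the non-partition column `id` alone cannot skip any partition, while a filter on the partition key can.

```python
# Hypothetical table laid out as one partition per country value.
PARTITIONS = {
    "IN": [(1, "asha"), (4, "ravi")],
    "US": [(2, "bob")],
    "DE": [(3, "eva")],
}


def scan(partitions, country=None, id_range=None):
    """Return (matching_rows, number_of_partitions_read)."""
    # Without a predicate on the partition key, every partition is a candidate.
    selected = [country] if country is not None else list(partitions)
    rows, partitions_read = [], 0
    for name in selected:
        if name not in partitions:
            continue
        partitions_read += 1
        for row in partitions[name]:
            if id_range is None or id_range[0] <= row[0] <= id_range[1]:
                rows.append(row)
    return rows, partitions_read


if __name__ == "__main__":
    print(scan(PARTITIONS, id_range=(2, 3)))                # reads all partitions
    print(scan(PARTITIONS, country="US", id_range=(2, 3)))  # reads one partition
```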

1

vote

1 answer

Received the following error while running a hive query. What could be the possible reasons for it?

java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1538324912862_7122_1_00, diagnostics=[Task…

hadoop hive hiveql hadoop-partitioning

asked Oct 19 '18 at 12:19

Ankit

103
2
12

1

vote

1 answer

Spark sortMergeJoin running continuously

I am joining two dataframes, but the join does not complete and runs for many hours. One task keeps running continuously although 199 tasks completed within seconds. I tried repartitioning and swapping the right and left dataframes as well.…

apache-spark apache-spark-sql query-performance hadoop-partitioning

asked Aug 11 '18 at 17:19

Varun

33
1
8

1

vote

1 answer

Alternative to the default HashPartitioner provided with Hadoop

I have a Hadoop MapReduce program that distributes keys unevenly. Some reducers end up with two keys, some with one key, and some with none. How do I force Hadoop to assign each partition with a certain key to a separate reducer? I have nine…

hadoop hash hadoop-partitioning

asked Apr 20 '18 at 03:02

zaranaid

65
1
13
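
A language-neutral sketch of the idea behind this question, written in Python rather than as a Hadoop `Partitioner` subclass. It assumes the full set of keys is known up front; the key names are hypothetical, and `zlib.crc32` is only a deterministic stand-in for Java's `hashCode()`. Hash-mod-N placement can put two keys on the same reducer, whereas an explicit key-to-index table gives each known key its own reducer when there are at least as many reducers as keys.

```python
import zlib


def hash_partition(key: str, num_reducers: int) -> int:
    """Hash-style placement: (hash & max_int) % num_reducers."""
    return (zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF) % num_reducers


def make_explicit_partitioner(keys):
    """Build a partition function that assigns each known key a fixed index."""
    index = {key: i for i, key in enumerate(sorted(set(keys)))}

    def partition(key: str, num_reducers: int) -> int:
        # Unknown keys raise KeyError here; a real job would need a fallback.
        return index[key] % num_reducers

    return partition


if __name__ == "__main__":
    keys = [f"key{i}" for i in range(9)]  # hypothetical nine keys
    explicit = make_explicit_partitioner(keys)
    for k in keys:
        print(k, "hash ->", hash_partition(k, 9), "explicit ->", explicit(k, 9))
```

In a real job the same lookup would live inside the `getPartition` method of a custom partitioner class registered on the job.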

1

vote

1 answer

Inserting Partitioned Data into External Table in Hive

I need a few clarifications regarding inserting data into an external table. I have created an external Parquet table, partitioned by week and pointing to a Hadoop location; after this I moved the data (a .csv file) to that location. My doubt…

hadoop hive hadoop-partitioning external-tables

asked Feb 01 '18 at 06:05

av abhishiek

647
2
11
26

1

vote

1 answer

Hadoop-Installation-Multinode

Hi all, I am trying to set up a multinode Hadoop installation. Everything works fine, but my YARN NodeManager is not working. When I looked at the log file for the YARN NodeManager, I got the following…

hadoop hadoop2 hadoop-streaming hadoop-partitioning

asked Sep 06 '17 at 00:47

buildengineer

11
3

1

vote

1 answer

Name clash of getPartition of type Partitioner has the same erasure of type main class in MapReduce, Hadoop

I was trying to write code that controls which reducer the input goes to according to the character length, by implementing Partitioner with the default Mapper and Reducer, but I get the following error. I will be thankful…

java hadoop mapreduce hadoop-yarn hadoop-partitioning

asked Aug 30 '17 at 05:49

user8331236

1

vote

2 answers

How data is split into part files in sqoop

I have a doubt about how the data is partitioned into part files if the data is skewed. If possible, please help me clarify this. Let's say this is my departments table with department_id as primary key. mysql> select * from departments; 2 Fitness 3…

hadoop sqoop hadoop-partitioning

asked Jul 14 '17 at 10:17

iamteja

11
1
5
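
A simplified model of range-based splitting on a numeric key, which is the behaviour this question asks about. It assumes the import cuts the interval between the smallest and largest key into equal-width ranges, one per mapper; the helper names and the sample ids are hypothetical, and this is not Sqoop's actual code. With skewed ids, the ranges are equally wide but the part files are far from equally full.

```python
def split_ranges(lo: int, hi: int, num_mappers: int):
    """Cut the inclusive range [lo, hi] into num_mappers contiguous ranges."""
    span = hi - lo + 1
    bounds = [lo + (i * span) // num_mappers for i in range(num_mappers)]
    bounds.append(hi + 1)
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_mappers)]


def rows_per_split(ids, ranges):
    """Count how many ids land in each range, i.e. rows per part file."""
    return [sum(1 for x in ids if a <= x <= b) for a, b in ranges]


if __name__ == "__main__":
    skewed_ids = [1, 2, 3, 4, 5, 6, 100]  # hypothetical skewed primary keys
    ranges = split_ranges(min(skewed_ids), max(skewed_ids), 4)
    print(ranges)
    print(rows_per_split(skewed_ids, ranges))
```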
