Questions tagged [apache-tez]

The Apache Tez project aims to build an application framework that supports complex directed acyclic graphs (DAGs) of tasks for processing data. It is currently built atop Apache Hadoop YARN.

See Hive-on-Tez configuration properties.

192 questions

votes

1 answer

Data ingest issues hive: java.lang.OutOfMemoryError: unable to create new native thread

I'm a Hive newbie and am having an odyssey of problems getting a large (1 TB) HDFS file into a partitioned Hive managed table. Can you please help me get around this? I feel like I have a bad config somewhere because I'm not able to complete reducer…

asked Sep 17 '18 at 20:08

Zafar

1,897
15
33

votes

1 answer

Performance issues of small files on Hive

I was reading an article about how small files degrade the performance of Hive queries: https://community.hitachivantara.com/community/products-and-solutions/pentaho/blog/2017/11/07/working-with-small-files-in-hadoop-part-1 I understand the…

hadoop hive mapreduce hadoop2 apache-tez

asked Sep 11 '18 at 20:13

Gaurang Shah

11,764
9
74
137

votes

1 answer

Configuring large Hive import job

I am a newbie trying to take a large (1.25 TB uncompressed) HDFS file and put it into a Hive managed table. It is already on HDFS in CSV format (from Sqoop) with an arbitrary partition and I am putting it into a more organized format for…

hadoop hive hortonworks-data-platform apache-tez

asked Sep 10 '18 at 16:19

Zafar

1,897
15
33

votes

1 answer

To speed up hive process, how to adjust mapper and reducer number using tez

I tried processing a large dataset (about 150 GB; word labeling of sentences) using Tez, but it took far too long (1 week or more), so I tried to specify the number of mappers. Although I set mapred.map.tasks=2000, I can't stop mapper…

hadoop hive apache-tez

asked Aug 25 '18 at 03:15

Keito Tanki

votes

1 answer

Hive CLI and Beeline jdbc:hive2 behave differently in execution engine tez for insert million records?

When executing an insert into an empty table from a large table with millions of records (20 GB in size), the execution differs between Hive CLI and Beeline. Hive CLI: It creates two Tez jobs in YARN, maybe mapper and reducer, and completes in approx…

hadoop hive beeline apache-tez

asked Aug 08 '18 at 04:16

manj

votes

2 answers

TezTask vertex Failure on Amazon EMR over s3

I have created a Hive table on EMR which looks like this: create external table tests3( transaction_id String, order_id String, user_id String, amount String, subscriber_number String, product_type String, provider String, region String, status…

amazon-s3 hive mapreduce amazon-emr apache-tez

asked Dec 22 '17 at 06:14

user3459215

votes

1 answer

Tez - DAGAppMaster - java.lang.IllegalArgumentException: Invalid ContainerId

I am trying to launch a MapReduce job, but I get an error while executing the jobs in the shell or in Hive: hive> select count(*) from employee ; Query ID = mapr_20171107135114_a574713d-7d69-45e1-aa73-d4de07a3059b Total jobs = 1 Launching Job 1 out of 1…

hadoop hive hadoop-yarn tez apache-tez

asked Nov 08 '17 at 13:40

Ayman Anikad

votes

1 answer

Is there a way to add a constant value dynamically to all records returned in Hive?

I want to do the following query in Hive v1.2.1, where field_3 is queried from another table. select user_id, start_date, field_3 as stop_date from some_table; For every record returned, the value of field_3 is the same. The problem is that it is…

sql hive mapreduce hiveql apache-tez

asked Oct 20 '17 at 04:02

Jane Wayne

8,205
17
75
120

votes

2 answers

How to create small files while inserting data to hive ORC table using TEZ

I have tried a few options, but I have only seen config settings that merge small files into big files, like the ones below, not vice versa. I am looking to create files of size 150 KB. set hive.merge.tezfiles=true; set hive.merge.smallfiles.avgsize=128000; set…

hive orc apache-tez

asked Sep 14 '17 at 15:48

cheapcoder

votes

1 answer

Hive return no values if used with function

I have a strange problem with the Hive shell. I created a Hadoop system using the original Apache packages. I use Tez. To test the system, I loaded the NY taxi data into Hive without any problem. The data set has about 11 million lines. If I do select…

hadoop hive apache-spark-sql hiveql apache-tez

asked Feb 28 '17 at 11:53

midon

votes

0 answers

Java: Access job History server and application timeline server on kerberized hadoop cluster?

I have used the Kerberos REST template to access the job history server on a kerberized Hadoop cluster, but this code throws an exception: KerberosRestTemplate kerberosRestTemplate = new…

hadoop kerberos spring-security-kerberos apache-tez

asked Jan 19 '17 at 11:47

Jasvinder Singh

votes

1 answer

hive on tez throws java.lang.NoSuchMethodError

I have deployed Tez and configured Hive to run on it. A simple query fails in the reducer phase. It throws this error: Status: Running (Executing on YARN cluster with App id application_1469020577348_0014) VERTICES STATUS TOTAL COMPLETED…

hive apache-tez

asked Aug 03 '16 at 07:08

Immanuel Fredrick

votes

1 answer

When to use Hive engine MR and when to use TEZ?

Under what conditions is it preferable to use the Hive engine Tez over MR? What are the pros and cons of each?

hadoop mapreduce hive apache-tez tez

asked Jul 02 '16 at 00:38

Corey

1,845
1
12
23

votes

1 answer

Why occur holding of Tez Queue in HiveServer2?

I use Python and Thrift to run queries on the Tez engine in a separate queue (Fair Scheduler) through HiveServer2. Some queries stall at "Choosing a session from the defaultQueuePool", even though the queue is empty. ... 15/12/07 12:57:11 INFO ql.Driver:…

hadoop hive apache-tez

asked Dec 07 '15 at 10:47

Анатолий Панин

votes

0 answers

Failed Pig Script returns exit 0 while batch processing

A Pig script (Tez enabled) embedded in a shell wrapper returns exit code 0, that is, it exits gracefully, even if it throws an error. In a batch process, the task is supposed to error out and stop the process. But in this case all the downstream tasks…

hadoop apache-pig apache-tez

asked Nov 14 '15 at 21:13

Optimus Prime
