Questions tagged [apache-tez]

The Apache Tez project is aimed at building an application framework which allows for a complex directed-acyclic-graph of tasks for processing data. It is currently built atop Apache Hadoop YARN.

See Hive-on-Tez configuration properties.

192 questions
0
votes
1 answer

Data ingest issues hive: java.lang.OutOfMemoryError: unable to create new native thread

I'm a hive newbie and having an odyssey of problems getting a large (1TB) HDFS file into a partitioned Hive managed table. Can you please help me get around this? I feel like I have a bad config somewhere because I'm not able to complete reducer…
Zafar
  • 1,897
  • 15
  • 33
0
votes
1 answer

Performance issues of small files on Hive

I was reading an article regarding how small files degrade the performance of the hive query. https://community.hitachivantara.com/community/products-and-solutions/pentaho/blog/2017/11/07/working-with-small-files-in-hadoop-part-1 I understand the…
Gaurang Shah
  • 11,764
  • 9
  • 74
  • 137
0
votes
1 answer

Configuring large Hive import job

I am a newbie and trying to take a large (1.25 TB uncompressed) hdfs file and put it into a Hive managed table. It is already on HDFS in csv format (from sqoop) with an arbitrary partition and I am putting it into a more organized format for…
Zafar
  • 1,897
  • 15
  • 33
0
votes
1 answer

To speed up hive process, how to adjust mapper and reducer number using tez

I tried processing a large data set (about 150 GB; word labeling of sentences) using Tez, but it took a very long time (1 week or more), so I tried to specify the number of mappers. I set mapred.map.tasks=2000, but I can't stop mapper…
0
votes
1 answer

Hive CLI and Beeline jdbc:hive2 behave differently in execution engine tez for insert million records?

When executing an insert into an empty table from a large table with millions of records (20 GB in size), the execution differs between Hive CLI and Beeline. Hive CLI: it creates two Tez jobs in YARN, maybe mapper and reducer, and completes in approx…
manj
  • 11
  • 4
0
votes
2 answers

TezTask vertex Failure on Amazon EMR over s3

I have created a Hive table on EMR which looks like: create external table tests3( transaction_id String, order_id String, user_id String, amount String, subscriber_number String, product_type String, provider String, region String, status…
0
votes
1 answer

Tez - DAGAppMaster - java.lang.IllegalArgumentException: Invalid ContainerId

I try to launch a MapReduce job, but I get an error while executing the jobs in the shell or in Hive: hive> select count(*) from employee ; Query ID = mapr_20171107135114_a574713d-7d69-45e1-aa73-d4de07a3059b Total jobs = 1 Launching Job 1 out of 1…
0
votes
1 answer

Is there a way to add a constant value dynamically to all records returned in Hive?

I want to do the following query in Hive v1.2.1, where field_3 is queried from another table. select user_id, start_date, field_3 as stop_date from some_table; For every record returned, the value of field_3 is the same. The problem is that it is…
Jane Wayne
  • 8,205
  • 17
  • 75
  • 120
0
votes
2 answers

How to create small files while inserting data to hive ORC table using TEZ

I have tried a few options, but I have only seen config settings to merge small files into big files, like the ones below, and not vice versa. I am looking to create files of size 150 KB. set hive.merge.tezfiles=true; set hive.merge.smallfiles.avgsize=128000; set…
cheapcoder
  • 183
  • 1
  • 3
  • 12
0
votes
1 answer

Hive return no values if used with function

I have a strange problem with the Hive shell. I created a Hadoop system using the original Apache packages. I use Tez. To test the system I loaded the NY taxi data into Hive without any problem. The data set has about 11 million lines. If I do select…
midon
  • 1
  • 2
0
votes
0 answers

Java: Access job History server and application timeline server on kerberized hadoop cluster?

I have used the Kerberos REST template to access the job history server on a kerberized Hadoop cluster, but this code throws an exception: KerberosRestTemplate kerberosRestTemplate = new…
0
votes
1 answer

hive on tez throws java.lang.NoSuchMethodError

I have deployed Tez and configured Hive to work on Tez. A simple query fails in the reducer phase. It throws this error: Status: Running (Executing on YARN cluster with App id application_1469020577348_0014) VERTICES STATUS TOTAL COMPLETED…
Immanuel Fredrick
  • 508
  • 3
  • 9
  • 20
0
votes
1 answer

When to use Hive engine MR and when to use TEZ?

Under what conditions is it preferable to use the Hive engine TEZ over MR? What are the pros and cons of each?
Corey
  • 1,845
  • 1
  • 12
  • 23
0
votes
1 answer

Why are queries held in the Tez queue in HiveServer2?

I use Python and Thrift to run queries on the Tez engine in a separate queue (Fair Scheduler) through HiveServer2. Some queries stop at "Choosing a session from the defaultQueuePool", even though the queue is empty. ... 15/12/07 12:57:11 INFO ql.Driver:…
0
votes
0 answers

Failed Pig Script returns exit 0 while batch processing

A Pig script (Tez enabled) embedded in a shell wrapper returns exit code 0, or exits gracefully, even if it throws an error. In a batch process, the task is supposed to error out and stop the process. But in this case all the downstream tasks…