EDIT - FYI
This question is also answered on the following pages:
- Oozie Hive action hangs and heart beats forever
- Oozie workflow hive action stuck in RUNNING
- Error on running multiple Workflow in OOZIE-4.1.0
I think Oozie is dumb!
Original Question
I'm using AWS EMR, with
- emr-5.4.0
- Hive 2.1.1
- Tez 0.8.4
- Oozie 4.3.0
I created the following HiveQL script:
insert.sql
DROP TABLE IF EXISTS simple;
CREATE TABLE simple (
name STRING
);
INSERT INTO simple(
name
)
VALUES (
"Oozie!"
);
SELECT * FROM simple;
Then I ran the following command:
From command line
$ hive -f insert.sql
I received the following output:
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: false
OK
Time taken: 1.6 seconds
OK
Time taken: 0.333 seconds
Query ID = hadoop_20170404023311_0d18d091-8916-4e58-a7e5-dbc081d5f8ab
Total jobs = 1
Launching Job 1 out of 1
Waiting for Tez session and AM to be ready...
Status: Running (Executing on YARN cluster with App id application_1491267059312_0040)
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 5.89 s
----------------------------------------------------------------------------------------------
Loading data to table default.simple
OK
Time taken: 16.207 seconds
OK
Oozie!
Time taken: 0.092 seconds, Fetched: 1 row(s)
From the command line, this works. However, the process keeps running afterwards. What is the cause? Any suggestions would be appreciated.
From Hue with Oozie
I realized that the raw Hive query works (although it is very slow and the process remains afterwards). Queries submitted through Hue + Oozie, however, appear to hang (progress stops at 95%).
$ yarn application -list
17/04/04 03:03:54 INFO impl.TimelineClientImpl: Timeline service address: http://ip-172-38-21-67.ap-northeast-1.compute.internal:8188/ws/v1/timeline/
17/04/04 03:03:54 INFO client.RMProxy: Connecting to ResourceManager at ip-172-38-21-67.ap-northeast-1.compute.internal/172.38.21.67:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):2
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1491267059312_0039 HIVE-a3ea64b2-105f-4b24-b89d-f0359eefbd3e TEZ hue default ACCEPTED UNDEFINED 0% N/A
application_1491267059312_0038 oozie:launcher:T=hive2:W=Create_sdata_item_master:A=hive-3a99:ID=0000016-170404005550013-oozie-oozi-W MAPREDUCE hue default RUNNING UNDEFINED 95% http://ip-172-38-21-43.ap-northeast-1.compute.internal:33037
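To make the listing above easier to read, here is a minimal sketch that reduces it to one "state, application id" pair per row. The function name list_app_states is hypothetical, and the sketch assumes that no field in the listing contains whitespace (true for the two rows above), so that the state is always the sixth field.

```shell
# Hypothetical helper: read a `yarn application -list` listing on stdin and
# print "<state> <application-id>" for every application row.
# Assumption: no field (e.g. the application name) contains whitespace,
# so the State column is always field 6.
list_app_states() {
  # Only rows whose first field starts with "application_" are applications;
  # log lines and the header row are skipped.
  awk '$1 ~ /^application_/ { print $6, $1 }'
}

# Usage on a live cluster (not run here):
#   yarn application -list 2>/dev/null | list_app_states
```

Applied to the listing above, this shows the Oozie launcher in RUNNING while the Tez application it is presumably waiting for is still in ACCEPTED.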
I also tried yarn logs -applicationId <id>, but there is no directory for the YARN logs.
$ yarn logs -applicationId application_1491267059312_0038
$ sudo ls /var/log/hadoop-yarn/apps/hadoop/
ls: cannot access /var/log/hadoop-yarn/apps/hadoop/: No such file or directory