
I'm using Kylin, a data warehouse tool built on Hadoop, Hive and HBase. It ships with sample data for testing the system, and I was building the sample cube. The build is a multi-step process, and many of the steps are MapReduce jobs. The second step, Extract Fact Table Distinct Columns, is an MR job, and it fails without writing anything to the Hadoop logs. After digging deeper, I found one exception in logs/userlogs/application_1450941430146_0002/container_1450941430146_0002_01_000004/syslog:

2015-12-24 07:31:03,034 WARN [main] org.apache.hadoop.mapred.YarnChild:
Exception running child : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
    ... 8 more

2015-12-24 07:31:03,037 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task

My question is: should I copy all the dependency jars of the mapper class to every Hadoop node? The job succeeds if I restart the Kylin server and resume the cube build job. I see the same behavior again when I start over after cleaning up everything.

I am using a 5-node cluster; each node has 8 cores and 30GB. The NameNode runs on one node, and a DataNode runs on all 5 nodes. For HBase, HMaster and HQuorumPeer run on the same node as the NameNode, and an HRegionServer runs on every node. Hive and Kylin are deployed on the master node.

Version information:

Ubuntu 12.04 (64 bit)
Hadoop 2.7.1
Hbase  0.98.16
Hive   0.14.0
Kylin  1.1.1
Aryaveer
  • *"without writing anything in hadoop logs"* -- you mean that the command `yarn logs -applicationId application_1450941430146_0002` gives no result? >> see http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/ section "Log-aggregation in YARN" – Samson Scharfrichter Dec 24 '15 at 10:11
  • @SamsonScharfrichter Output of the command is --> /tmp/logs/hduser/logs/application_1450941430146_0002 does not exist. Log aggregation has not completed or is not enabled. – Aryaveer Dec 24 '15 at 10:14
  • *"5 node cluster ... resume job ... succeeds ... fails again"* -- did you track on which nodes the job throws ClassNotFoundException? In other words, do you have a specific node with specific configuration issues? – Samson Scharfrichter Dec 24 '15 at 10:19
  • Look at this JIRA to see if it applies to your issue: https://issues.apache.org/jira/browse/KYLIN-1021 – Samson Scharfrichter Dec 24 '15 at 10:26
  • When the job fails, I see the exception on every node except the master node. I have the same configuration on all nodes. – Aryaveer Dec 24 '15 at 11:08
  • My issue is similar to issues.apache.org/jira/browse/KYLIN-1021. Does that mean I have to wait until release 1.2? – Aryaveer Dec 24 '15 at 11:09
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/98905/discussion-between-aryaveer-and-samson-scharfrichter). – Aryaveer Dec 24 '15 at 11:12

2 Answers


The issue here is that Kylin assumes the same Hive jars are present on all Hadoop nodes. When a node is missing the Hive jars (or has them in a different location), you get the ClassNotFoundException on HCatInputFormat.
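
To see which nodes are affected, one option is to check each node's jar directories for the missing class. The following is a minimal Python sketch, not part of Kylin or Hadoop; the function name `jars_containing` is made up for illustration, and it relies only on the fact that a jar is a zip archive in which class `a.b.C` is stored as the entry `a/b/C.class`.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, lib_dir):
    """Return the jars under lib_dir that contain the given fully qualified class."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(lib_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    matches.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return matches
```

Calling `jars_containing("org.apache.hive.hcatalog.mapreduce.HCatInputFormat", lib_dir)` on each node, with `lib_dir` set to wherever that node keeps its Hive or Hadoop jars, returns an empty list on nodes where the class is missing.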

By the way, you should be able to get a clear error message from the YARN job console. This is a known issue.

Deploying Hive to all cluster nodes will certainly fix the problem, as you have tried.

Another (cleaner) workaround is to manually configure Kylin to submit the Hive jars as additional job dependencies. See https://issues.apache.org/jira/browse/KYLIN-1021
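
The Kylin-specific setting is described in the JIRA and is not reproduced here. As a generic illustration of shipping jars with a job, Hadoop's `-libjars` option takes a comma-separated list of jar paths for jobs that parse Hadoop's generic options (for example via ToolRunner). The Python sketch below only assembles such a list; the helper name `libjars_value` and the default `hive-*.jar` pattern are assumptions for illustration.

```python
from pathlib import Path


def libjars_value(lib_dir, pattern="hive-*.jar"):
    """Build a comma-separated jar list in the form Hadoop's -libjars option expects."""
    # Sort so the result is stable from run to run.
    jars = sorted(str(p) for p in Path(lib_dir).glob(pattern))
    return ",".join(jars)
```

The returned string could then be passed after `-libjars` on a `hadoop jar` command line, assuming the job's main class handles generic options.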

Finally, there is also an open JIRA suggesting that Kylin should submit the Hive jars by default. See https://issues.apache.org/jira/browse/KYLIN-1082

Li Yang
  • Hive is a client, and it is deployed on a single node. If the job depends on jars that ship with Hive, then either this should be covered in the Kylin installation document or the job must submit these jars itself, as the JIRA ticket points out. – Aryaveer Jan 04 '16 at 11:41

What Li Yang suggested is correct. Here is my sequence of experiments. Kylin ships a job jar, specified by the kylin.job.jar property in kylin.properties. First, I built a fat jar that included the missing dependency, pointed kylin.job.jar at it, and ran the job again. The missing class was now shipped with the MR job, but the newly added dependencies had dependencies of their own that were not available on all nodes, so the job failed again. In the next iteration I added those missing dependencies and tried again, with the same result: the newly added dependencies needed still more dependencies that were not available on all nodes. Finally, I extracted all the classes in $HIVE_HOME/lib/hive-*.jar and created a new jar from them. This time it worked. The resulting jar was over 20MB. Since these jar files are used every time I run a cube job, I copied all of them to every node under $HADOOP_HOME/share/hadoop/common/hive-metastore-0.14.0.jar.

I think kylin-job-1.1.1-incubating.jar should be created to include all its dependencies.
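
The final experiment above, combining the contents of several jars into one, can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact procedure used: the function name `merge_jars` is hypothetical, the first copy of a duplicated entry wins, and signature files under META-INF are dropped because they would no longer match the merged contents.

```python
import zipfile


def merge_jars(jar_paths, output_jar):
    """Copy every entry of the input jars into one jar; return the number of entries written."""
    seen = set()
    with zipfile.ZipFile(output_jar, "w", zipfile.ZIP_DEFLATED) as out:
        for jar in jar_paths:
            with zipfile.ZipFile(jar) as src:
                for info in src.infolist():
                    name = info.filename
                    # Skip directory entries and anything already copied.
                    if name.endswith("/") or name in seen:
                        continue
                    # Skip jar signature files from the input jars.
                    if name.startswith("META-INF/") and name.upper().endswith(
                        (".SF", ".RSA", ".DSA")
                    ):
                        continue
                    seen.add(name)
                    out.writestr(name, src.read(name))
    return len(seen)
```

For example, `merge_jars(sorted(glob.glob(os.path.join(hive_home, "lib", "hive-*.jar"))), "hive-all.jar")` would produce a single jar from the Hive jars, where `hive_home` and the output name `hive-all.jar` are placeholders.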

Aryaveer