I'm trying to use a Hive metastore with shark-0.9.1 (hive-0.11.0). For now, I'd be happy to get it running on a single node, so no slave nodes involved. When running Hive, I can create tables and execute SQL statements such as
hive> SELECT MAX(rating) FROM data;
When using Shark, pretty much the only thing that works is
shark> show tables;
which shows the tables previously created with hive.
Any other statement, like the SELECT above, fails with an error:
Exception in thread "main" java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$CompleteRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
(followed by a long stack trace of "at java.lang..." lines).
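Judging from the getUnknownFields() signature in the trace, I suspect this could be a protobuf version clash: Hadoop 2.x is built against protobuf 2.5, where getUnknownFields() became final, while older Shark/Hive jars may bundle protobuf 2.4.x. To illustrate what I mean, something like the following would reveal two protobuf generations on one classpath (the directory and jar names here are dummies I create for the sketch; in reality I'd point find at my Shark lib and Hadoop directories):

```shell
# Dummy setup standing in for a real classpath scan; these jar names
# are fabricated for illustration only.
demo=$(mktemp -d)
touch "$demo/protobuf-java-2.4.1.jar" "$demo/protobuf-java-2.5.0.jar"

# Two protobuf generations side by side: classes generated against
# 2.4.x fail bytecode verification against the 2.5.0 runtime, because
# 2.5.0 made getUnknownFields() final.
find "$demo" -name 'protobuf-java-*.jar' -exec basename {} \; | sort

rm -rf "$demo"
```

If such a scan of the real directories turned up both 2.4.x and 2.5.x jars, would removing or shading one of them be the right fix?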
I also noted that when initialising Shark, I get the following messages:
1.998: [GC (Metadata GC Threshold) 996276K->19001K(10049024K), 0.0283650 secs]
2.026: [Full GC (Metadata GC Threshold) 19001K->18119K(10049024K), 0.0519489 secs]
Reloading cached RDDs from previous Shark sessions... (use -skipRddReload flag to skip reloading)
3.225: [GC (System.gc()) 653092K->31516K(10049024K), 0.0184714 secs]
3.244: [Full GC (System.gc()) 31516K->18363K(10049024K), 0.0909512 secs]
3.340: [GC (System.gc()) 187300K->18498K(10049024K), 0.0040080 secs]
3.344: [Full GC (System.gc()) 18498K->15265K(10049024K), 0.0836514 secs]
Any idea what could be causing these problems? I should add that I'm new to this, so it may be something very basic I've missed.