
I'm trying to use a Hive metastore with shark-0.9.1 (hive-0.11.0). For now, I'd be happy to get it running on a single node, so no slave nodes are involved. When running Hive, I can create tables and execute SQL statements such as

hive> SELECT MAX(rating) FROM data;

When using Shark, pretty much the only thing that works is

shark> show tables;

which shows the tables previously created with hive.

Any other statement, such as the SELECT above, gives me an error:

Exception in thread "main" java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$CompleteRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;

(and much more "at java.lang....").
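A `VerifyError` complaining that a class "overrides final method" generally means two incompatible versions of the same library are on the classpath; here the method in question belongs to `com.google.protobuf`. As a rough sketch for checking this (it is not part of my setup: the function name `jars_containing` is hypothetical, and the directories to scan depend on the installation), the following lists every jar under a directory that bundles the protobuf class named in the error:

```python
import sys
import zipfile
from pathlib import Path

# Class file named in the VerifyError; the default below is an assumption
# based on the error message, not a confirmed diagnosis.
PROTOBUF_ENTRY = "com/google/protobuf/UnknownFieldSet.class"


def jars_containing(root, class_entry=PROTOBUF_ENTRY):
    """Return the paths of all .jar files under `root` that bundle `class_entry`."""
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that have a .jar extension but are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Usage: python find_protobuf_jars.py <dir> [<dir> ...]
    for root in sys.argv[1:]:
        for hit in jars_containing(root):
            print(hit)
```

If scanning the Shark and Hadoop library directories turns up more than one jar, and their file names indicate different protobuf versions, that would support the suspicion of a version clash.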

I also noticed that when initialising shark, I get the following messages:

1.998: [GC (Metadata GC Threshold)  996276K->19001K(10049024K), 0.0283650 secs]
2.026: [Full GC (Metadata GC Threshold)  19001K->18119K(10049024K), 0.0519489 secs]
Reloading cached RDDs from previous Shark sessions... (use -skipRddReload flag to skip reloading)
3.225: [GC (System.gc())  653092K->31516K(10049024K), 0.0184714 secs]
3.244: [Full GC (System.gc())  31516K->18363K(10049024K), 0.0909512 secs]
3.340: [GC (System.gc())  187300K->18498K(10049024K), 0.0040080 secs]
3.344: [Full GC (System.gc())  18498K->15265K(10049024K), 0.0836514 secs]

Any ideas what could be the reason for these problems? I should add that I'm new to this, so it could be some very basic thing I missed.

helm
  • Sometime back, I faced an issue with accessing the tables from `shark-shell`. It seemed to be a problem with the Derby metastore and miraculously started working once I changed my metastore to MySQL. I would say it's worth giving a shot. – visakh Jun 06 '14 at 06:04
  • Thanks, it does indeed seem that the Derby metastore is not well suited to cluster installations. This particular problem is solved, but the next one has appeared. I'll keep working on that. – helm Jun 11 '14 at 18:10
  • @HelmHammerhand, can you post your solution here? Thanks. – hayat Sep 22 '14 at 11:46
  • I only remember changing to MySQL, which solved it for some time. But as far as I can see, Shark is outdated anyway, and the long-term solution is switching to SparkSQL. – helm Sep 22 '14 at 18:06
