We have a use case where Presto's Hive connector reads Avro files stored in S3. When we use a standalone Hive Metastore and read this Avro data through an external table, we get a SerDeStorageSchemaReader class-not-found error:

    MetaException(message:org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader class not found)
    at org.apache.hadoop.hive.metastore.utils.JavaUtils.getClass(JavaUtils.java:54)

We understand that this error occurs because the SerDeStorageSchemaReader class is not available in the standalone metastore.

Can the Hive Metastore be run without a full Hive/Hadoop installation, or is there another option?

Vish

2 Answers

The standalone Hive Metastore doesn't support Avro. To fix this, we need to install the full Hadoop and Hive distributions and start only the Hive Metastore service.

Vish

I managed to get the standalone Hive Metastore working with Avro files on S3 by doing the following:

  1. In the metastore-site.xml file I added the following:

     <property>
       <name>metastore.storage.schema.reader.impl</name>
       <value>org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader</value>
     </property>
    
  2. I added the following jars to ${HIVE_HOME}/lib/:

       • hive-metastore-${METASTORE_VERSION}.jar (full Hive version)
       • hive-common-${METASTORE_VERSION}.jar
       • hive-serde-${METASTORE_VERSION}.jar

  3. I created the table like this:

    CREATE TABLE IF NOT EXISTS table_xyz (col1 INT, col2 INT)
    WITH (format = 'AVRO',
          partitioned_by = ARRAY['col1', 'col2'],
          external_location = 's3a://my_bucket/path/blah',
          avro_schema_url = 's3a://mybucket/avro_file_schema.avsc');
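
If the MetaException still shows up after step 2, it may help to check whether any jar in the lib directory actually contains the class. Here is a minimal Python sketch for that, assuming only that jars are ordinary zip archives. The function name jars_containing is made up for illustration and is not part of Hive or Presto.

```python
import zipfile
from pathlib import Path

# Class named in the MetaException, written as its entry path inside a jar.
CLASS_ENTRY = "org/apache/hadoop/hive/metastore/SerDeStorageSchemaReader.class"


def jars_containing(lib_dir, class_entry=CLASS_ENTRY):
    """Return the names of the *.jar files in lib_dir that contain class_entry.

    Hypothetical helper for illustration; lib_dir is whatever
    ${HIVE_HOME}/lib expands to on your machine.
    """
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid zip/jar archives.
            continue
    return hits
```

Calling jars_containing() with the expanded ${HIVE_HOME}/lib path returns a list of jar file names. An empty list means none of the jars in that directory provide the class, so the metastore would not be able to load it from there.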