
I generated a sequence file using Hive and am trying to import it into Bigtable; my import job is failing with the error below.

2015-06-21 00:05:42,584 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1434843251631_0007_m_000000_1: Error: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.hbase.io.ImmutableBytesWritable
at com.google.cloud.bigtable.mapreduce.Import$Importer.map(Import.java:127)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
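
For reference, here is a minimal sketch (the class name and the path argument are placeholders, not part of my job) that reads the key and value class names recorded in one of the generated part files, to confirm what Hive actually wrote. The stack trace suggests the importer's map step expects ImmutableBytesWritable keys, so this is how I would check the mismatch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class InspectSeqFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point this at one part file from the Hive output directory.
    Path path = new Path(args[0]);
    try (SequenceFile.Reader reader =
        new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
      // The failing cast in Import$Importer.map() indicates it expects
      // org.apache.hadoop.hbase.io.ImmutableBytesWritable keys; this prints
      // what the Hive-generated file actually contains.
      System.out.println("key class:   " + reader.getKeyClassName());
      System.out.println("value class: " + reader.getValueClassName());
    }
  }
}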

I am using the Hive table definition and parameters below to generate the sequence file.

create table ilv_bigtable_test(
item_id int
)
stored as sequencefile
LOCATION 'gs://zxx/xx/aa1_temp/ilv_bigtable_test/'
TBLPROPERTIES('serialization.null.format'='')
;

SET hive.exec.compress.output=true;
SET mapred.max.split.size=256000000;
SET mapred.output.compression.type=BLOCK;

insert overwrite table ilv_bigtable_test
select 
item_id
FROM xxx
;

Below is the HBase create table statement:

create 'test', 'item_id'
  • What versions of Hive, HBase, and Hadoop are you using? From the stack trace, it doesn't look like a Cloud Bigtable issue; have you tried it on plain HBase? – Les Vogel - Google DevRel Jun 22 '15 at 13:10
  • This looks like a bug in non-Cloud Bigtable code: Error: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.hbase.io.ImmutableBytesWritable – Max Jun 22 '15 at 17:32
  • We are on Hive 0.13 on Hadoop 2.4. Yes, we were able to load the same data into HBase using a CSV bulk load. hadoop fs -text gives proper results on the sequence file, so I'm not sure this issue is related to a bad file format: hadoop fs -text gs://zzz/ssss/aa1_temp/ilv_bigtable_test/* – Manish Kumar Jun 22 '15 at 17:50
  • I am following the documentation at https://cloud.google.com/bigtable/docs/exporting-importing#export-hbase, which says "Export a table, either from HBase or from Cloud Bigtable." Can't we import data exported using some other mechanism? Does it support importing only HBase/Cloud Bigtable exported datasets? – Manish Kumar Jun 23 '15 at 00:37

1 Answer


You need a much later version of Apache Hive: 1.1.0 is the first release to support HBase 1.0, which Cloud Bigtable requires. Try Hive 1.2.1 at least.