0

I am trying to insert data into a table in Hive I created. I’ve been struggling, so I’m trying to simplify it as much as possible to get to the root of the issue.

Here is my simplified code for creating a basic table. I basically have an array of structure with a single element.

DROP TABLE IF EXISTS foo.S_FILE_PA_JOB_DATA_T;

CREATE TABLE foo.S_FILE_PA_JOB_DATA_T
  PARTITIONED BY (customer_id string)
  STORED AS AVRO
  TBLPROPERTIES (
 'avro.schema.literal'=
 '{
   "namespace": "com.foo.oozie.foo",
   "name": "S_FILE_PA_JOB_DATA_T",
   "type": "record",
   "fields":
   [
      {"name":"pa_hwm"             ,"type":{
         "type":"array",
         "items":{
           "type":"record",
           "name":"pa_hwm_record",
           "fields":
           [
             {"name":"pa_axis"           ,"type":["int","null"]}
           ]
         }
      }}
   ]
   }');

My problem is I can’t figure out the syntax to insert into the table.

insert into table foo.s_FILE_PA_JOB_DATA_T partition (customer_id) values (0,'a390c1cf-4ee5-4ab9-b7a3-73f5f268b669')

The 0 needs to somehow be an array<struct<int>> but I can't get the syntax right. Can anyone help? Thanks!

user432209
  • 20,007
  • 10
  • 56
  • 75

1 Answers1

0

Unfortunately, you can't directly do that. See also Hive inserting values to an array complex type column.

In theory, you should be able to do it using something like

insert into table s_file_pa_job_data_t partition(customer_id)  
  values (array(named_struct('pa_axis',0)) );

that is, using the array() and named_struct() udfs which will from some scalar values construct respectively an array, and a struct according to your specs. (see UDF documentation here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-ComplexTypeConstructors

but unfortunately if you do that you'll get

FAILED: SemanticException [Error 10293]: Unable to create temp file 
for insert values Expression of type TOK_FUNCTION not supported in insert/values

because unfortunately hive does not support the use of UDF functions in the VALUES clause yet. As the other posts suggest, you could do it using a dummy table, which is kind of ugly, but works.

Community
  • 1
  • 1
Roberto Congiu
  • 5,123
  • 1
  • 27
  • 37