I used a SerDe to load a CSV file into a Hive table. As usual, it created all the columns with type string. But when I tried to cast the columns to their respective data types, it threw an error, specifically when converting the string type to an array type.

describe table ted; 
comments string from deserializer
description string from deserializer
duration string from deserializer
speaker string from deserializer
occupation string from deserializer
tags string from deserializer
views string from deserializer

create table tedx as
select cast(comments as int) as comments,
cast(description as string) as desc,
cast(duration as int) as duration,
cast(speaker as string) as speaker,
cast(occupation as string) as occupation,
cast(tags as array<string>) as tags,
cast(views as int) as views
from ted;

FAILED: ParseException line 7:13 cannot recognize input near 'array' '<' 'string' in primitive type specification

How can I convert the tags column from string type to array type?

leftjoin
parthip c

1 Answer

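To convert a string to an array, use split(string str, string pat), which splits str around pat (pat is a regular expression). The cast fails because, as the error message says, cast only accepts a primitive type specification, and array<string> is not a primitive type.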

Demo:

hive> select split('1,2,3',',');
OK
["1","2","3"]
Time taken: 4.691 seconds, Fetched: 1 row(s)

The doc is here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
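
Applied to the table in the question, the statement might look like the sketch below. It assumes the tags column holds a plain comma-separated list (for example 'culture,science'); the question does not show the actual data, so the delimiter is a guess. The columns that are already strings are selected as-is, since casting string to string does nothing.

```sql
-- Sketch only: assumes tags is a comma-separated string such as 'culture,science'.
-- Change the pattern passed to split() if the data uses a different delimiter.
create table tedx as
select
  cast(comments as int) as comments,
  description,
  cast(duration as int)  as duration,
  speaker,
  occupation,
  split(tags, ',')       as tags,   -- result type is array<string>
  cast(views as int)     as views
from ted;
```

If the values carry extra characters around the items, such as brackets or quotes, they would likely need to be stripped first (for example with regexp_replace) before calling split.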

leftjoin