Questions tagged [hive-serde]

SerDe is short for Serializer/Deserializer, an interface that Hive uses to serialize and deserialize data during I/O and to interpret serialized records as individual fields. A SerDe lets Hive read data from a table and write it back out to HDFS in any custom format. Anyone can write a SerDe for their own data format.
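As a rough illustration of the two halves described above, the following Python sketch deserializes one stored text record into fields and serializes the fields back into a record. It is only a conceptual analogy and not Hive's Java SerDe interface: the function names `deserialize` and `serialize` and the sample record are made up for this example.

```python
import csv
import io


def deserialize(line, separator=",", quote='"'):
    """Split one stored text record into a list of string fields."""
    reader = csv.reader([line], delimiter=separator, quotechar=quote)
    return next(reader)


def serialize(fields, separator=",", quote='"'):
    """Join a list of fields back into one text record."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=separator,
        quotechar=quote,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\n",
    )
    writer.writerow(fields)
    return buffer.getvalue()[:-1]  # drop the trailing newline


# Illustrative record: the third field contains the separator inside quotes.
line = '1,Alex,"52 DD dr, Lake Hiawatha"'

row = deserialize(line)    # quote-aware parsing keeps the address in one field
naive = line.split(",")    # plain delimiter splitting breaks the address apart
record = serialize(row)    # writing the fields back restores the quoting
```

The contrast between `row` and `naive` is the kind of difference that tends to matter when choosing between a quote-aware SerDe and a plain delimiter-based one for CSV data.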

Official documentation page: SerDe

Hive bundles many SerDes, and third-party SerDes are also available. Examples include:

  • LazySimpleSerDe
  • OpenCSVSerDe
  • RegexSerDe
  • JsonSerDe
  • AvroSerDe
  • ParquetHiveSerDe
  • OrcSerDe
  • MultiDelimitSerDe
164 questions
1
vote
1 answer

Error in data while creating external tables in Athena

I have my data in CSV format in the following form: Id -> tinyint, Name -> String. Id Name 1 Alex 2 Sam. When I export the CSV file to S3 and create an Athena table, the data is transformed into the following format: Id Name 1 "Alex" 2 …
AswinRajaram
1
vote
2 answers

Handling Data with and Without double quotation marks In Hive

Can someone please guide me on how I should load data into Hive when some rows contain " and other rows come without " for the same column value? Sample data: id,name,desc,uqc,roll,age 1,Monali,"abhc,jkjk",,23,23 …
Rahul Patidar
1
vote
2 answers

Error while trying to create external table in hive

I am trying to create an external table using Hive with Hadoop, but it fails. These are the errors I get when I run my queries: 02:23:29.516 [HiveServer2-Background-Pool: Thread-39] ERROR hive.ql.exec.DDLTask -…
Lyn
1
vote
2 answers

hive standalone metastore reading avro data with schema not working

We have a use case where Presto with Hive accesses S3 files in Avro format. When we use the standalone hive-metastore and read this Avro data through an external table, we get a SerDeStorageSchemaReader class not found error …
Vish
1
vote
1 answer

Unable to get AWS Athena escapeChar working

I am trying to take AWS Athena for a spin and I am running into issues with a CSV file I am trying to test against. With the following, the escapeChar doesn't appear to be working. I have tried using the crawler and specifying the escapeChar in…
jaybee
1
vote
2 answers

Tableau does not read hive-serde jar in Databricks library

My project connects Tableau to Databricks using the Simba Spark ODBC driver. I am trying to read a Hive table in the OpenCSVSerde format. The table has the following ROW FORMAT, INPUTFORMAT and OUTPUTFORMAT: ROW FORMAT SERDE…
PPawar
1
vote
1 answer

Hive Generic UDF : Hive does not cast as expected, Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.util.Map

I am trying to create a simple generic UDF for my Hive queries. Here is my Hive table: CREATE TABLE `dum`(`val` map<string,array<string>>); insert into dum select map('A',array('1','2','3'),'B',array('4','5','6')); and here is how it looks: select *…
AbtPst
1
vote
1 answer

Error while serializing aggregate state store with custom serde on Spring Cloud Stream

I'm trying to create a simple functional bean with Spring Cloud Stream that processes messages from a KStream and a GlobalKTable, joins them, aggregates them, and outputs the result to a new stream but I'm having difficulties in configuring properly…
1
vote
1 answer

Access hive table with json serde via spark sql

I am new to the Spark world. How can a Hive table with a JSON SerDe be read via Spark SQL? Any example code or documentation would help.
1
vote
1 answer

Utility that will create an AWS Athena table definition from AWS Glue catalog so I can add a WITH SERDEPROPERTIES section

[Update: it looks like aws glue get-table --database-name xyz --name tablename will give me the raw materials for the table definition, so that's progress. I am just wondering whether something exists that automatically assembles the pieces] [Update 2: You can…
DWright
1
vote
0 answers

How to store null values in OpenCSVSerde or avoid the quote char in LazySimpleSerde in a Hive table

I have a question about TBLPROPERTIES in Hive for OpenCSVSerde and LazySimpleSerDe. The data file is stored as a text file (generated by Sqoop). Table properties: data stored with OpenCSVSerde, separatorChar |, quoteChar ", escapeChar \\. The problem is…
1
vote
1 answer

Loading to hive new line character from CSV File

We have a file of the following form: 1- Sam, Joshua , "52 DD dr, Lake Hiawatha" , New Jersey, 07034 2- Ruchi,kumari,SNN Raj serenity,Bengaluru, 560068 Line 1 is split into 2 rows in the external table with the rest of the…
Sam Berchmans
1
vote
1 answer

Unable to Parse string using Hive Regex Serde

I am trying to parse the string "297","298","Y","","299" using the Regex SerDe, but I am unable to do so. The table definition I have created is: create external table test.test1 (a string, b string, c string, d string) row format serde…
TimesNow
1
vote
0 answers

Hive json serde selection

I am confused about choosing between the two JSON SerDes described in the link below (OpenX and HCatalog). https://docs.aws.amazon.com/athena/latest/ug/json.html My JSON is not nested; it is simple JSON. A file having each record as JSON separated by…
1
vote
1 answer

regex for access log in hive serde with newline

With the AWS Athena service, I am trying to import a CSV file that includes newline data. Importing data uses the Hive SerDe format. The data looks like this (each value is enclosed in double quotes): "DataA"|"DataB"|"DataC" "Data1"|"Data2 with new…