Questions tagged [hive-serde]

SerDe is short for Serializer/Deserializer, the interface Hive uses for IO: it deserializes stored records when reading, serializes rows when writing, and interprets each deserialized record as individual fields. A SerDe allows Hive to read data from a table and write it back out to HDFS in any custom format. Anyone can write a SerDe for their own data format.

Official documentation page: SerDe

Many SerDes are bundled with Hive, and third-party SerDes are also available. Examples include:

  • LazySimpleSerDe
  • OpenCSVSerDe
  • RegexSerDe
  • JsonSerDe
  • AvroSerDe
  • ParquetHiveSerDe
  • OrcSerDe
  • MultiDelimitSerDe
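
To make the serialize/deserialize idea concrete, here is a minimal conceptual sketch in Python. It is not Hive's API (real SerDes are Java classes), and the class name ToyDelimitedSerDe, its methods, and the sample row are all hypothetical, chosen only to show the two directions a SerDe handles.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyDelimitedSerDe:
    """Hypothetical toy model of a SerDe; not Hive's Java interface."""

    columns: List[str]
    delimiter: str = ","

    def deserialize(self, line: str) -> Dict[str, str]:
        # Read path: split one stored record into named fields.
        values = line.rstrip("\n").split(self.delimiter)
        # This toy version simply rejects records with the wrong field count.
        if len(values) != len(self.columns):
            raise ValueError(
                f"expected {len(self.columns)} fields, got {len(values)}"
            )
        return dict(zip(self.columns, values))

    def serialize(self, row: Dict[str, str]) -> str:
        # Write path: join the fields back into one stored record.
        return self.delimiter.join(str(row[c]) for c in self.columns)


# Round trip: stored text -> fields -> stored text.
serde = ToyDelimitedSerDe(columns=["id", "name"], delimiter="|")
row = serde.deserialize("001|adam")
line = serde.serialize(row)
```

In Hive itself, the table definition names the SerDe class and its properties, and Hive calls it for every record read or written.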
164 questions
2
votes
1 answer

PySpark/Hive: how to CREATE TABLE with LazySimpleSerDe to convert boolean 't' / 'f'?

Hello dear stackoverflow community, here is my problem: A) I have data in csv with some boolean columns; unfortunately, the values in these columns are t or f (single letter); this is an artifact (from Redshift) that I cannot control. B) I need to…
Vlad K.
  • 300
  • 3
  • 11
2
votes
1 answer

how to separate columns in hive

I have a file: id,name,address 001,adam,1-A102,mont vert 002,michael,57-D,costa rica I have to create a Hive table containing three columns (id, name and address) using a comma delimiter, but the address column itself contains a comma in…
Sanskar Suman
  • 63
  • 1
  • 12
2
votes
2 answers

CSV file to Hive table using LOAD DATA: how to format the date in the CSV so the Hive table accepts it

I am using the LOAD DATA syntax to load a CSV file into a table. The file is in the format Hive accepts, but after LOAD DATA is issued, the last 2 columns still return null on…
sjd
  • 1,329
  • 4
  • 28
  • 48
2
votes
1 answer

In Hive, how to specify semicolon delimiters for struct data types within custom delimiters serde2

I am trying to create table as below. CREATE TABLE r_test (foo INT, bar STRING, address STRUCT) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES…
sande
  • 567
  • 1
  • 10
  • 24
2
votes
0 answers

AWS EMR Hive fails due to serde2/serde

I am running an EMR Hive query on S3 and it fails saying "Map operator initialization failed". I tried to set HADOOP_CLASSPATH as below, still no luck: set HADOOP_CLASSPATH=/usr/lib/hive/lib/*; I also tried adding the jar below: add jar…
2
votes
2 answers

How do I import an array of data into separate rows in a hive table?

I am trying to import data in the following format into a hive table [ { "identifier" : "id#1", "dataA" : "dataA#1" }, { "identifier" : "id#2", "dataA" : "dataA#2" } ] I have multiple files like this and I…
shrewquest
  • 541
  • 1
  • 7
  • 22
2
votes
2 answers

HIVE 2.1.1 Table creation CSV-Serde

I did all the research and couldn't find the same issue anywhere for Hive. I followed the link below and have no issues with data in quotes: https://github.com/ogrodnek/csv-serde My external table creation has the SerDe properties below, but for…
user8577513
2
votes
0 answers

What does the fieldID in the Hive SerDe interface StructField stand for?

To implement a SerDe I use a StructField implementation. I've upgraded the Hive version and now the interface has the getFieldID method. What does this method stand for? Is there any special guidance for implementing it?
Vitali Melamud
  • 1,267
  • 17
  • 40
2
votes
2 answers

Hive: PARTITIONED BY and SERDEPROPERTIES

I am trying to create a Hive table partitioned by a single field. The data I want to process is log data. The log format is: DATE TIME IPAddress HTTP_METHOD MESSAGE Create table Hive query: CREATE EXTERNAL TABLE…
Sampath
  • 45
  • 2
  • 6
2
votes
2 answers

Hive Json SerDE for ORC or RC Format

Is it possible to use a JSON SerDe with the RC or ORC file formats? I am trying to insert into a Hive table with file format ORC and store it on Azure Blob as serialized JSON.
Ravi Shastri
  • 57
  • 1
  • 2
  • 12
2
votes
0 answers

Hive json serde - Keys with white spaces

I am facing an issue with a space in a key name in the struct type while creating a table. Following is the create table command I am using: CREATE TABLE event_test ( android_id string, app string, app_ver string, at string, birth_date int, …
Dutta
  • 663
  • 2
  • 11
  • 30
1
vote
0 answers

kafka stream with custom AVRO Serde (without schema

I have a stream processing application using AVRO message format. For serialization and deserialization (Serde) it is using io.confluent.kafka.streams.serdes.avro.GenericAvroSerde. I was trying to create custom AVRO Serde as something like…
1
vote
1 answer

What format applies to the Hive LazySimpleSerDe

What exactly is the format for Hive LazySimpleSerDe? A format like ParquetHiveSerDe tells me that Hive will read the HDFS files in parquet format. But what is LazySimpleSerDe? Why not call it something explicit like CommaSepHiveSerDe or…
Victor
  • 16,609
  • 71
  • 229
  • 409
1
vote
1 answer

Multiple escape characters for hive create table

I am trying to load a pipe-delimited CSV into a Hive external table. The data values contain single quotes, double quotes, brackets, etc. Using Open CSV version 2.3 testfile.csv id|name|phone 1|Rahul|123 2|Kumar's|456 3|Neetu"s|789 4|Ravi…
Khilesh Chauhan
  • 739
  • 1
  • 10
  • 36
1
vote
1 answer

Hive external table read json as textfile

I'm trying to create a hive external table for a json file in .txt format. I have tried several approaches but I think I'm going wrong in how the hive external table should be defined: My Sample JSON is: [[ { "user": "ron", "id": "17110", …
user2441441
  • 1,237
  • 4
  • 24
  • 45