Questions tagged [hive-serde]

SerDe is short for Serializer/Deserializer. Hive uses the SerDe interface to serialize and deserialize data during I/O, and to interpret the serialized data as individual fields. A SerDe lets Hive read data from a table and write it back out to HDFS in any custom format. Anyone can write a SerDe for their own data format.

Official documentation page: SerDe

Many SerDes are bundled with Hive, and third-party SerDes are also available. Examples include:

  • LazySimpleSerDe
  • OpenCSVSerDe
  • RegexSerDe
  • JsonSerDe
  • AvroSerDe
  • ParquetHiveSerDe
  • OrcSerDe
  • MultiDelimitSerDe
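
To make the serialize/deserialize contract described above concrete, here is a minimal toy sketch in Python. It is not Hive's Java API: the class `DelimitedSerDe`, its method names, and the sample column names are hypothetical and chosen only for illustration. It shows the two directions a SerDe covers: turning a raw record into named fields, and turning fields back into a raw record.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DelimitedSerDe:
    """Toy stand-in for a SerDe: maps one text record to fields and back.

    Hypothetical illustration only; this is not Hive's Java interface.
    """
    columns: List[str]
    delimiter: str = "\t"

    def deserialize(self, record: str) -> dict:
        # Split the raw record and pad missing trailing fields with None
        values = record.rstrip("\n").split(self.delimiter)
        values += [None] * (len(self.columns) - len(values))
        return dict(zip(self.columns, values))

    def serialize(self, row: dict) -> str:
        # Write fields in column order; None is written as an empty string
        return self.delimiter.join(
            "" if row.get(c) is None else str(row[c]) for c in self.columns
        )


serde = DelimitedSerDe(columns=["name", "age", "height"])
row = serde.deserialize("Chris\t48\t5'10\"\n")
line = serde.serialize(row)
```

In Hive itself the SerDe is chosen per table in the DDL (for example with a `ROW FORMAT SERDE` clause), and the deserialized fields are exposed as typed columns rather than a dictionary of strings.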
164 questions
0
votes
1 answer

Athena DDL statement for different data structures

I have XML data that I converted to JSON through a Glue crawler. The problem is writing the DDL statement for the table in Athena: as you can see below, there is a contact attribute in the JSON data. In some places it is a structure (single…
0
votes
1 answer

Hive UDF : Generic UDF cannot access struct from nested map

Here is my Hive table: create table if not exists dumdum (val map<string,map<string,struct<student_id:string,age:int>>>); insert into dumdum select map('A',map('1',named_struct('student_id','123a', 'age',11))); insert into dumdum select…
AbtPst
0
votes
0 answers

Hive SerDe with "\u0000" as delimiter - can't get it to work

I have a dataset similar to this: The SerDe sits on top of an S3 location and looks something like this: CREATE EXTERNAL TABLE `default.ga_serde_test`( column1 string,column2 string ) ROW FORMAT SERDE …
user2518751
0
votes
1 answer

Creating Athena table with escape character before separator

I am creating a table in Athena from data in S3. Here is a short version of the query: CREATE EXTERNAL TABLE `tablename`( `licensee_pub` string COMMENT 'from deserializer', `admin_number` string COMMENT 'from deserializer', …
nik
0
votes
1 answer

I am only getting one row when using JSON SERDE to read a json file

JSON Data: [{ "liked": "true", "user_id": "101", "video_end_type": "3", "minutes_played": "3", "video_id": "101", "geo_cd": "AP", "channel_id": "11", "creator_id": "101", "timestamp": "07/05/2019 01:36:35", "disliked":…
0
votes
1 answer

Athena Issue - query string in lowercase

I have a table that contains a JSON column, column_A. Instead of defining column_A as a struct, I defined it as a string so I can query the JSON. The problem is that when I query column_A, I receive the data in lowercase. CREATE EXTERNAL TABLE `table_test`( `userid` string…
DonSaada
0
votes
1 answer

Way to create external hive table from ORC File

I am trying to create an external Hive table on an ORC file. Query used to create the table: create external table fact_scanv_dly_stg ( store_nbr int, geo_region_cd char(2), scan_id int, scan_type char(2), debt_nbr string, mds_fam_id string, upc_nbr…
0
votes
1 answer

Create hive table using regex or json SerDe when root is an array

My data is in this format [{"field1":"data1","field2":100,"field3":"more data1","field4":123.001}] [{"field1":"data2","field2":200,"field3":"more data2","field4":123.002}] [{"field1":"data3","field2":300,"field3":"more…
Abhijeet Ahuja
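
For input shaped like the sample in the question above, where each line holds a JSON array, one possible preprocessing step is to unwrap each array so that every output line is a single JSON object. This is a sketch, not the accepted answer to the question: it assumes the JSON SerDe in use reads one JSON object per text line, and the helper name `unwrap_array_lines` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def unwrap_array_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield one compact JSON object per record from array-wrapped lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parsed = json.loads(line)
        # Accept either a bare object or an array of objects
        records = parsed if isinstance(parsed, list) else [parsed]
        for record in records:
            yield json.dumps(record, separators=(",", ":"))


# First sample line taken from the question excerpt
sample = ['[{"field1":"data1","field2":100,"field3":"more data1","field4":123.001}]']
flattened = list(unwrap_array_lines(sample))
```

The rewritten lines could then be stored where the table's location points, so the root of each record is an object rather than an array.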
0
votes
0 answers

Apache Beam and Hive complex types

I have a Hive table with a column of type map
alexlipa
0
votes
1 answer

Create Hive external table dynamically with sqlcontext.sql(...)

I have a pyspark script in a Zeppelin notebook, which I point at a JSON file sitting in BLOB storage, in order to infer the JSON schema and create an external table in Hive. I can take the SQL command printed from the script, and execute it in a…
0
votes
1 answer

jagged csv serde for hive

Is there a way to define a Hive table over files with jagged headers and keep nulls for missing columns? I have TSV files with different header lines, e.g.: Name Age Height Chris 48 5'10" Jim 25 5'11" then another file Name Group Age Height Bill…
Chris Hayes
0
votes
1 answer

Double quotes are not getting removed even after using org.apache.hadoop.hive.serde2.OpenCSVSerde

I have an external table with the DDL below: CREATE EXTERNAL TABLE pathirippilly_db.serdeTest (Name varchar(50),Job varchar(50),Sex varchar(4)) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( …
akhil pathirippilly
0
votes
1 answer

Unable to upload json file on hive using json serde

I am trying to load a JSON file using a JSON SerDe. I have added the SerDe jar file successfully. 1) My JSON jar file is placed at /apps/hive/warehouse/lib/. I ran this command successfully: add jar…
user4440416
0
votes
2 answers

Load json file in hive using json serde

I am trying to upload a JSON file to Hadoop using a JSON SerDe. I have uploaded the jar lib to Hadoop but get an error while running the Hive command. I have uploaded the JSON SerDe jar file to the /apps/hive/warehouse/lib path. Now, when I try to run this command…
user4440416
0
votes
1 answer

How do I set the SerDe XML schema correctly?

I've got this XML:
user2518751