Questions tagged [hive-serde]

SerDe is short for Serializer/Deserializer, an interface used by Hive for both serialization and deserialization during IO and also interpreting the results of serialization as individual fields. A SerDe allows Hive to read in data from a table, and write it back out to HDFS in any custom format. Anyone can write their own SerDe for their own data formats.
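
As a minimal sketch of how a SerDe is attached to a table, the DDL below declares an external table whose rows are parsed by the bundled OpenCSVSerde. The table name, column names, and HDFS location are hypothetical placeholders; the SerDe class and its three properties are the ones documented for Hive's CSV SerDe.

```sql
-- Hypothetical table and location, for illustration only.
CREATE EXTERNAL TABLE example_csv (
  col1 STRING,
  col2 STRING
)
-- The SerDe class tells Hive how to turn each record into fields and back.
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",   -- field delimiter
  "quoteChar"     = "\"",  -- character that wraps quoted fields
  "escapeChar"    = "\\"   -- character that escapes the quote character
)
STORED AS TEXTFILE
LOCATION '/hypothetical/path/example_csv';
```

Note that OpenCSVSerde exposes every column as STRING regardless of the declared type, so values usually need an explicit CAST in queries or in a view on top of the table.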

Official documentation page: SerDe

Many SerDes are bundled with Hive, and third-party SerDes are also available. Examples include:

  • LazySimpleSerDe
  • OpenCSVSerDe
  • RegexSerDe
  • JsonSerDe
  • AvroSerDe
  • ParquetHiveSerDe
  • OrcSerDe
  • MultiDelimitSerDe
164 questions
0
votes
0 answers

How to load zipped CSV files into a Hive table?

I have a bunch of CSV files inside a zipped file in HDFS. Is there any way to create a Hive table on top of them with the right data? Note: the data in the CSV files is quoted with ".
A srinivas
  • 791
  • 2
  • 9
  • 25
0
votes
1 answer

How can I use a SerDe to build generic file ingestion into Hive?

I need to build generic file ingestion into Hive. The files are very large (2GB+), can be fixed or comma-separated, ASCII or EBCDIC files. After trying various techniques using Talend, I am looking into SERDE. If I ingest the files as-is and use a…
Ranga Nathan
  • 81
  • 1
  • 4
0
votes
0 answers

Hive XML Serde - boolean xpath doesn't parse

I'm creating a simple hive table using this XML Serde, but it's throwing an exception when trying to parse the XPath below. I've tried to use the VTD and Javax processor for the following xpath: column.xpath.is_application=/Msg/Header/Type='APP' but…
darkCode
  • 140
  • 8
0
votes
0 answers

Hive table for viewing Avro records streamed with Flume fails with "Block size invalid or too large for this implementation: -40"

I am creating a Hive SerDe external table to view Twitter records that are streamed using Flume. My properties file: # Naming the components on the current agent. TwitterAgent.sources = Twitter TwitterAgent.channels = MemChannel…
SARANYA
  • 115
  • 2
  • 11
0
votes
0 answers

Failed to read external resource while querying a JSON dataset in Hive

I created an external table and then loaded a JSON file into it. I have added the JSON jar to Hive. hive> create database if not exists databasename; hive> use databasename; hive> create external table if not exists table-name(col1 string) row…
Ravi Anand
  • 47
  • 1
  • 3
  • 12
0
votes
0 answers

Getting extra nulls when loading data into a Hive table using a regex delimiter

I have the following 5 lines of data in a file on HDFS. I want to load this into a table. I have a regex that will do it, but it loads an extra row of nulls for each line of data. Does anyone know why this is happening? 19/Mar/2018 3:00:06 INFO…
Micah Pearce
  • 1,805
  • 3
  • 28
  • 61
0
votes
1 answer

Unable to cast columns in Hive

I used a SerDe to load a CSV file into a Hive table. As usual, it created all the column types as string. But when I tried to cast the columns to their respective data types, it threw an error, especially while converting the string type to array…
parthip c
  • 11
  • 3
0
votes
1 answer

How to deserialize the ProtoBuf serialized HBase columns in Hive?

I have used ProtoBuf to serialize the class and store it in HBase columns. I want to reduce the number of MapReduce jobs for simple aggregations, so I need a SQL-like tool to query the data. If I use Hive, is it possible to extend the…
0
votes
2 answers

Does Hive allow "rows" as a column name?

I know every Hive version has some reserved keywords that can't be used as column names. But the problem is that my data comes from JSON, and my column names follow the JSON values. And I can't modify the data, of course. Is there any…
user3123372
  • 704
  • 1
  • 10
  • 26
0
votes
1 answer

Unable to map the HBase row key in a Hive table effectively

I have an HBase table where the row key looks like this. 08:516485815:2013 1 06:260070837:2014 1 00:338289200:2014 1 I create a Hive link table using the below query. create external table hb (key string,value string) stored by…
Alex Raj Kaliamoorthy
  • 2,035
  • 3
  • 29
  • 46
0
votes
1 answer

CSV SerDe format in Hive for different value types in a table

A CSV file contains a user survey in the messy format below, with many different data types such as string, int, and range. China, 20-30, Male, xxxxx, yyyyy, Mobile Developer; zzzz-vvvv; "$40,000-50,000", Consulting Japan, 30-40, Female, xxxxx, ,…
Akash Tyagi
  • 97
  • 2
  • 15
0
votes
0 answers

How to load data into Hive with two delimiters

I have a sample record with the format 9220216686,2011-05-05 22:48:26,28,C,PRE_HOST10_JINGLE_PP-PREF_WELCOME_PP-PREF_PROMO_PP|M001:6|M487:8|M312:3|M183:3|M093,CD,49, I want to load the data into Hive based on both , and | delimiters. I searched and came…
0
votes
1 answer

Spark: using a Hive custom SerDe for JSON, but class not found

Following along with https://github.com/Esri/gis-tools-for-hadoop/wiki/Aggregating-CSV-Data-%28Spatial-Binning%29 but on Spark the classes for the SerDe are not found. ClassNotFoundException: Class com.esri.hadoop.hive.serde.JsonSerde not found My…
Georg Heiler
  • 16,916
  • 36
  • 162
  • 292
0
votes
1 answer

Ingesting from an existing table's string field with a SerDe

I'm looking to parse out a JSON string in Hive using a SerDe, but don't see an easy way of doing so from a string already in Hive tables. Do you know how I can do this? To make my scenario more understandable, here is a butchered example I may…
user2740775
  • 77
  • 1
  • 6
0
votes
1 answer

Handling line breaks in Hive table column data

I am trying to create an external Hive table on existing Avro files. Below is the query. CREATE EXTERNAL TABLE sample ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT…