Questions tagged [hive-udf]

Use this tag for questions about user-defined functions (UDFs) in Apache Hive.

Apache Hive is a data warehouse system built on top of Hadoop that provides the following:

  • Tools to enable easy data extract/transform/load (ETL) and summarization
  • Ad-hoc querying and analysis of large datasets stored in the Hadoop Distributed File System (HDFS)
  • A mechanism to put structure on this data
  • An advanced query language called Hive Query Language, which is based on SQL with some additional features such as DISTRIBUTE BY and TRANSFORM, and which enables users familiar with SQL to query this data

How to write a good Hive question:

  1. Add a clear textual description of the problem.
  2. Provide the query and/or table DDL, if applicable.
  3. Provide the exception message.
  4. Provide an example of the input data and the desired output.
  5. Questions about query performance should include the EXPLAIN output for the query.
  6. Do not use pictures for SQL, DDL, DML, data examples, EXPLAIN output, or exception messages.
  7. Use proper code and text formatting.

64 questions
0 votes, 2 answers

Overridden evaluate methods in custom hive UDF

I am new to writing custom UDFs for Hive. I have successfully written a custom UDF for the toupper function. import org.apache.hadoop.hive.ql.exec.Description; import org.apache.hadoop.hive.ql.exec.UDF; import…
Ashu
0 votes, 0 answers

Dynamic (Changing) JSON to Hive Schema using UDF

I have a JSON file with the structure below: { "A": { "AId": { "AId": "123", "idType": "XYZ" }, "fN": "RfN", "oN": "ON", "mail": [ "abc@kml.com", …
A S
-1 votes, 2 answers

how to read hive conf variables in UDF initialize method

I am trying to read a Hive conf variable in the initialize method, but it does not work. Any suggestions? My UDF class: public class MyUDF extends GenericUDTF { MapredContext _mapredContext; @Override public void configure(MapredContext…
Ranjith Sekar
-1 votes, 1 answer

Hive - Possible to get total size of file parts in a directory?

Background: I have some gzip files in an HDFS directory. These files are named in the format yyyy-mm-dd-000001.gz, yyyy-mm-dd-000002.gz and so on. Aim: I want to build a Hive script which produces a table with the columns: Column 1 - date…
activelearner