Questions tagged [hive-udf]

Please use this tag for questions about user-defined functions (UDFs) for Apache Hive.

Apache Hive is a data warehouse system built on top of Hadoop that provides the following:

  • Tools to enable easy data summarization (ETL)
  • Ad-hoc querying and analysis of large datasets stored in the Hadoop Distributed File System (HDFS)
  • A mechanism to put structure on this data
  • An advanced query language called Hive Query Language (HiveQL), which is based on SQL with some additional features such as DISTRIBUTE BY and TRANSFORM, and which enables users familiar with SQL to query this data.
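
As an illustration of the TRANSFORM feature mentioned above, here is a minimal sketch of a script that Hive could stream rows through. It assumes a two-column input and that rows arrive as tab-separated text with NULL serialized as `\N` (Hive's default for TRANSFORM). The function names, the script name, and the upper-casing logic are all hypothetical and chosen only for illustration.

```python
import sys

# Assumption: Hive's default text serialization writes NULL as the two characters \N.
HIVE_NULL = r"\N"


def transform_line(line):
    """Upper-case the second column of one tab-separated input row.

    Returns the output row without a trailing newline, or None when the
    row does not have exactly two columns.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        return None  # skip malformed rows instead of failing the whole job
    key, value = fields
    if value != HIVE_NULL:
        value = value.upper()
    return key + "\t" + value


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read rows from stdin and write transformed rows to stdout."""
    for line in stdin:
        out = transform_line(line)
        if out is not None:
            stdout.write(out + "\n")
```

To run this as a standalone script, it would need to call `main()` at the end of the file. Assuming it is saved as `upper_value.py` (a hypothetical name), it could then be registered with `ADD FILE upper_value.py;` and invoked with something like `SELECT TRANSFORM(col1, col2) USING 'python upper_value.py' AS (col1, col2) FROM my_table;`.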

How to write a good Hive question:

  1. Add a clear textual description of the problem.
  2. Provide the query and/or table DDL, if applicable.
  3. Provide the exception message.
  4. Provide examples of the input data and the desired output.
  5. Questions about query performance should include the EXPLAIN output for the query.
  6. Do not use pictures for SQL, DDL, DML, data examples, EXPLAIN output, or exception messages.
  7. Use proper code and text formatting.

64 questions
1
vote
1 answer

Hive UDF - extremely slow when parsing IP addresses

I have a column that contains IP addresses. Now I need to parse them into countries/cities: select IPUtils('199.999.999.999') and it returns ['Aisa', 'Hongkong', 'xxx', 'Hongkong'] I wrote a Hive UDF to do this but it runs extremely slowly, as shown…
user2894829
  • 775
  • 1
  • 6
  • 26
1
vote
1 answer

Hive TRANSFORM receives NULL for concatenated array values

I have a Hive table in the following format: col1. col2. col3. a1 b1 c1 a1 b1 c2 a1 b2 c2 a1 b2 c3 a2 b3 …
1
vote
2 answers

Hive UDF in Java fails when creating a table

What is the difference between these two queries: SELECT my_fun(col_name) FROM my_table; and CREATE TABLE new_table AS SELECT my_fun(col_name) FROM my_table; where my_fun is a Java UDF? I'm asking because when I create a new table (second query) I…
mc2
  • 393
  • 6
  • 15
1
vote
2 answers

How to reload the updated custom UDF function in Hive?

I wrote a custom UDF in Java and packaged it in a jar file. Then I added it to Hive using: create temporary function isstopword as 'org.dennis.udf.IsStopWord'; Everything worked fine. But after I updated a small part of the UDF, I did the previous…
DennisLi
  • 3,915
  • 6
  • 30
  • 66
1
vote
0 answers

Hive UDF only works in standalone select statement and not in "Create table as select..." or "insert into .. select.."

I have a generic UDF which encrypts a given input value. The UDF gives the correct value when used in a select statement; however, it returns null when used in a "Create table as select" or "Insert into .. select" statement. Input to the udf,…
Sparrow
  • 41
  • 3
1
vote
1 answer

Hive UDF - Error in the evaluate() Method

I created the following Java class and added it to Hive after making a jar out of it: import org.apache.hadoop.hive.ql.exec.UDF; import org.apache.hadoop.io.Text; public class MakeCap extends UDF{ private Text t; public Text evaluate(Text…
Amber
  • 914
  • 6
  • 20
  • 51
1
vote
1 answer

Hive UDF to return multiple column output

How do I create a UDF which takes a String and returns multiple Strings? The UDFs I have seen so far could only give one output. How do I get multiple fields as output from a UDF? The simplest example would be an implementation of name -> FirstName, LastName. Not looking for…
user2458922
  • 1,691
  • 1
  • 17
  • 37
1
vote
1 answer

Hive Python UDF

I am using this Python UDF script: import sys import collections import datetime import re try: for line in sys.stdin: line=line.strip() number,sd=line.split('\t') sd=sd.lower() sd=sd.split(' ') …
Venkataraman
  • 138
  • 1
  • 9
1
vote
1 answer

Processing multiple rows in a Hive UDF

How could I process many rows inside a Hive UDF? I need an entire column inside the function so that it can be added to an ArrayList inside the UDF. The following is the column: Name jhon jone mike I want to take all of the names in the…
1
vote
2 answers

Hive gives SemanticException [Error 10014] when running my UDF

I have a Hive UDF that does a GeoIP lookup. public static Text evaluate(Text inputFieldName, Text option, Text databaseFileName) { String inputField, fieldOption, dbFileName, result = null; inputField = inputFieldName.toString(); …
Riyan Mohammed
  • 247
  • 2
  • 6
  • 20
1
vote
1 answer

Hive UDF: RuntimeException Internal error: Cannot find ObjectInspector for UNKNOWN

I tried to create a Hive UDF which returns multiple results. Longitude and latitude are the arguments for the UDF. When I run the function, I get a "FAILED: RuntimeException Internal error: Cannot find ObjectInspector for UNKNOWN" error. Code: import…
Var
  • 11
  • 3
1
vote
0 answers

Call existing Java/Hive UDF in SparkContext without using HiveContext in Spark-SQL application

I have Spark 1.5.0 running on a cluster. I want to use Hive UDFs from ESRI's API. I can use these APIs in a Spark application, but due to some issues in my cluster I am not able to use HiveContext. I want to use existing Hive UDFs in a Spark-SQL application.…
ChikuMiku
  • 509
  • 2
  • 11
  • 22
1
vote
1 answer

HiveUDF + saxon 9.1.0.8 + Java8 = failed to create an XPathFactory

My Spark job with HiveContext and Saxon works fine as long as no UDFs are defined in the code. When a UDF is implemented, HiveContext initialization fails with an error. I heard there is a Saxon/Java 8 incompatibility that was solved in Saxon 9.5.1.5, which is not…
seaman29
  • 79
  • 1
  • 8
1
vote
0 answers

Alter Hive External table output having array column to support postgresql compatible csv file

I'm struggling to generate a PostgreSQL-compatible TSV file from a Hive external table that has an array column, using Hive SQL. With Hive I can specify the delimiter and collection item terminator used to write an array field to CSV. However…
shahjapan
  • 13,637
  • 22
  • 74
  • 104
0
votes
0 answers

org.apache.hadoop.hive.ql.exec.UDF is deprecated

Why does Hive 3.x deprecate org.apache.hadoop.hive.ql.exec.UDF? I then use org.apache.hadoop.hive.ql.udf.generic.GenericUDF in a join query like select dw_rk.STRDEREPEAT(',', t1.d_deptname, t2.d_deptname, t3.d_deptname) as d_deptname, …
X-Hadrain
  • 11
  • 1