Questions tagged [hive-udf]

Use this tag for questions about user-defined functions (UDFs) for Apache Hive.

Apache Hive is a database built on top of Hadoop that provides the following:

  • Tools to enable easy data summarization (ETL)
  • Ad-hoc querying and analysis of large datasets stored in the Hadoop Distributed File System (HDFS)
  • A mechanism to put structure on this data
  • An advanced query language, the Hive Query Language (HiveQL), which is based on SQL with additional features such as DISTRIBUTE BY and TRANSFORM, and which lets users familiar with SQL query this data.
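
As a rough illustration of what this tag covers: a classic (non-generic) Hive UDF is a Java class that extends `org.apache.hadoop.hive.ql.exec.UDF` and exposes an `evaluate` method that Hive calls for each row. The sketch below shows only the shape of such a method in plain Java, so that it compiles without Hive on the classpath. The class name `NormalizeCodeSketch` and the trim-and-uppercase logic are hypothetical, chosen purely for illustration.

```java
import java.util.Locale;

// Hypothetical class name. In a real classic UDF this class would extend
// org.apache.hadoop.hive.ql.exec.UDF; that is omitted here so the sketch
// compiles without the Hive jars.
class NormalizeCodeSketch {

    // Called once per row; a SQL NULL in the column arrives as a Java null.
    public String evaluate(String input) {
        if (input == null) {
            return null; // propagate NULL instead of throwing
        }
        return input.trim().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        NormalizeCodeSketch udf = new NormalizeCodeSketch();
        System.out.println(udf.evaluate("  in ")); // prints IN
        System.out.println(udf.evaluate(null));    // prints null
    }
}
```

In an actual deployment the class would extend the Hive base class, be packaged into a jar, and be registered with `ADD JAR` followed by `CREATE TEMPORARY FUNCTION`.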

How to write a good Hive question:

  1. Add a clear textual description of the problem.
  2. Provide the query and/or table DDL, if applicable.
  3. Provide the exception message.
  4. Provide example input data and the desired output.
  5. Questions about query performance should include the EXPLAIN output for the query.
  6. Do not use pictures for SQL, DDL, DML, data examples, EXPLAIN output, or exception messages.
  7. Use proper code and text formatting.

64 questions
0
votes
1 answer

Hive UDF for TOP function

We are joining tables from HANA and Hive in a view-creating query through Smart Data Access. Example: Select top 10 from hana.table join hive.table. HANA supports the TOP function but Hive doesn't. Is there an existing UDF in Hive similar to TOP? I know…
marjun
  • 696
  • 5
  • 17
  • 30
0
votes
1 answer

Getting an error when running a Hive UDF written in Java in PySpark on EMR 5.x

I have a Hive UDF written in Java and I am trying to use it in PySpark 2.0.0. Below are the steps: 1. Copied the jar file to EMR. 2. Started a pyspark job like below: pyspark --jars ip-udf-0.0.1-SNAPSHOT-jar-with-dependencies-latest.jar 3. used the…
braj
  • 2,545
  • 2
  • 29
  • 40
0
votes
1 answer

CROSS APPLY SQL Server query on Hive

HDP-2.5.0.0 using Ambari 2.4.0.1. The Hive table ReportSetting is as follows: id int, serializedreportsetting String. The column 'serializedreportsetting' is an XML data type in the source SQL Server db but is converted to String during Sqoop…
Kaliyug Antagonist
  • 3,512
  • 9
  • 51
  • 103
0
votes
2 answers

Can anyone please provide Hive UDF code for truncating a particular column?

I have a column that contains the double value 42.2223. I want to truncate the last four digits of this particular column. Can anyone please provide a Hive UDF for this particular scenario?
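
One reading of this question is that the four decimal digits should be dropped without rounding. A minimal sketch of that logic in plain Java follows, assuming it would later be wrapped in a Hive UDF's `evaluate` method. The class name `TruncateSketch` and the `scale` parameter are hypothetical and do not come from the question.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper; not a Hive class. It shows only the truncation logic.
class TruncateSketch {

    // Keeps `scale` decimal places and drops the rest without rounding.
    public static Double evaluate(Double value, int scale) {
        if (value == null || value.isNaN() || value.isInfinite()) {
            return value; // nothing sensible to truncate
        }
        return BigDecimal.valueOf(value)
                .setScale(scale, RoundingMode.DOWN) // DOWN truncates toward zero
                .doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(evaluate(42.2223, 0)); // prints 42.0
        System.out.println(evaluate(42.2223, 2)); // prints 42.22
    }
}
```

`RoundingMode.DOWN` truncates toward zero, so negative values lose their fractional digits the same way positive ones do.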
0
votes
0 answers

Hive GenericUDF Java code template required for function accepting a string and returning Map

I am trying to write a GenericUDF for Hive. When I add the JAR and try to create a temporary function pointing to the class, I get an error, so function creation does not succeed. Can someone provide a Java code template for a GenericUDF function that accepts…
Dhiraj
  • 3,396
  • 4
  • 41
  • 80
0
votes
0 answers

Hive UDF includes query statement

I'm facing a problem writing some UDFs. I searched related posts on the site, but I'm afraid I haven't found any useful ideas yet. The question is: I want to execute a SQL statement in a UDF, then print the query result. Here's my code: public final…
Arvin.Yang
  • 49
  • 3
  • 12
0
votes
0 answers

How to compute percentiles of all columns in a dataframe in Apache Spark without a Hive UDF

I am using a Spark 1.6.1 standalone cluster with 6 workers (8 cores and 5G executor memory per node). My dataframe contains 13 columns and rows. I want to take the 99.5th percentile of each column, and I used the percentile_approx Hive UDAF as suggested in…
Meethu Mathew
  • 431
  • 1
  • 6
  • 15
0
votes
3 answers

How to create a view for struct fields in Hive

STEP 1: I have written a UDF which will form 2 or more Struct columns like cars, bikes, buses. The UDF also takes some info from another view called 'details'. cars struct form is: ARRAY> bikes struct form…
Ranjith Sekar
  • 1,892
  • 2
  • 14
  • 18
0
votes
3 answers

Adding JAR in Hive is giving error as "Query returned non-zero code: 1, cause: /user/hive/warehouse/abc.jar does not exist."

I created a UDF and exported the jar as abc.jar. I copied the jar to HDFS at /user/hive/warehouse. Now I am getting the errors below: hive> ADD JAR /user/hive/warehouse/abc.jar; /user/hive/warehouse/abc.jar does not exist Query returned non-zero code: 1,…
earl
  • 738
  • 1
  • 17
  • 38
0
votes
1 answer

Is there any request limit for GeoLite2 free databases? (Hive UDF)

I've downloaded the free GeoLite databases from link. I am going to use them in hive-geo-ip-udf. Update: SELECT geoip(host,'COUNTRY_CODE','/home/dhruv/GeoLite2-Country.mmdb') from table_name; For the 64th entry I am getting FAILED:…
Dhruv Kapatel
  • 873
  • 3
  • 14
  • 27
0
votes
1 answer

Hive ua parser UDF gives IOException

I have user-agent strings stored in String format: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36 I want to extract the browser from the user-agent strings, so I have used ua-parser-java…
Dhruv Kapatel
  • 873
  • 3
  • 14
  • 27
0
votes
1 answer

How to write Hive UDFs

I am confused about how to use UDFs. Is it possible to replace the functionality of the bash script below with UDFs? #!/bin/bash src_count_q="use db;select count(*) from config_table where table_nm="test_source";" src_count=$(hive -e $src_count_q) …
user2895589
  • 1,010
  • 4
  • 20
  • 33
0
votes
1 answer

Hive UDF to fetch a value from the distributed cache not working with outer queries

We have written a Hive UDF in Java to fetch a value from a file added to the distributed cache, which works perfectly from a select query like: Query 1. select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename; But not…
Som
  • 91
  • 1
  • 10
0
votes
1 answer

Hive UDF global variable

Can anyone let me know if there is any way of having a global variable in a Hive UDF? I am trying to find a solution to the problem below. The scenario is as follows. I have three types of file: a file with 4 columns (let's assume the column names are…
Garfield
  • 396
  • 6
  • 19
0
votes
1 answer

How to read AWS S3 file content using a Hive UDF

I have a text file in Amazon S3 and I want to read the content of the file in my Hive UDF. I tried the code below, but it does not work. UDF Code: package jbr.hiveudf; import java.io.BufferedReader; import java.io.InputStreamReader; import…
Ranjith Sekar
  • 1,892
  • 2
  • 14
  • 18