Questions tagged [udf]

A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment. Although the term is widely known in Hadoop components such as Hive and Pig, it is also used in other contexts, such as programming languages and some DBMSs.

From the docs:

Introduction

Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in three languages: Java, Python, and JavaScript.

The most extensive support is provided for Java functions. You can customize all parts of the processing including data load/store, column transformation, and aggregation. Java functions are also more efficient because they are implemented in the same language as Pig and because additional interfaces are supported such as the Algebraic Interface and the Accumulator Interface.

Limited support is provided for Python and JavaScript functions. These functions are new, still evolving, additions to the system. Currently only the basic interface is supported; load/store functions are not supported. Furthermore, JavaScript is provided as an experimental feature because it did not go through the same amount of testing as Java or Python. At runtime note that Pig will automatically detect the usage of a scripting UDF in the Pig script and will automatically ship the corresponding scripting jar, either Jython or Rhino, to the backend.
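
As a rough illustration of the scripting support described above, here is a minimal sketch of a Python UDF of the kind Pig runs through Jython. The function name `to_upper`, the schema string, and the fallback decorator are illustrative assumptions, not taken from the Pig docs. Under Pig's Jython engine the `outputSchema` decorator is expected to be supplied by Pig itself; the stand-in below exists only so the file also runs as ordinary Python.

```python
# Sketch of a Pig scripting UDF written in Python (run by Pig through Jython).
# Assumption: Pig makes the outputSchema decorator available when it loads
# the script; the fallback is only used when running as plain Python.
try:
    outputSchema  # expected to be provided by Pig at registration time
except NameError:
    def outputSchema(schema):
        """Stand-in decorator: records the schema string on the function."""
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def to_upper(word):
    """Hypothetical UDF: upper-case one chararray field, passing nulls through."""
    if word is None:
        return None
    return word.upper()
```

In a Pig script, a file like this (say, a hypothetical `udfs.py`) would typically be registered with `REGISTER 'udfs.py' USING jython AS myfuncs;` and then called as `myfuncs.to_upper(name)` inside a `FOREACH ... GENERATE` statement.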

537 questions
3
votes
1 answer

Is there a way to print or log variable values in a udf function in bigquery?

I have both SQL and UDF functions that work fine together. I want to compare a record's date, which is an integer in yyyymmdd format, with today's date in my determineText() function. Is there a method like console.log() or document.write() in…
Ayberk Yavuz
3
votes
2 answers

Aerospike - Query with Multiple Filter parameters

I'm trying to query Aerospike using multiple filters, taking reference from this link. I am able to query Aerospike with the given Lua script for 1 filter parameter, but I am stuck with the Lua script when I have to pass more than 2 filter parameters (for…
3
votes
1 answer

BigQuery UDF reducing all rows

I have the following UDF defined (note my table had an 'Id' and a 'Reading' object with subfield 'RawHex'):

    // UDF definition
    function hexdecode(row, emit) {
      emit({ Id: row.Id, converted: decodeHelper(row.Reading.Raw) });
    }
    // Helper…
GeorgeWilson
3
votes
3 answers

Function to return "too low", "too high" or "OK" for each cell in a range

I want a function to run through a range of cells and if: any are greater than NormalValue then return 'too low', NormalValue is greater than double the maximum value in the range then return 'too high', neither of these are true, then return …
3
votes
2 answers

Is there a data type for time format hh:mm:ss in Hive

I am processing files that contain the call details of different users. In the data file, there is a field call_duration which contains the value in the format hh:mm:ss, e.g. 00:49:39, 00:20:00, etc. I would like to calculate the total call…
smang
3
votes
1 answer

sql create columns from group by collection

I have a table in the following form:

chain | branch
------|--------
a     | UK
a     | US
b     | ISRAEL
b     | UK
b     | FRANCE
b     | BELGIUM
c     | NIGERIA

and I would like to create a new table in the following format: chain …
BigScratch
3
votes
1 answer

CROSS APPLY WITH UDF

Create function getbyID ( @id int ) Returns table as return( select * from Products where ProductID=@id+10) The function above returns all records of Products where the product ID is greater than 10. When used with CROSS APPLY as…
Java Main
3
votes
0 answers

Pig Java UDF: Generating dynamic tuple schema based on input parameters

EDIT: I'm going to try to explain in general what I want to do. One row of input looks like: field1, field2, textfield. Now textfield is a string entry with a fixed number of characters. I want to parse this string to extract substrings from…
Kyle
3
votes
2 answers

Where to contribute Apache Pig UDF?

I have built some UDFs in Apache Pig and want to make them available as open source. Can someone help me find out where and how I can publish them?
Subhradip Bose
2
votes
1 answer

Force UDF in VBA to display a MsgBox when the user enters more than expected arguments?

When the user enters too many arguments for the COUNTBLANK function, the function displays this error message and returns to edit mode: You've entered too many arguments for this function. How to make any UDF work like that? For example: Function…
2
votes
1 answer

How to create a UDF to find index in an array column

I have a table as below:

val question = sqlContext.createDataFrame(Seq((1, Seq("d11","d12","d13")), (2, Seq("d21", "d22", "")))).toDF("Id", "Dates")

+---+---------------+
| Id|          Dates|
+---+---------------+
|  1|[d11, d12, d13]|
|  2| …
Regina
2
votes
1 answer

scala class member function as UDF

I am trying to define a member function in a class that would be used as a UDF while parsing data from a JSON file. I am using a trait to define a set of methods and a class to override those methods. trait geouastr { def getGeoLocation(ipAddress:…
Guruprasad
2
votes
1 answer

Datalab create a BigQuery UDF returning a STRUCT

When using Google Cloud DataLab, I am struggling to create a UDF that returns a STRUCT. As a minimal example, if I do this in a datalab notebook: %bq udf -n demo -l js // Some fn description // @param x FLOAT64 // @returns STRUCT var…
Stewart_R
2
votes
3 answers

How to enable user defined functions in docker instance of cassandra?

Getting the following error when I try to create a simple subtraction function in Cassandra: user defined functions are disabled in cassandra.yaml set enable_user_defined_functions=true I can't figure out how to set it to true. Where do I go to do…
J4ce
2
votes
0 answers

xlwings ActiveX component can't create an object when importing UDFs

I have a sheet that runs a user-defined function from Python via xlwings. When I try to open the same file from a different machine on the same network, I get the error in the title when I click Import UDFs from the xlwings tab. If I create a new…
toniggg