Questions tagged [udf]

A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where functions are usually assumed to be built into the program or environment. Although the term is widely known in Hadoop components such as Hive and Pig, it is also used in other contexts, such as programming languages and some DBMSs.
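As a rough illustration of the DBMS case, the sketch below registers a Python function as a UDF in an in-memory SQLite database using the standard-library `sqlite3` module. The function name `reverse_text` and the query are hypothetical and chosen only for this example.

```python
import sqlite3


def reverse_text(value):
    """Hypothetical UDF: return the input string reversed (NULL passes through)."""
    return None if value is None else value[::-1]


conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name "reverse_text" with 1 argument.
conn.create_function("reverse_text", 1, reverse_text)

# The function can now be called from SQL like a built-in.
row = conn.execute("SELECT reverse_text('udf')").fetchone()
print(row[0])
```

The database engine knows nothing about `reverse_text` until the user registers it, which is the defining trait of a UDF.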

From the docs:

Introduction

Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in three languages: Java, Python, and JavaScript.

The most extensive support is provided for Java functions. You can customize all parts of the processing including data load/store, column transformation, and aggregation. Java functions are also more efficient because they are implemented in the same language as Pig and because additional interfaces are supported such as the Algebraic Interface and the Accumulator Interface.

Limited support is provided for Python and JavaScript functions. These functions are new, still evolving, additions to the system. Currently only the basic interface is supported; load/store functions are not supported. Furthermore, JavaScript is provided as an experimental feature because it did not go through the same amount of testing as Java or Python. At runtime note that Pig will automatically detect the usage of a scripting UDF in the Pig script and will automatically ship the corresponding scripting jar, either Jython or Rhino, to the backend.
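A minimal sketch of what a basic-interface Python UDF for Pig might look like. The function name `to_upper`, the schema string, and the file name in the comment are assumptions made for illustration. The `pig_util` import is assumed to apply when Pig runs the file itself; the fallback decorator only exists so the file can be tried outside Pig.

```python
try:
    # Assumption: this module is provided when Pig executes the script.
    from pig_util import outputSchema
except ImportError:
    # Fallback so the function can be exercised without Pig installed.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def to_upper(word):
    """Hypothetical column transformation: upper-case a chararray field."""
    return None if word is None else word.upper()


# In a Pig script, a file like this (hypothetically named udfs.py) would
# typically be registered and called along these lines:
#   REGISTER 'udfs.py' USING jython AS myfuncs;
#   B = FOREACH A GENERATE myfuncs.to_upper(name);
```

This covers only a simple column transformation, in line with the note above that load/store functions are not supported for scripting UDFs.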

537 questions
3
votes
0 answers

Destroying Spark UDFs explicitly

I have a long-running SparkContext in which I define a UDF which takes in a large amount of data as its input. I noticed that the memory this UDF uses is not released back up until the SparkContext is terminated. I would like to find a way of…
bbtus
3
votes
3 answers

Remove everything but numbers from a cell

I have an Excel sheet where I use the following formula to get numbers from a cell that contains form text: =MID(D2;SEARCH("number";D2)+6;13) It searches for the string "number" and gets the next 13 characters that come after it. But sometimes…
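The logic described in this excerpt (find "number", take the next 13 characters, keep only digits) can be sketched outside Excel as follows. This is an illustration in Python, not the asker's formula or an accepted answer; the function name `digits_after` and its defaults are hypothetical.

```python
import re


def digits_after(text, marker="number", width=13):
    """Return only the digits found in the `width` characters after `marker`.

    The search is case-insensitive, mirroring Excel's SEARCH.
    Returns an empty string when the marker is absent.
    """
    start = text.lower().find(marker.lower())
    if start == -1:
        return ""
    begin = start + len(marker)
    window = text[begin:begin + width]
    # Drop every character that is not a digit.
    return re.sub(r"\D", "", window)


print(digits_after("Order number 12345-678 ok"))
```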
3
votes
2 answers

Excel VBA UDF autocorrects with wrong case

I have an Excel VBA addin that adds user defined functions (UDFs). These functions all work fine in terms of calculating their results. I enter them into the Function Wizard using MacroOptions. When using them in a worksheet, they autocorrect the…
Greg Lovern
3
votes
1 answer

Apache Spark. UDF Column based on another column without passing its name as argument.

There is a Dataset with a column firm, and I'm adding another column, firm_id, to it. Here's an example: private val firms: mutable.Map[String, Integer] = ... private val firmIdFromCode: (String => Integer) = (code: String) => firms(code) val…
max.kuzmentsov
3
votes
1 answer

Schema for type Any is not supported

I'm trying to create a Spark UDF to extract a Map of (key, value) pairs from a user-defined case class. The Scala function seems to work fine, but when I try to convert that to a UDF in Spark 2.0, I'm running into the "Schema for type Any is not…
Yash
3
votes
0 answers

No typeTag available Error in scala spark udf

I am getting "No TypeTag available" for Seq[String] while compiling the following code: val post_event_list_evar_lookup: (String => Seq[String]) = (pel: String) => { pel.split(",").filterNot(_.contains("=")).map(ev => { …
ab_
3
votes
1 answer

Apache Spark - UDF doesn't seem to work with spark-submit

I am unable to get a UDF to work with spark-submit. I don't have any problem when using spark-shell. Please see below the error message, sample code, build.sbt, and the command to run the program. I will appreciate all the help! - Regards, Venki ERROR…
3
votes
1 answer

Custom excel formula function UDF to count Conditional Formatting

Has anyone run across a function that will actually work with conditional formatting? There are some add-ons from Kutools and Ablebits, but they are not formula based (you have to select everything manually). I have found this, but only works with…
Orin Moyer
3
votes
2 answers

Creating user defined function for firebird 2.5 with c++builder 2010

I tried to create a simple user-defined function (UDF) for Firebird 2.5 with C++ Builder 2010, but I can't get it to work in Firebird. Creating a DLL project with default settings in C++ Builder 2010. Adding a unit with my example UDF…
3
votes
1 answer

Spark SQL UDF returning scala immutable Map with df.WithColumn()

I have a case class, case class MyCaseClass(City : String, Extras : Map[String, String]), and a user-defined function which returns scala.collection.immutable.Map: def extrasUdf = spark.udf.register( "extras_udf", (age : Int, name : String) =>…
fpopic
3
votes
0 answers

Is it possible to use a Pig UDF in Spark/Scala? How?

Is it possible to use a Pig UDF in Spark/Scala? Both support Java, so I thought we could. Can anybody share a trick or sample?
Sunil
3
votes
1 answer

High Aerospike latency

In the Aerospike set we have four bins, userId, adId, timestamp, eventype, and the primary key is userId:timestamp. A secondary index is created on userId to get all the records for a particular user, and the resulting records are passed to a stream UDF. On…
annu
3
votes
5 answers

UDF returns the same value everywhere

I am trying to code a moving average in VBA, but the following returns the same value everywhere. Function trial1(a As Integer) As Variant Application.Volatile Dim rng As Range Set rng = Range(Cells(ActiveCell.Row, 2),…
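The excerpt is truncated, so the actual cause cannot be confirmed here. One plausible reading, given the visible reference to ActiveCell, is that the function depends on global state rather than on its own arguments, and ActiveCell is the same for every cell being recalculated. The sketch below shows the general idea in Python: a moving average written as a pure function of its inputs. The name `moving_average` and its parameters are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average over `values`.

    Each output element averages up to `window` most recent inputs,
    so early elements use a shorter window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    result = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


print(moving_average([1, 2, 3, 4], 2))
```

Because the result depends only on the arguments passed in, each call site gets its own value.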
3
votes
4 answers

Is it possible to use MySQL UDF on Amazon RDS?

I need to send an HTTP request when a database changes, so I am using the mysqludf extension. It works locally, but how can I get it working on Amazon RDS too? If it's not possible, I need a solution to use a MySQL trigger together with the sys_exec…
Kukka
3
votes
1 answer

Passing & returning a list/array as a parameter/return type to a UDF in Redshift

I have a bunch of metrics that consume the entire list of float values of a column (think of a series of order values on which I am doing some outlier analysis, hence needing the entire array of values). Can I pass the entire list as a parameter? It…
ekta