Questions tagged [udf]

A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment. Although the term is widely known in Hadoop components such as Hive and Pig, it is also used in other contexts, such as programming languages and some DBMSs.

From the docs:

Introduction

Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in three languages: Java, Python, and JavaScript.

The most extensive support is provided for Java functions. You can customize all parts of the processing including data load/store, column transformation, and aggregation. Java functions are also more efficient because they are implemented in the same language as Pig and because additional interfaces are supported such as the Algebraic Interface and the Accumulator Interface.

Limited support is provided for Python and JavaScript functions. These functions are new, still evolving, additions to the system. Currently only the basic interface is supported; load/store functions are not supported. Furthermore, JavaScript is provided as an experimental feature because it did not go through the same amount of testing as Java or Python. At runtime note that Pig will automatically detect the usage of a scripting UDF in the Pig script and will automatically ship the corresponding scripting jar, either Jython or Rhino, to the backend.
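As a rough illustration of the "basic interface" for scripting UDFs described above, here is a minimal sketch of a Python function that Pig could call once per input value. The function name to_upper and the schema string are hypothetical, and the way the outputSchema decorator is obtained is an assumption that depends on the script engine in use; the fallback definition is there only so the file can also be run outside Pig.

```python
# Sketch of a Pig scripting UDF using only the basic interface.
# Assumption: Pig supplies an `outputSchema` decorator at runtime; the
# fallback below exists only so this file also runs outside Pig.
try:
    outputSchema  # already defined if the script engine injected it
except NameError:
    try:
        from pig_util import outputSchema
    except ImportError:
        def outputSchema(schema):
            """Stand-in decorator: record the schema string on the function."""
            def decorate(func):
                func.outputSchema = schema
                return func
            return decorate


@outputSchema("word:chararray")
def to_upper(word):
    """Return the upper-cased input, passing nulls (None) through unchanged."""
    if word is None:
        return None
    return word.upper()


if __name__ == "__main__":
    # Quick local check outside Pig.
    print(to_upper("pig"))
```

In a Pig script, a file like this (hypothetically saved as udfs.py) would typically be registered with REGISTER 'udfs.py' USING jython AS myfuncs; and then called as myfuncs.to_upper(name) inside a FOREACH ... GENERATE statement.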

537 questions
0 votes, 0 answers

Performance improvement of Pig script using Python UDF

The following Pig (0.15) script maps the input file (alias cdrs) against another file (alias mastergt). It calls a Python (2.7.11) UDF to do the mapping, which takes 40 minutes for about 4.5K records. Can you please suggest…
Amit
0 votes, 1 answer

Spark Scala DF: add a new column to a DF based on processing some rows of the same column

I'm new to Spark Scala. I have a DF with two columns, "UG" and "Counts", and I would like to obtain the third as shown in this list. DF: UG, Counts, CUG (the columns) of 12 4 of 23 4 the 134 …
0 votes, 1 answer

Cannot solve these errors Java (Pig UDF) adding libraries, org.apache

package com.mirox.weblog; //error here -The type org.apache.commons.logging.Log cannot be resolved. It is indirectly referenced from required .class files import java.io.IOException; import java.text.SimpleDateFormat; import…
user4019860
0 votes, 1 answer

Using VBA to parse and split a string with wildcards?

I've got a sheet that contains item numbers of alphanumeric characters, and a bunch of other information in the row. Sometimes, similar items are combined into one row, and the difference on the item number will be shown with (X/Y) to choose which…
gualdhar
0 votes, 2 answers

Change a tuple using pig

I need to substitute characters of a tuple using a Pig UDF. For example, if I have a line in the file such as "hello world, Hello WORLD, hello\WORLD", it needs to be transformed to "hello_world,hello_world,hello_world". To accomplish this, I tried below…
Revanth
0 votes, 0 answers

Product attribute combinations generation in Excel

I have a table which contains 13.931 rows and 2 columns; the first column is SKUs, the second is Options (Size, Colour, etc.): SKU Option 0001 Size:S 0001 Size:M 0001 Size:L 0001 Size:XL 0001 Colour:Red 0001 Colour:Blue 0002 Size:S 0002 Size:M 0002…
0 votes, 1 answer

Excel VBA UDF to manipulate string doesn't work

My UDF in VBA is complaining about invalid qualifier and I don't understand why. Thank you in advance for your help.
elwindly
0 votes, 1 answer

Array definition in Teradata Aggregate UDF

I'm trying to create an aggregate UDF in Teradata. As part of it, I'm trying to declare an array in intermediate storage. It keeps throwing the error below when I try to link it to Teradata. Executed as Single statement. Failed [7504…
0 votes, 1 answer

SQL - Using logical operators in a UDF case statement

So it took a while for me to figure out how to create my first UDF but after I fixed it, I figured my next one would be a piece of cake. Unfortunately, it hasn't been the case. I'm pulling a field (ORIG_CLAIM, float) and I want to categorize that…
Niq6
0 votes, 2 answers

Python as Hive UDF - Clean Exit on Exception

I'm trying to do a clean exit from the program whenever my Python script, used as a Hive UDF, fails with an exception. Here is an example: SELECT TRANSFORM (id, name) USING 'D:\Python27\python.exe streaming.py' AS (id string, name string, count integer)…
Alekhya Vemavarapu
0 votes, 2 answers

Apache Drill: How to create a user-defined function that works like JSON.parse in JavaScript?

Sample json document: { "chats": [ { "chatID": 123, "agentComments": "[{\"agentID\":\"agent1\", \"queueID\":\"queue1\", \"comment\":\"Visitor's query not relevant for this queue.\"}, {\"agentID\":\"agent2\", \"queueID\":\"queue2\",…
Zrest
0 votes, 1 answer

UDF inside UDF doesn't work (ByRef argument type mismatch)

I'm building a UDF that uses another one inside it, but I received an error "ByRef argument missing" on an iteration variable. I have this string "-60981--:54044,-60981--:54044,-60981--:53835,-60981--:53835," and I want to remove duplicated items, so I…
Jey
0 votes, 0 answers

How to show the parameters of a UDF when typing the function in Excel VBA?

I have a custom Excel function, but no parameters are shown when I type it. How can I show the parameters of this function, as with built-in Excel functions like the one below? My Excel function: Public Function DiscreteEmp(ByVal rng As Range) c =…
Ali Tor
0 votes, 0 answers

Trying to create R UDF in Vertica

So I have a long function that I run in R every month. My goal is to create a Vertica UDF using Vertica's ability to run functions written in R. My hope is that this can then be automated from my company's data warehouse. I've looked all over…
ben890
0 votes, 0 answers

Is it possible to obtain a notification from MySQL for a running program?

I'd like to know if it's possible to have, for example, a Python program waiting for input so that, when a change occurs in a specific database table, the program is notified with the information I want to obtain from that exact table. Thanks in…