Questions tagged [udf]

A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment. Although the term is widely known in Hadoop components such as Hive and Pig, it is also used in other contexts such as programming languages and some DBMSs.

From the docs:

Introduction

Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in three languages: Java, Python, and JavaScript.

The most extensive support is provided for Java functions. You can customize all parts of the processing including data load/store, column transformation, and aggregation. Java functions are also more efficient because they are implemented in the same language as Pig and because additional interfaces are supported such as the Algebraic Interface and the Accumulator Interface.

Limited support is provided for Python and JavaScript functions. These functions are new, still evolving, additions to the system. Currently only the basic interface is supported; load/store functions are not supported. Furthermore, JavaScript is provided as an experimental feature because it did not go through the same amount of testing as Java or Python. At runtime note that Pig will automatically detect the usage of a scripting UDF in the Pig script and will automatically ship the corresponding scripting jar, either Jython or Rhino, to the backend.
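
As a rough illustration of the "basic interface" for scripting UDFs described above, here is a minimal sketch of a Python function that could be registered from a Pig script. The function name `to_upper` and the schema string are hypothetical. The fallback `outputSchema` decorator is only a local stand-in (an assumption, not Pig's own implementation) so that the file can also be run and tested outside Pig, where the real decorator is supplied by the Pig runtime.

```python
# Sketch of a Pig scripting UDF. Names and schema are illustrative only.
try:
    # Provided by Pig when running CPython (streaming_python) UDFs.
    from pig_util import outputSchema
except ImportError:
    # Local stand-in so the module can be exercised without Pig installed.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema  # record the declared schema on the function
            return func
        return wrap


@outputSchema("upper:chararray")
def to_upper(value):
    """Return the upper-cased string, passing nulls through unchanged."""
    if value is None:
        # Pig passes null fields as None; guard against them explicitly.
        return None
    return value.upper()
```

In a Pig script such a file would typically be registered with something like `REGISTER 'myudfs.py' USING jython AS myudfs;` and then called as `myudfs.to_upper(name)`; the file name and alias here are placeholders.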

537 questions
0 votes, 0 answers

Index Match lookup function in VBA

I am trying to build lookup function in VBA imitating formula: =INDEX(get_column, MATCH(lookup, lookup_column,0),1) I have cooked this: Public Function IndexMatch(get_column As Range, lookup As Range, lookup_column As Range) As Variant …
Przemyslaw Remin
0 votes, 2 answers

Using user-defined functions to dynamically update temp tables and database table entries

What I need to do is change an ActivePath entry in SQL that changes both in value and length to a different path of varying value and length then run this over and over until there are no more entries that match the ActivePath to be changed. This is…
Trevor
0 votes, 2 answers

java.lang.NullPointerException in UDF in PIG

my UDF converts the given input to UPPER case package myudfs; import java.io.IOException; import org.apache.pig.EvalFunc; import org.apache.pig.data.Tuple; public class UPPER extends EvalFunc { public String exec(Tuple input) throws…
pam18
0 votes, 2 answers

Cells using UDF in Excel do not re-calculate

I have written this simple UDF to make a computation in Excel spreadsheets. The code seems to work fine, but only once. However, if I change the values in the table (SharePriceGrowthTable), the results in the cells are not updated. Even when I hit…
Jorge O-L
0 votes, 1 answer

Is it possible to insert value from UDF in Impala

I am trying my hands on UDF in Impala. I have successfully developed a simple UDF. I want to try something different. My idea is to have a UDF which will take 3 different arguments. For e.g. String, double, int and these values which are received…
Mayank
0 votes, 3 answers

Spreadsheet code reuse

Does anyone know of a way to wrap up a worksheet either as a UDF function? Essentially I'd like to create a worksheet or workbook which carries out certain calculations and then reuse this code in other worksheet or workbooks. Ideally the UDF would…
Tim Galvin
0 votes, 1 answer

Renaming CUBEVALUE function to something shorter?

I've been using a rather long embedded CUBEVALUE() function, which is a pain to work with. It looks something like: =IFERROR(VALUE(CUBEVALUE(arg1;arg2;arg3));CUBEVALUE(arg1;arg2;arg3)) Due to the CUBEVALUE function and its arguments, it's becoming…
Jerros
0 votes, 1 answer

Apache Pig ToDate UDF Timestamp format

I am using ToDate UDF in pig for generating a datetime field. Input is in yyyy-MM-dd format. ToDate(sch_trans_dt,'yyyy-MM-dd','Etc/GMT+7') is generating the value with a colon in timestamp field as 2015-11-26T00:00:00.000-07:00 Is there a way to…
Pradeep S
0 votes, 1 answer

Getting the address of the cell a function has been typed into

I have been unable to figure this out, and googling has yielded no help! I have a VBA function which I would like to run in individual cells in a spreadsheet. The function needs to know the row the function was just typed into. I currently pass…
fp1991
0 votes, 1 answer

Replace at every 5th semicolon

I was wondering if it's possible to use Hive's regexp_replace at every nth occurrence. In my case I would like to replace every 5th semicolon with a pipe. Example of column data: test;vid;1;;1.45;id:3;manlyman;2;4;; So there would be 2 replaces in this one.…
Neil
0 votes, 0 answers

How to replace string building UDF for performance

I have a number of UDF (user defined functions) in SQL2005 that are called from various stored procedures. I did not realise quite how much these slow down retrieval of records, so I'd like to figure out a faster way to create strings. I've looked…
0 votes, 1 answer

PIG TRIM and UPPER

I am new to Hadoop programming, looking for help in Pig. I have data coming from simple.txt, with , as the delimiter. I have two use cases. I want to do ltrim(rtrim()) on all the columns and turn to UPPER for selected fields. Here is my script: party =…
LazyBones
0 votes, 1 answer

Flatten tuple of bags and tuples

I have a complex tuple with bags and tuples. How do I flatten it and access the bags? I tried this code: X = ({(a,b)},{(c,d),(e,f)},({(c,d),(e,f)},{g}),({(c,d),(e,f)},{h})) Y = FOREACH X flatten($0); Y = FOEACH Y GENERATE Y.$0; But this doesn't…
nnc
0 votes, 1 answer

ERROR 1070 Apache Pig, using built-in UDF

This, this, and this, did not solve my problem. They all are making their own UDFs. I want to use a built-in UDF. Any built-in UDF. I get the same or similar error for every UDF I have tried. FOO = LOAD 'filepath/data.csv' USING PigStorage(',') …
wugology
0 votes, 1 answer

Using AliasableEvalFunc and reading a bag of tuples in Java UDF

I have a pig script which sends a tuple which in turn contains a bag of tuples to a Java UDF. In the UDF, I read the tuple by alias using AliasableEvalFunc. I'm able to read the bag by its alias but not the tuples within the bag by their alias. For…
coder