Questions tagged [user-defined-functions]

A function provided by the user of a program or environment, most often in spreadsheet or database applications. Use [custom-functions-excel] for Excel and [custom-function] for Google Sheets. Specify a programming-language tag as well ([google-apps-script], [javascript], [sql], [tsql], etc.), plus a tag for the application ([excel], [google-spreadsheet], [sql-server], etc.).

In the context of a programming language or an environment, a User Defined Function (UDF) is a function created by a user to perform a specific task, as opposed to a function that is intrinsic to (built into) the programming language or environment.

Spreadsheet applications like Excel and Google Sheets call these "custom functions".

Microsoft also uses the term User Defined Functions with [sql-server]; the [tsql] tag may also be applicable. See: What is the need for user-defined functions in SQL Server?
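For illustration, a minimal Python sketch of the distinction between a built-in and a user-defined function (the function name and logic are invented for the example):

# len() is built into the language; count_vowels() is user-defined.
def count_vowels(text):
    """Return the number of vowels in a string."""
    return sum(1 for ch in text.lower() if ch in "aeiou")

print(len("spreadsheet"))           # built-in  -> 11
print(count_vowels("spreadsheet"))  # user-defined -> 4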


4875 questions
1
vote
1 answer

How can I install flashtext on every executor?

I am using the flashtext library in a couple of UDFs. It works when I run it locally in Client mode, but once I try to run it in the Cloudera Workbench with several executors, I get a ModuleNotFoundError. After some research I found that it is…
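A common fix is to ship the package to the executors and import it inside the UDF. A minimal sketch, assuming flashtext has been zipped locally (e.g. pip install flashtext -t deps, then zip the deps/flashtext folder) and that a boolean keyword check is the goal:

# A minimal sketch, assuming flashtext is pure Python and zipped as flashtext.zip.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import BooleanType

spark = SparkSession.builder.getOrCreate()
spark.sparkContext.addPyFile("flashtext.zip")  # ships the package to every executor

@F.udf(BooleanType())
def contains_keyword(text):
    # Import inside the UDF so the import happens on the executor,
    # after addPyFile has distributed the archive.
    from flashtext import KeywordProcessor
    kp = KeywordProcessor()
    kp.add_keyword("spark")
    return bool(kp.extract_keywords(text or ""))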
1
vote
1 answer

An exception was thrown from a UDF: 'SyntaxError: unexpected EOF while parsing'

import ast import json from pyspark.sql import functions as F from pyspark.sql.types import * schema = StructType([StructField('in_network', ArrayType(StructType([StructField('billing_code', StringType(), True), StructField('billing_code_type',…
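When the JSON lives in a string column, Spark's built-in from_json can apply such a schema without a Python UDF, which sidesteps eval-style parsing (a frequent source of "unexpected EOF while parsing" on truncated strings). A minimal sketch, reusing the field names from the excerpt with invented sample data:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, ArrayType, StringType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("in_network", ArrayType(StructType([
        StructField("billing_code", StringType(), True),
        StructField("billing_code_type", StringType(), True),
    ])), True),
])

df = spark.createDataFrame(
    [('{"in_network": [{"billing_code": "0001", "billing_code_type": "CPT"}]}',)],
    ["raw"],
)
# Parse the JSON string column with the declared schema instead of a UDF.
parsed = df.withColumn("parsed", F.from_json("raw", schema))
parsed.select("parsed.in_network").show(truncate=False)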
1
vote
0 answers

How to use KNN Imputer in PySpark

I want to use KNN Imputer on my Spark DataFrame, but it doesn't work as I expected. I have data that contains None values like…
KJH
  • 23
  • 3
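Spark ML ships an Imputer (mean/median/mode) but no KNN imputer, so one workaround is to run scikit-learn's KNNImputer on a pandas copy of the data; this only works when the data fits in driver memory. A minimal sketch with made-up columns:

import pandas as pd
from sklearn.impute import KNNImputer
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sdf = spark.createDataFrame(
    [(1.0, 2.0), (3.0, None), (None, 6.0)], ["a", "b"]
)

pdf = sdf.toPandas()                      # collect to the driver
imputed = KNNImputer(n_neighbors=2).fit_transform(pdf)
result = spark.createDataFrame(pd.DataFrame(imputed, columns=pdf.columns))
result.show()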
1
vote
2 answers

UDF function for a double datatype in PySpark

I am trying to create a column using a UDF in PySpark. The code I tried looks like this: # The function checks the year and adds a multiplied value_column to the final column def new_column(row, year): if year == "2020": return row *…
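A sketch of the usual pattern: declare the return type as DoubleType and pass both columns to the UDF; without a declared return type the result comes back null. The multiplier and data below are placeholders, not the asker's actual rule:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(10.0, "2020"), (10.0, "2021")], ["value_column", "year"])

@F.udf(DoubleType())
def new_column(value, year):
    # Hypothetical multiplier, standing in for the asker's business rule.
    return value * 2.0 if year == "2020" else value

df.withColumn("final", new_column("value_column", "year")).show()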
1
vote
2 answers

How to get a sequence string per row from 2 columns in PySpark?

I have the following data structure: The columns "s" and "d" indicate the transition of the object in column "x". What I want to do is get a transition string per object present in column "x", e.g. with a "new" column as follows: Is there a…
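One way to build such a transition string without a UDF is to concatenate the two columns per row and collect them per object. A minimal sketch with invented data; a real solution would also need an ordering column to keep the transitions in sequence:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("obj1", "A", "B"), ("obj1", "B", "C"), ("obj2", "X", "Y")],
    ["x", "s", "d"],
)

result = (
    df.withColumn("step", F.concat_ws("->", "s", "d"))   # "A->B" per row
      .groupBy("x")
      .agg(F.concat_ws(", ", F.collect_list("step")).alias("new"))
)
result.show(truncate=False)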
1
vote
2 answers

Count unique values for every row in PySpark

I have PySpark DataFrame: from pyspark.sql.types import * schema = StructType([ StructField("col1", StringType()), StructField("col2", StringType()), StructField("col3", StringType()), StructField("col4", StringType()), ]) data = [("aaa",…
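This particular count does not need a UDF: wrapping the columns in an array and applying array_distinct and size gives a per-row distinct count. A minimal sketch, assuming "unique" means distinct values across the four columns:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("aaa", "bbb", "aaa", "ccc"), ("x", "x", "x", "x")],
    ["col1", "col2", "col3", "col4"],
)

# Count distinct values across the row without a Python UDF.
df.withColumn(
    "n_unique",
    F.size(F.array_distinct(F.array("col1", "col2", "col3", "col4"))),
).show()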
1
vote
1 answer

Converting apply from pandas to a pandas_udf

How can I convert the following sample code to a pandas_udf: def calculate_courses_final_df(this_row): some code that applies to each row of the data df_contracts_courses.apply(lambda x: calculate_courses_final_df(x),…
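A row-wise pandas apply usually maps onto mapInPandas (or applyInPandas for grouped work), where the same per-row logic runs on pandas DataFrames inside Spark. A minimal sketch; the column names and the doubling step are placeholders for the asker's calculation:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sdf = spark.createDataFrame([(1, 10.0), (2, 20.0)], ["contract_id", "hours"])

def calculate_courses_final_df(iterator):
    # Each element is a pandas DataFrame holding one chunk of the Spark data.
    for pdf in iterator:
        # Equivalent of the old pdf.apply(lambda row: ..., axis=1)
        pdf["final"] = pdf["hours"] * 2
        yield pdf

sdf.mapInPandas(
    calculate_courses_final_df,
    schema="contract_id long, hours double, final double",
).show()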
1
vote
1 answer

Snowflake udf logic

I have created a SQL UDF that returns a table, and it works as it should. Now I would like to add some logic so that, based on the input parameters of the UDF, different queries are used, e.g.: if input_parameter = A then SELECT * FROM table where…
jvels
  • 319
  • 2
  • 16
1
vote
1 answer

User-Defined Function Runs But Doesn't Display Value

I've written a VBA UDF to calculate an AQL Sample Size by the Inspection Level (IL Variable in my code), and Lot size (Batch Variable in my code). In an Access userform, I have the values for my two variables displayed in a text box. I have a third…
Gary Nolan
  • 111
  • 10
1
vote
1 answer

Spark Scala UDF to count number of array elements contained in another string column

I have a spark dataframe df with 2 columns, say A and B, where A is array of string type and B is a string. For each row, I am trying to count how many elements in A are contained in B. The UDF I have written is as follows. I thought it should be…
Jin
  • 1,203
  • 4
  • 20
  • 44
1
vote
2 answers

Define and return two dimensional string array from function in C

I want to use a two-dimensional string array (new_sentence) that I produced in the function named parse() in the main function (or any other function). The code I wrote is below. strtok() and strcpy() functions work fine. I can access all elements…
1
vote
1 answer

How to call another object within Pyspark UDF function

I have a class Hello with a few methods. I would like to create a Hello object within a PySpark UDF, such as: def foo_generation(query_params): query_obj = Hello() foo = query_obj.hello_method(query_params) return…
Alex
  • 1,447
  • 7
  • 23
  • 48
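A common pattern is to construct the object inside the UDF so it is created on each executor rather than pickled from the driver; the class still has to be importable on the executors (installed there or shipped with --py-files). A minimal sketch; the module name hello_module is hypothetical:

from pyspark.sql import functions as F
from pyspark.sql.types import StringType

@F.udf(StringType())
def foo_generation(query_params):
    # Import and construct on the executor to avoid serializing a driver-side instance.
    from hello_module import Hello   # hypothetical module holding the asker's class
    query_obj = Hello()
    return query_obj.hello_method(query_params)

# usage: df.withColumn("foo", foo_generation("query_params"))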
1
vote
0 answers

Pass Strings into User Defined Function with Modeling

Trying to pass the name of a variable into a user defined regression function. Having a heck of a time. Here is the formula, run prior to converting it into a function: # Run cox proportional hazards regressions survey::svycoxph(Surv( time =…
1
vote
1 answer

PySpark column is appending udf's argument value

I have written a small program; it is working, but it is adding the argument value into the column, which I do not need. (Input, expected output, and actual output were shown as images.) Code: #!/usr/bin/env python import sys import logging from…
1
vote
0 answers

generate gender variables in spark using UDF

I am trying to generate a variable (M or F) with a udf, and I have the following code: @udf def gender(): return np.random.choice(["F","M"],p=[0.5,0.5]) second_df.select('ID', gender()) However, it does not work. Anyone have any idea what I am doing…
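A sketch of how such a UDF is usually made to work: give it a return type, convert the numpy value to a plain Python string, and mark it non-deterministic so Spark does not collapse the random call; the DataFrame here is invented:

import numpy as np
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()
second_df = spark.createDataFrame([(1,), (2,), (3,)], ["ID"])

# Return a plain str (not numpy.str_) and declare StringType as the return type.
gender = F.udf(
    lambda: str(np.random.choice(["F", "M"], p=[0.5, 0.5])), StringType()
).asNondeterministic()

second_df.select("ID", gender().alias("gender")).show()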