The problem concerns using Hive UDF jars in PySpark code. We are following the standard set of steps below:
- Create a temporary function in the PySpark code via spark.sql():
spark.sql("create temporary function public_upper_case_udf as 'com.hive.udf.PrivateUpperCase' using JAR 'gs://hivebqjarbucket/UpperCase.jar'")
- Invoke the temporary function in subsequent spark.sql statements.
The issue we are facing: if the Java class in the jar file is not explicitly declared public, the spark.sql invocation of the Hive UDF fails with the following error:
org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 'com.hive.udf.PublicUpperCase'
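A plausible explanation (my assumption, not stated in the error itself): Hive/Spark locates and instantiates the handler class reflectively, and a class declared without the public modifier is package-private, so it is not accessible from outside its own package and the handler lookup fails. A minimal, self-contained sketch of the visibility distinction (class names here are illustrative, not from the actual jar):

```java
import java.lang.reflect.Modifier;

public class Main {
    // Analogous to the question's PrivateUpperCase: no 'public' modifier,
    // so this class is package-private.
    static class PackagePrivateUpperCase {
        public String evaluate(String value) {
            return value.toUpperCase();
        }
    }

    public static void main(String[] args) {
        // Reflective frameworks that check the class's modifiers will see
        // that the public bit is not set on a package-private class.
        Class<?> udfClass = PackagePrivateUpperCase.class;
        System.out.println("public? " + Modifier.isPublic(udfClass.getModifiers()));
    }
}
```

Running this prints `public? false`, which is consistent with the registration failing until the class is declared public.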
Java Class Code
import org.apache.hadoop.hive.ql.exec.UDF;

// Note: no 'public' modifier, so the class is package-private
class PrivateUpperCase extends UDF {
    public String evaluate(String value) {
        return value.toUpperCase();
    }
}
When I make the class public, the issue is resolved.
The question is whether making the class public is the only solution, or whether there is another way around it.
Any assistance is appreciated.
Note - The Hive jars cannot be converted to Spark UDFs owing to their complexity.