pyspark.sql.utils.AnalysisException: No handler for UDF/UDAF/UDTF 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFHistogramNumeric': java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.udf.generic.SimpleGenericUDAFParameterInfo.<init>([Lorg.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;, boolean, boolean); line 4 pos 29

I get the above error when I try to use histogram_numeric from Hive in Spark SQL.

I've included the relevant hive-exec jar and enabled Hive support, so I'm starting to wonder whether this simply isn't supported at the moment.

Hive version: 3.1.2
Spark version: 3.0.0

If someone has a simple snippet that works for them when registering Hive UDAFs in Spark 3.0.0, that would be very useful too.

Andrew Seymour
  • Can you share a sample code snippet for what you are trying to do? – Amit Singh Sep 25 '20 at 17:44
  • I tried running the following code snippet on Spark 3.0.0 and it worked without any errors: https://www.codepile.net/pile/OQK1024M. Are you trying to do something similar, or is your question about something else entirely? Please update your question accordingly. – Amit Singh Sep 25 '20 at 17:59

1 Answer

I tried to register the Hive UDAF via hiveCtx.udf.registerJavaUDAF, but had no luck.

hiveCtx.udf.registerJavaUDAF("histogram_numeric", "org.apache.hadoop.hive.ql.udf.generic.GenericUDAFHistogramNumeric")

The Hive class that implements histogram_numeric is there, but it doesn't conform to Spark's Java UDAF interface.

However, I found that calling it through the DataFrame's selectExpr works. I don't know why.

users_spark_df.selectExpr('histogram_numeric(age, 2)')

See also: Making histogram with Spark DataFrame column
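If the Hive UDAF cannot be made to work, one Spark-native option is RDD.histogram, which does not touch the Hive function at all. Here is a minimal sketch, assuming a DataFrame like users_spark_df with a numeric age column as in the snippet above. The helper name rdd_histogram is hypothetical, and the result uses equal-width buckets, so it will not match the output of histogram_numeric:

```python
def rdd_histogram(df, column, buckets):
    """Return (lower_edge, upper_edge, count) tuples for one numeric column.

    Assumption: `df` is a PySpark DataFrame and `column` names a numeric
    column. `buckets` is the number of equal-width buckets to compute.
    """
    # Each Row of a single-column DataFrame is a 1-tuple, so flatMap
    # unwraps the rows into an RDD of plain values.
    values = df.select(column).rdd.flatMap(lambda row: row)

    # RDD.histogram(n) returns (bucket_edges, counts), where
    # len(bucket_edges) == len(counts) + 1.
    edges, counts = values.histogram(buckets)

    # Pair each count with the lower and upper edge of its bucket.
    return list(zip(edges[:-1], edges[1:], counts))
```

Calling rdd_histogram(users_spark_df, 'age', 2) would then return a list with one (lower, upper, count) tuple per bucket.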

hyim
  • selectExpr gives me the same errors. I did not know Spark had its own histogram functions though, which I think will serve my purpose. – Andrew Seymour Sep 26 '20 at 09:16