I am trying to run a Spark LightGBMRegressor, connecting to Databricks with databricks-connect from PyCharm.
When I call "fit" on my data, I get the error NoClassDefFoundError: spray/json/JsonWriter.
Code I am trying to run:
import os

if "DATABRICKS_RUNTIME_VERSION" not in os.environ:
    from pyspark.sql import SparkSession
    from pyspark.dbutils import DBUtils
    spark = SparkSession.builder \
        .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.9.5") \
        .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven") \
        .getOrCreate()

from pyspark.ml.evaluation import RegressionEvaluator

train_data = featurizer.transform(x_trn)[experiment.config.target_col, 'features']
test_data = featurizer.transform(x_tst)[experiment.config.target_col, 'features']
train_data.groupBy(experiment.config.target_col)
model = splightgbm.LightGBMRegressor(
    numIterations=500,
    learningRate=0.05,
    featuresCol="features", labelCol=experiment.config.target_col
)
model.fit(
    train_data
)
This is the traceback:
py4j.protocol.Py4JJavaError: An error occurred while calling o1676.fit.
: java.lang.NoClassDefFoundError: spray/json/JsonWriter
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:242)
    at org.apache.spark.sql.util.SparkServiceObjectInputStream.readResolveClassDescriptor(SparkServiceObjectInputStream.scala:60)
    at org.apache.spark.sql.util.SparkServiceObjectInputStream.readClassDescriptor(SparkServiceObjectInputStream.scala:55)