I am trying to parallelize the LIME framework on Databricks, but I have not been able to. How can I send each observation to a different worker? Any help would be appreciated.
The explainer comes from the lime framework.
Code below:
import numpy as np
import lime
import lime.lime_tabular

explainer = lime.lime_tabular.LimeTabularExplainer(
    X_train, feature_names=train_columns,
    class_names=['look_forward_Repatha'], verbose=True,
    mode='regression')
--------------------------------
def calculate_in_parallel(line):
    # Explain one observation and return its (feature, weight) pairs
    test_nparray = np.array(line)
    exp = explainer.explain_instance(test_nparray, xgb_model.predict,
                                     num_features=30)
    return exp.as_list()

test_rdd = sc.parallelize(df_pred_X_test_skew_nohighcoll.values)
test_rdd = test_rdd.map(calculate_in_parallel)
test_rdd = test_rdd.collect()
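
For reference, here is a minimal sketch of a per-partition variant of the code above. It assumes the trouble comes from shipping the driver-side explainer to the workers, which is only a guess. The names explain_partition and make_explainer are hypothetical: make_explainer stands for any zero-argument function that builds a LimeTabularExplainer on the worker. Only RDD.mapPartitions, explain_instance and as_list are real calls here.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def explain_partition(
    rows: Iterable[Any],
    make_explainer: Callable[[], Any],   # hypothetical factory, runs on the worker
    predict_fn: Callable[[Any], Any],
    num_features: int = 30,
) -> Iterator[List[Tuple[str, float]]]:
    """Build one explainer per partition, then explain every row in it."""
    explainer = make_explainer()  # constructed once per partition, not once per row
    for row in rows:
        # In the real job the row would be wrapped with np.array(row),
        # as in calculate_in_parallel above.
        exp = explainer.explain_instance(row, predict_fn,
                                         num_features=num_features)
        yield exp.as_list()


# Possible use on the cluster (not executed here; assumes sc, xgb_model and
# a make_explainer function are defined as described above):
#
# result = (sc.parallelize(df_pred_X_test_skew_nohighcoll.values)
#             .mapPartitions(lambda rows: explain_partition(
#                 rows, make_explainer, xgb_model.predict))
#             .collect())
```

Whether this helps depends on the actual error, which I have not pinned down yet. The idea is only that each partition builds its own explainer instead of receiving a pickled copy from the driver.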