Deploy Keras model on Spark

Question

I have a trained Keras model.

I have a large, continuously updated dataset that I want to get predictions on, so I plan to run my Spark job every 2 hours or so.

What is the right way to implement this? MLlib does not support EfficientNet.

When searching online I saw this kind of implementation using sparkdl, but it does not support EfficientNet as the modelName parameter.

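from pyspark.ml.classification import RandomForestClassifier
from sparkdl import DeepImageFeaturizer

featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features", modelName="InceptionV3")
rf = RandomForestClassifier(labelCol="label", featuresCol="features")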

My naive approach would be

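import efficientnet.keras as efn
from pyspark.sql.functions import col
from sparkdl import readImages

model = efn.EfficientNetB0(weights='imagenet')

image_df = readImages("flower_photos/sample/")
# efficient_net_udf is not written yet; it would wrap model.predict
image_df = image_df.withColumn("modelTags", efficient_net_udf(col("image.data")))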

and creating a UDF that calls model.predict...
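
A minimal sketch of how such a prediction step is often structured: the model is loaded once per partition rather than once per row, and rows are scored in batches. This is plain Python so it runs without a cluster; `predict_partition`, `load_model_fn`, and `FakeModel` are hypothetical names used only for illustration, and the fake model stands in for the real Keras one.

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of up to batch_size items from iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def predict_partition(rows, load_model_fn, batch_size=32):
    """Score one partition: load the model once, then predict batch by batch.

    rows          -- iterator of (key, features) pairs
    load_model_fn -- hypothetical zero-argument callable returning an object
                     with a predict(list_of_features) method
    """
    model = None
    for batch in batched(rows, batch_size):
        if model is None:
            # Loaded lazily, so an empty partition never loads the model.
            model = load_model_fn()
        keys = [key for key, _ in batch]
        features = [feat for _, feat in batch]
        for key, pred in zip(keys, model.predict(features)):
            yield key, pred


class FakeModel:
    """Stand-in for the Keras model so the sketch runs without Spark or Keras."""

    def predict(self, features):
        return [sum(f) for f in features]


load_calls = []


def load_fake_model():
    # Records each call so we can check the model is loaded only once.
    load_calls.append(1)
    return FakeModel()


rows = [("a.jpg", [1, 2]), ("b.jpg", [3, 4]), ("c.jpg", [5, 6])]
results = list(predict_partition(iter(rows), load_fake_model, batch_size=2))
```

In a Spark job, a function like this could be passed to `rdd.mapPartitions(lambda rows: predict_partition(rows, load_fn))`, where `load_fn` might call `keras.models.load_model("kerasModel.h5")` on the executor. That assumes the model file is reachable from every executor, which is not something the code above sets up.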

Another method I saw is

from keras.preprocessing.image import img_to_array, load_img
import numpy as np
import os
from pyspark.sql.types import StringType
from sparkdl import KerasImageFileTransformer

import efficientnet.keras as efn 
model = efn.EfficientNetB0(weights='imagenet')
model.save("kerasModel.h5")

def loadAndPreprocessKeras(uri):
  # EfficientNetB0 with ImageNet weights expects 224x224 inputs
  image = img_to_array(load_img(uri, target_size=(224, 224)))
  image = np.expand_dims(image, axis=0)
  return image

transformer = KerasImageFileTransformer(inputCol="uri", outputCol="predictions",
                                        modelFile='path/kerasModel.h5', 
                                        imageLoader=loadAndPreprocessKeras,
                                        outputMode="vector")

image_dir = "/data/myimages"
files = [os.path.abspath(os.path.join(image_dir, f)) for f in os.listdir(image_dir) if f.endswith('.jpg')]
uri_df = sqlContext.createDataFrame(files, StringType()).toDF("uri")

keras_pred_df = transformer.transform(uri_df)

What is the correct (and working) way to approach this?

In your case, I'd try a Spark-native DL framework such as BigDL to import the model: https://github.com/intel-analytics/BigDL/blob/master/docs/docs/ProgrammingGuide/keras-support.md or Deeplearning4J — Emiliano Martinez, Oct 03 '19 at 19:43
I haven't tried the second option; I'm just saying that you could test another framework. What error do you get with the second one? Can you post the stack trace? — Emiliano Martinez, Oct 03 '19 at 21:11
