I want to deploy a classifier I trained with MLlib behind an HTTP service. If I load the serialized model in my code and send it some data, do I also need to run a local instance of Spark? And if so, is there any effect of running multiple instances of the service on the same machine (do I have to configure each Spark instance separately)?

Basically, I want to avoid starting a Spark job each time a new classification request comes in, and I do not have a Spark Streaming setup.

cheers

zero323
ilijaluve
  • So what is the question here? Some (not all) MLlib models are represented as local objects, so they can be used without running Spark, but with the ongoing migration from MLlib to ML this is unlikely to be a future-proof approach. – zero323 Apr 25 '16 at 11:05
  • Sure, but to do any data preprocessing for input into an ML model, I think I have to use Spark to create DataFrames or RDDs, which requires a SparkContext. So the question is what happens when I ask Spark to preprocess data on a large scale with Spark started in the memory of a JVM process (not a standalone Spark on one machine). – ilijaluve Apr 25 '16 at 12:42
  • For ML, yes; for MLlib, not always. Many models (like regression models) can work without a running context. – zero323 Apr 25 '16 at 12:44
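
To illustrate the point made in the comments that a regression-style model can score a single input without a running context, here is a minimal sketch in plain Python. It assumes the trained weights and intercept have already been exported from the model. The class `LocalLogisticModel`, its method names, and the example numbers are all hypothetical and not part of any Spark API. It applies the usual binary logistic rule (sigmoid of the linear margin, compared against a threshold) and is not claimed to reproduce MLlib's behaviour exactly.

```python
import math
from typing import Sequence


class LocalLogisticModel:
    """Hypothetical local stand-in for a binary linear classifier.

    Assumes weights and intercept were exported from a trained model.
    No SparkContext is needed to score one feature vector.
    """

    def __init__(self, weights: Sequence[float], intercept: float,
                 threshold: float = 0.5) -> None:
        self.weights = [float(w) for w in weights]
        self.intercept = float(intercept)
        self.threshold = threshold

    def predict_proba(self, features: Sequence[float]) -> float:
        """Return the sigmoid of the linear margin w . x + b."""
        if len(features) != len(self.weights):
            raise ValueError("feature vector length does not match weights")
        margin = sum(w * x for w, x in zip(self.weights, features))
        margin += self.intercept
        # Numerically stable sigmoid: avoid exp() overflow for large |margin|.
        if margin >= 0:
            return 1.0 / (1.0 + math.exp(-margin))
        e = math.exp(margin)
        return e / (1.0 + e)

    def predict(self, features: Sequence[float]) -> int:
        """Return class 1 if the probability exceeds the threshold, else 0."""
        return 1 if self.predict_proba(features) > self.threshold else 0


# Build the model once at service startup (made-up example parameters),
# then reuse it for every incoming request.
model = LocalLogisticModel(weights=[0.5, -1.0], intercept=0.25)
label = model.predict([1.0, 0.0])
```

In an HTTP service, the model object would be created once when the process starts and shared across request handlers, so no job startup happens per request. Any preprocessing that does rely on DataFrames or RDDs would still need to be handled separately.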

0 Answers