from pyspark.ml.recommendation import ALS, ALSModel
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.mllib.evaluation import RegressionMetrics, RankingMetrics
from pyspark.ml.evaluation import RegressionEvaluator

als = ALS(maxIter=15,
          regParam=0.08,
          userCol="ID User",
          itemCol="ID Film",
          ratingCol="Rating",
          rank=20,
          numItemBlocks=30,
          numUserBlocks=30,
          alpha=0.95,
          nonnegative=True,
          coldStartStrategy="drop",
          implicitPrefs=False)
model = als.fit(training_dataset)

model.save('model')

Every time I call the save method, the Jupyter notebook gives me an error like this:

An error occurred while calling o477.save.
: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:106)

I'm aware of previous SO questions and answers on this, and I have tried the following:

model.save('model')

model.write().save("saved_model")

als.write().save("saved_model")

als.save('model')

import pickle
s = pickle.dumps(als)

als_path = "from_C:Folder_to_my_project_root" + "/als"
als.save(als_path)

My question is: how do I save the ALS model so that I can load it later, without retraining it every time I run the program?
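
For completeness, this is the save/load pairing I am aiming for so the model doesn't have to be retrained (a minimal sketch; "saved_model" is just a placeholder path):

from pyspark.ml.recommendation import ALSModel

# Save the fitted model with Spark's built-in writer; overwrite() avoids
# "path already exists" errors on reruns.
model.write().overwrite().save("saved_model")

# In a later run, load it back without retraining.
loaded_model = ALSModel.load("saved_model")
recommendations = loaded_model.recommendForAllUsers(10)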

Michael Halim
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Jun 07 '22 at 11:46

2 Answers


I used to run into this problem when running recommendations on the Netflix Prize dataset, about 100 million records in total. This is what I did: start by running on 50% of the data, then slowly increase the percentage and see where it breaks. In my case I was able to work my way back up to 100% of the data. Closing unnecessary Chrome tabs also helps.
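
A rough sketch of that incremental approach, reusing the als estimator and training_dataset from the question (the fractions and output paths are arbitrary):

# Fit on growing fractions of the data to find the point where it breaks.
for fraction in [0.5, 0.75, 1.0]:
    subset = training_dataset.sample(fraction=fraction, seed=42)
    model = als.fit(subset)
    model.write().overwrite().save(f"als_model_{int(fraction * 100)}pct")
    print(f"fitted and saved on {int(fraction * 100)}% of the data")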

angbear

Basically, o477 (and oXXX errors in general) means something went wrong while Spark was executing the job. Since it seems you're building a movie recommender, I assume you're using the MovieLens or Netflix dataset. It can mean one of these (see the sketch after the list):

  1. The file is too big and can't be pickled
  2. The model is too complex and your memory runs out
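
If it turns out to be memory, one thing worth trying (a sketch, not a guaranteed fix; the memory values are examples you should adjust to your machine) is giving the driver and executors more memory when the SparkSession is created, and saving with Spark's built-in writer instead of pickle, since the model's factor matrices live in distributed DataFrames:

from pyspark.sql import SparkSession

# Example memory settings; these must be set before the session/JVM starts.
spark = (SparkSession.builder
         .appName("als-recommender")
         .config("spark.driver.memory", "8g")
         .config("spark.executor.memory", "8g")
         .getOrCreate())

# Fit the model as in the question, then save with Spark's own writer.
model.write().overwrite().save("saved_model")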