I'm using Surprise to perform cross-validation:

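from surprise import SVD, KNNBasic, KNNWithMeans, NormalPredictor, Dataset

def multiple_cv(data, folds=5):
    algorithms = (SVD, KNNBasic, KNNWithMeans, NormalPredictor)
    measures = ['RMSE', 'MAE']

    for a in algorithms:
        data.split(folds)
        algo = a()
        algo.fit(data)  # this is the call that raises the error below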

I call the function this way:

data = Dataset.load_builtin('ml-100k')
multiple_cv(data)

and I get this error:

Traceback (most recent call last):
  File "/home/user/PycharmProjects/pac1/prueba.py", line 30, in <module>
    multiple_cv(data)
  File "/home/user/PycharmProjects/pac1/prueba.py", line 19, in multiple_cv
    algo.fit(data)
  File "surprise/prediction_algorithms/matrix_factorization.pyx", line 155, in surprise.prediction_algorithms.matrix_factorization.SVD.fit
  File "surprise/prediction_algorithms/matrix_factorization.pyx", line 204, in surprise.prediction_algorithms.matrix_factorization.SVD.sgd
AttributeError: 'DatasetAutoFolds' object has no attribute 'global_mean'

Am I missing something?

ace_racer
AFS

1 Answer

As per the docs, the input to the fit method must be a Trainset, which is different from the Dataset that you are passing. That is why the error complains about global_mean: it is an attribute of a Trainset, not of a Dataset. You can split a Dataset into a Trainset (and a Testset) using the output of the split method, as mentioned here.

In your example,

data = Dataset.load_builtin('ml-100k')
trainset = data.build_full_trainset()

Then you can use:

algo.fit(trainset)

The Trainset and the Testset obtained this way can be used as the inputs to the fit and test methods, respectively.

Tai
ace_racer