I am new to sklearn.
My objective is to compute cross-validated scores for a dataset using cross_val_score with a BayesianRidge estimator. It should be implemented as unsupervised learning. The code below is taken from a scikit-learn example, except that the target variable, y, is excluded.
The data comes from sklearn.datasets.fetch_california_housing.

br_estimator = BayesianRidge()
score_full_data = pd.DataFrame(cross_val_score(br_estimator, X=X, y=None, scoring='neg_mean_squared_error', cv=5), columns=['Data'])

I got a TypeError: fit() missing 1 required positional argument: 'y'.
The expected result is:

Data
0   -0.408433
1   -0.636009
2   -0.614910
3   -1.089616
4   -0.407541

What is the correct way to do this?

k.ko3n

1 Answer

It's not working because BayesianRidge is a supervised learning estimator (a regressor, not a classifier), and you are trying to use it as an unsupervised one. You can't expect the underlying implementation of BayesianRidge to change just because you don't supply the target variable, y. If you check the documentation for BayesianRidge.fit, you will see that y is not an optional argument.
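
If you want to confirm this without opening the documentation, here is a minimal sketch that inspects the signature of `fit` and lists the parameters that have no default value. The variable names `fit_params` and `required` are only for illustration.

```python
import inspect

from sklearn.linear_model import BayesianRidge

# Look at the signature of BayesianRidge.fit to see which arguments it takes.
fit_params = inspect.signature(BayesianRidge().fit).parameters

# A parameter without a default value must be supplied by the caller.
required = [
    name
    for name, param in fit_params.items()
    if param.default is inspect.Parameter.empty
]
print(required)
```

Since `y` shows up among the required parameters, calling `fit` without it raises the TypeError from the question.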

Second, this is not an unsupervised learning problem in the first place. The California housing dataset you mention is a regression dataset: it comes with a target (the median house value), so it doesn't make sense to use unsupervised learning here.
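
For comparison, here is a rough sketch of the supervised version, where `y` is passed to `cross_val_score`. The helper name `score_bayesian_ridge` is hypothetical, and the data is a synthetic stand-in from `make_regression` (my assumption, so the snippet runs without downloading anything), so the scores will not match the numbers in the question.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import cross_val_score


def score_bayesian_ridge(X, y, cv=5):
    """Return one negated mean squared error per fold for BayesianRidge."""
    return cross_val_score(
        BayesianRidge(), X, y, scoring="neg_mean_squared_error", cv=cv
    )


# Synthetic stand-in data (an assumption, not the housing dataset).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

scores = score_bayesian_ridge(X, y)
print(scores)
```

With the dataset from the question you would get both arrays from `fetch_california_housing(return_X_y=True)` and, if you want the same layout as your expected result, wrap the returned array in `pd.DataFrame(scores, columns=['Data'])`.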

Gambit1614
  • Thanks, @Mohammed Kashif. The original problem was that I don't have the target vector, `y`. I only have the raw data, `X`. Would it be correct to build `y` from `X` using the `train_test_split()` function? – k.ko3n Jul 21 '19 at 08:14
  • @k.ko3n No, that would still be incorrect. You cannot build `y` from `X`. Are you still using the housing dataset you mentioned in the question above? – Gambit1614 Jul 21 '19 at 10:58
  • I used the example and data in the question for learning only. For my actual task, I am using my own data, but only `X` is available. I don't know how or where to get the `y`. – k.ko3n Jul 21 '19 at 17:52
  • @k.ko3n Can you give more details about the dataset? Not the values, only information about the columns and the problem you are trying to solve. From what I understand, you need an unsupervised model in this case, but I need more clarification in order to help you further. It would be better if you could post these details as a separate question, and I can help you out there. – Gambit1614 Jul 21 '19 at 20:10
  • I have rephrased the problem in a new post. https://stackoverflow.com/questions/57154209/implementation-of-sklearn-impute-iterativeimputer. Thanks for your attention. – k.ko3n Jul 22 '19 at 21:53