
I'm trying to impute a 1D array of shape (14599,) using SimpleImputer with the most_frequent strategy, but it says it expected a 2D array. I already tried reshaping it with (-1,1) and (1,-1), but then I get ValueError: could not broadcast input array from shape (14599,1) into shape (14599). How can I impute this column, since reshaping doesn't solve the problem? I don't understand why it throws this error. I already asked on Data Science Stack Exchange, and someone suggested the cause might be a pandas Series, but I convert x and y to NumPy arrays before passing them to train_test_split, so I'm not sure that's it.

##libraries
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

##codes
plt.close('all')
avo_sales = pd.read_csv('avocados.csv')
avo_sales.rename(columns = {'4046':'small PLU sold',
                            '4225':'large PLU sold',
                            '4770':'xlarge PLU sold'},
                 inplace= True)

avo_sales.columns = avo_sales.columns.str.replace(' ','')

plt.scatter(avo_sales.Date,avo_sales.TotalBags)

x = np.array(avo_sales.drop(['TotalBags','Unnamed:0','year','region','Date'], axis=1))
y = np.array(avo_sales.TotalBags)

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

impC = SimpleImputer(strategy='most_frequent')
X_train[:,8] = impC.fit_transform(X_train[:,8].reshape(-1,1))  # <-- error here

imp = SimpleImputer(strategy='median')
X_train[:,1:8] = imp.fit_transform(X_train[:,1:8])

le = LabelEncoder()
X_train[:,8] = le.fit_transform(X_train[:,8])
Dr. H. Lecter
random student

1 Answer


Change the line:

X_train[:,8] = impC.fit_transform(X_train[:,8].reshape(-1,1))

to

X_train[:,8] = impC.fit_transform(X_train[:,8].reshape(-1,1)).ravel()

and your error will disappear.
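To see the change in isolation, here is a minimal sketch on a made-up 4-row object array (the values and the variable names X, imp and filled are illustrative, not taken from the avocado data). It assumes scikit-learn is installed and that the categorical column holds strings with np.nan marking missing entries.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up data: one numeric column and one categorical column
# with a single missing value in row 2.
X = np.array([[1.0, "organic"],
              [2.0, "conventional"],
              [3.0, np.nan],
              [4.0, "organic"]], dtype=object)

imp = SimpleImputer(strategy="most_frequent")

# SimpleImputer wants 2D input, so the column is reshaped to (4, 1).
filled = imp.fit_transform(X[:, 1].reshape(-1, 1))
print(filled.shape)  # (4, 1)

# The target slice X[:, 1] is 1D, so flatten before assigning back.
X[:, 1] = filled.ravel()
print(X[:, 1])
```

The reshape is only needed on the way into the imputer; ravel() undoes it on the way out.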

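The imputer itself is not the problem; assigning the imputed values back is. fit_transform returns a 2D array of shape (14599,1), while the target slice X_train[:,8] is 1D with shape (14599,), so NumPy cannot broadcast one into the other. Calling .ravel() flattens the result to 1D so the shapes match.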
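The shape mismatch can be reproduced with NumPy alone. The sketch below uses a small made-up array, with col_2d standing in for the 2D output of fit_transform (both names are hypothetical). It also shows an alternative to ravel(): indexing the column with a list, so the target slice is 2D as well.

```python
import numpy as np

# Toy stand-in for X_train: 5 rows, 3 columns (values are made up).
X = np.arange(15, dtype=float).reshape(5, 3)

# A 2D column of shape (5, 1), like what fit_transform returns
# when it is given a single feature.
col_2d = np.full((5, 1), 7.0)

try:
    # Target slice has shape (5,), value has shape (5, 1): this fails.
    X[:, 2] = col_2d
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# Option 1: flatten the column to 1D before assigning.
X[:, 2] = col_2d.ravel()

# Option 2: index with a list so the target slice is itself (5, 1).
X[:, [1]] = col_2d
```

With the second form, X_train[:, [8]] could in principle be passed to the imputer and assigned to directly, without any reshape or ravel.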

Sergey Bushmanov