I have built a model that predicts crop yield from several features. The data initially had 8 columns; after one-hot encoding 4 of them, it has 813 columns in total.
I saved the model and the encoder, and I use the following block of code to predict on new data:
import pickle
import numpy as np
ohe = {}
pickle_file = open("ohe_yield_prediction.pkl","rb")
ohe = pickle.load(pickle_file)
pickle_file = open("yield_prediction.pkl","rb")
model = pickle.load(pickle_file)
list1 = ['Andaman and Nicobar Islands', 'NICOBARS', 2000, 'Kharif ', 'Arecanut', 1254.0, 2000.0, 1208.4]
a = [0,1,3,4]
b = [2, 5, 6, 7]
list2 = [list1[i] for i in a]
list3 = np.array(list1[j] for j in b)
input = np.array(list2).reshape(-1,4)
encoded_input = ohe.transform(input)
final_array = input + encoded_input
This raises the following error:
ValueError Traceback (most recent call last)
<ipython-input-4-497dd0be9763> in <module>()
13 input = np.array(list2).reshape(-1,4)
14 encoded_input = ohe.transform(input)
---> 15 final_array = input + encoded_input
3 frames
<__array_function__ internals> in broadcast_to(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/numpy/lib/stride_tricks.py in _broadcast_to(array, shape, subok, readonly)
123 it = np.nditer(
124 (array,), flags=['multi_index', 'refs_ok', 'zerosize_ok'] + extras,
--> 125 op_flags=['readonly'], itershape=shape, order='C')
126 with it:
127 # never really has writebackifcopy semantics
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (1,4) and requested shape (1,809)
How can I encode only those specific string columns and build the final input array to feed into the model?
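
For reference, here is a minimal sketch of one possible approach, not a confirmed fix. On NumPy arrays, `+` performs element-wise addition rather than concatenation, which is why the (1,4) and (1,809) shapes fail to broadcast. The sketch instead encodes the four string values and joins the result with the four numeric values using `np.hstack`. The helper name `build_model_input` and the small `toy_ohe` encoder are hypothetical stand-ins for the pickled objects. The column order (encoded columns first, numeric columns last) is also an assumption and must match whatever order the model was trained on.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder


def build_model_input(row, cat_idx, num_idx, encoder):
    """Encode the categorical entries of `row` and append the numeric ones.

    Hypothetical helper: assumes the model expects the one-hot columns
    first and the numeric columns last.
    """
    # Build each part from a list, not a generator, so NumPy gets real values.
    cat_part = np.array([row[i] for i in cat_idx], dtype=object).reshape(1, -1)
    num_part = np.array([row[j] for j in num_idx], dtype=float).reshape(1, -1)

    encoded = encoder.transform(cat_part)
    # OneHotEncoder returns a SciPy sparse matrix by default; densify it.
    if hasattr(encoded, "toarray"):
        encoded = encoded.toarray()

    # Concatenate column-wise instead of adding element-wise.
    return np.hstack([encoded, num_part])


# Toy encoder fitted on two made-up rows; it stands in for the pickled `ohe`.
train_cats = np.array(
    [
        ["Andaman and Nicobar Islands", "NICOBARS", "Kharif ", "Arecanut"],
        ["Kerala", "KOLLAM", "Rabi", "Rice"],
    ],
    dtype=object,
)
toy_ohe = OneHotEncoder(handle_unknown="ignore").fit(train_cats)

list1 = ['Andaman and Nicobar Islands', 'NICOBARS', 2000, 'Kharif ',
         'Arecanut', 1254.0, 2000.0, 1208.4]
final_array = build_model_input(list1, [0, 1, 3, 4], [2, 5, 6, 7], toy_ohe)
print(final_array.shape)  # (1, 12): 8 one-hot columns + 4 numeric columns
```

With the real encoder, the same call would be expected to give a (1, 813) array (809 encoded columns plus 4 numeric ones), which could then be passed to `model.predict(final_array)`, provided the column order matches the training data.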