
I am trying to make predictions on new data, using a trained and saved model. My new data does not have the same shape as the data used to build the saved model.

I have tried using model.save() as well as model.save_weights(), as I still want to keep the training configurations, but they both produce the same error.

Is there a way to use one's saved model on new data even if the shape is not the same?

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Rebuild the architecture, then load the saved weights into it
model = Sequential([
    Dense(units=11, activation='relu', input_shape=(42,), kernel_regularizer=keras.regularizers.l2(0.001)),
    Dense(units=1, activation='sigmoid')
])

model.load_weights('Fin_weights.h5')

y_pred = model.predict(X)
ValueError: Error when checking input: expected dense_6_input to have shape (44,) but got array with shape (42,)
  • BTW, I've just learned today that you can also provide class weights to Keras training, like this: `model.fit(X_train, Y_train, class_weight = {0: 0.3, 1: 0.7})`. This is for the case when there are two labels, label `0` is `30%` of the dataset and `1` is `70%`. Then you don't need to duplicate your dataset entries to achieve a balanced set of classes, as I said before. – Arty Oct 04 '20 at 01:54
  • Thank you for sharing, @Arty! I will definitely keep that in mind. Interesting that you mention weights; I was actually curious whether there is a way to view only the inputs with the highest weights in the model. I am interested to see which inputs contribute the most to the output of the model. Is there a way? – Fikile Oct 06 '20 at 20:45
  • Just created code for doing this and posted it in [this old chat](https://chat.stackoverflow.com/transcript/message/50629594#50629594); a description of the algorithm is there too. – Arty Oct 07 '20 at 04:30
  • Inspired by your interesting new task, I implemented a very fast generic function that solves the task of sorting any array in order of the frequencies of its elements; [here is my solution](https://stackoverflow.com/a/64239350/941531). If speed is needed for your new task above (probably not), then you can use this new generic function too. – Arty Oct 07 '20 at 08:32
  • Oh wow, never thought to approach it this way. Thank you. Question (referring to your first solution): in place of 'a', would I use X_train as my array, or the saved weights of the model? I actually tried using X_train and it was a mess. Then again, I don't think it makes sense to use X_train since it's just my input without weights? – Fikile Oct 07 '20 at 16:25
  • Answered to you [in the chat](https://chat.stackoverflow.com/rooms/222202/discussion-on-answer-by-arty-how-to-predict-on-new-data-using-a-trained-and-save). – Arty Oct 07 '20 at 16:33

1 Answer


No, you have to match the input shape exactly.

Both your model's code (the model = Sequential([...]) lines) should correspond exactly to your saved model, and your input data (X in the y_pred = model.predict(X) line) should have the same shape as in the saved model ('Fin_weights.h5').

The only thing you can do is to somehow pad your new data, e.g. with zeros. But this can help only if the remaining values correspond to the same features or signals.

Let's, for example, imagine that you trained an NN to recognize grayscale images of shape (2, 3), like below:

1 2 3
4 5 6

Then you trained the model and saved it for later use. Afterwards you decided that you want to use your NN on images of smaller or bigger size, like this:

1 2
3 4

or this

1  2  3  4
5  6  7  8
9 10 11 12

And you're almost sure that your NN will still give good results on differently shaped input.

Then you just pad the first non-matching image with extra zeros on the right, like this:

1 2 0
3 4 0

or another way of padding, on the left side:

0 1 2
0 3 4

and the second image you crop a bit:

1  2  3
5  6  7

(or crop it from the other sides).

Only then can you apply your NN to these processed input images.
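For illustration, here is a minimal NumPy sketch of the padding and cropping above (the array names are hypothetical):

import numpy as np

img_small = np.array([[1, 2],
                      [3, 4]])           # shape (2, 2)
img_big = np.array([[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12]])    # shape (3, 4)

padded = np.pad(img_small, ((0, 0), (0, 1)))  # zeros on the right -> (2, 3)
cropped = img_big[:2, :3]                     # keep top-left corner -> (2, 3)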

It's the same in your case: you have to add two zeros. But only if it is almost the same sequence of encoded input signals or features.

In case your data for prediction is of the wrong size, do this:

import numpy as np

y_pred = model.predict(
    np.pad(X, ((0, 0), (0, 2)))  # append two zero columns to each row
)

This pads your data with two zeros on the right side, although you might want to pad on the left side ((2, 0) instead of (0, 2)), or on both sides ((1, 1) instead of (0, 2)).
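As a quick check of these variants (assuming X has shape (10, 42)):

np.pad(X, ((0, 0), (0, 2))).shape  # (10, 44), two zeros on the right
np.pad(X, ((0, 0), (2, 0))).shape  # (10, 44), two zeros on the left
np.pad(X, ((0, 0), (1, 1))).shape  # (10, 44), one zero on each side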

In case your saved weights have a different shape than your model's code expects, change the model code to match (change 42 --> 44):

model = Sequential([
    Dense(units=11, activation='relu', input_shape=(44,), kernel_regularizer=keras.regularizers.l2(0.001)),
    Dense(units=1, activation='sigmoid')
])

You should probably do both things above, to match your saved model/weights.

If an NN trained on inputs of 44 numbers gives totally wrong results for any padding of your 42-number data, then the only way is to re-train your NN for 42 inputs and save the model again.

But you have to take into account that input_shape = (44,) in the Keras library actually means that the final data X fed into model.predict(X) should have a 2-dimensional shape like (10, 44) (where 10 is the number of different objects to be recognized by your NN). Keras hides the 0-th dimension, the so-called batch dimension. The batch (0-th) dimension can vary: you may feed 5 objects (an array of shape (5, 44)) or 7 (shape (7, 44)) or any other number of objects. Batch only means that Keras processes several objects in one call, in parallel, just to be fast/efficient. But each single object is a 1-dimensional sub-array of shape (44,).

Probably you misunderstood how data is fed to the network and represented. 44 is not the size of the dataset (the number of objects); it is the number of traits of a single object. E.g. if the network recognizes/categorizes one human, then 44 can mean 44 characteristics of just one human: age, gender, height, weight, month of birth, race, skin color, calories per day, monthly income, monthly spending, salary, etc., totalling 44 different fixed characteristics of one human object. They probably don't change. But if you get some other data with just 42 or 36 characteristics, then you need to place 0 only exactly in the positions of the characteristics that are missing out of the 44. It won't be correct to pad with zeros on the right or left; you must place 0s exactly in those positions that are missing out of the 44.
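For clarity, a minimal sketch of the batch dimension (assuming model is the 44-input model discussed above):

import numpy as np

# A model declared with input_shape=(44,) accepts any batch size:
batch_of_5 = np.zeros((5, 44))  # 5 objects, 44 traits each
batch_of_7 = np.zeros((7, 44))  # 7 objects, 44 traits each

pred_5 = model.predict(batch_of_5)  # shape (5, 1): one prediction per object
pred_7 = model.predict(batch_of_7)  # shape (7, 1)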

But your 44, 42, and 36 probably mean the number of different input objects, each having just 1 characteristic. Imagine a task where you have a dataset (table) of 50 humans with just two columns of data, salary and country, and you want to build an NN that guesses country from salary. Then you'll have input_shape = (1,) (corresponding to a 1-D array of 1 number, the salary), but definitely not input_shape = (50,) (the number of humans in the table). input_shape gives the shape of just 1 object, 1 human. 50 is the number of objects (humans), and it is the batch (0-th) dimension in the NumPy array that is fed for prediction; hence your X array for model.predict(X) is of shape (50, 1), but input_shape = (1,) in the model. Basically, Keras omits (hides) the 0-th batch dimension.

If 44 in your case actually meant the dataset size (the number of objects), then you've trained your NN wrongly and it should be retrained with input_shape = (1,); 44 then goes in as the batch dimension, and this 44 may vary depending on the size of the training or testing dataset.
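A minimal sketch of that salary-to-country setup (the layer sizes and the 3-country output are hypothetical):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# One object = one human described by a single number (salary),
# so input_shape is (1,), not (50,).
country_model = Sequential([
    Dense(units=8, activation='relu', input_shape=(1,)),
    Dense(units=3, activation='softmax'),  # e.g. 3 possible countries
])

# X for prediction has shape (50, 1): 50 humans (batch dim), 1 feature each.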

If you're going to re-train your network, then the whole training/evaluation process, in simple form, is as follows (a minimal code sketch follows the list):

  1. Suppose you have a dataset in a CSV file data.csv. For example, you have 126 rows and 17 columns in total.

  2. Read in your data somehow, e.g. by np.loadtxt, by pd.read_csv, or by standard Python's csv.reader(). Convert the data to numbers (floats).

  3. Split your data by rows randomly into two parts, training/evaluation, of approximately 90%/10% of rows, e.g. 110 rows for training and 16 for evaluation (out of 126 in total).

  4. Decide which columns in your data will be predicted. You can predict any number of columns; let's say we want to predict two columns, the 16th and 17th. Now your columns of data are split into two parts: X (15 columns, numbered 1-15) and Y (2 columns, numbered 16-17).

  5. In the code of your network layers, set input_shape = (15,) (15 is the number of columns in X) in the first layer, and Dense(2) in the last layer (2 is the number of columns in Y).

  6. Train your network on the training dataset using the model.fit(X, Y, epochs = 1000, ...) method.

  7. Save the trained network to a model file through model.save(...), e.g. to a file like net.h5.

  8. Load your network through keras.models.load_model(...).

  9. Test network quality through predicted_Y = model.predict(testing_X) and compare it to testing_Y. If the network model was chosen correctly, then testing_Y should be close to predicted_Y, e.g. 80% correct (this ratio is called accuracy).

  10. Why do we split the dataset into training/testing parts? Because the training stage only sees the training sub-part of the dataset. The task of network training is to remember the whole training data well, plus to generalize prediction by finding some hidden dependencies between X and Y. So calling model.predict(...) on training data should give close to 100% accuracy, because the network sees all this training data and remembers it. But it doesn't see the testing data at all, hence it needs to be clever and really predict testing Y from X; hence accuracy on testing is lower, e.g. 80%.

  11. If the quality of the testing results is not great, you have to improve your network architecture and re-run the whole training process from the start.

  12. If you need to predict on partial data, e.g. when you have only 12 out of the total 15 possible columns in your X data, then fill in the missing column values with zeros; e.g. if you're missing columns 7 and 11, insert zeros into the 7th and 11th positions, so that the total number of columns is 15 again. Your network will accept as input for model.predict() only exactly the number of columns it was trained with, i.e. 15; this number is given in input_shape = (15,).
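Putting the steps above together, a minimal end-to-end sketch (the file name data.csv, the 15/2 column split, and the layer sizes are the hypothetical values from the list):

import numpy as np
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense

# 1-2. Read the CSV (126 rows, 17 columns) as floats.
data = np.loadtxt('data.csv', delimiter=',')

# 3. Shuffle rows and split ~90%/10% into training/testing parts.
np.random.shuffle(data)
train, test = data[:110], data[110:]

# 4. First 15 columns are X, last 2 columns are Y.
X_train, Y_train = train[:, :15], train[:, 15:]
X_test, Y_test = test[:, :15], test[:, 15:]

# 5. input_shape matches X's columns; the last layer matches Y's columns.
model = Sequential([
    Dense(units=11, activation='relu', input_shape=(15,)),
    Dense(units=2),
])
model.compile(optimizer='adam', loss='mse')

# 6-7. Train and save.
model.fit(X_train, Y_train, epochs=1000)
model.save('net.h5')

# 8-9. Load and test on unseen data.
model = load_model('net.h5')
predicted_Y = model.predict(X_test)  # compare against Y_test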

  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/222202/discussion-on-answer-by-arty-how-to-predict-on-new-data-using-a-trained-and-save). – Machavity Sep 28 '20 at 21:08