I am stacking 6 layers of 2D satellite imagery (x data) and attempting to run a CNN over them to classify the landcover (using 8 land cover classes taken from a reformatted USDA Crop Data Layer - y data).

The x data is originally shaped (2004, 2753, 6) and the y data (2004, 2753, 8). I have used data_x.reshape(-1,2004,2753,6) (and the same for y) to add the extra batch dimension that the model expects.
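
For reference, a batch axis can also be added without spelling out the other dimensions, which avoids typos in the hard-coded sizes. This is a minimal sketch using a small stand-in array (the sizes here are illustrative, not the real raster):

```python
import numpy as np

# Small stand-in array; the real x data would be (2004, 2753, 6).
data_x = np.zeros((20, 27, 6), dtype=np.float32)

# Add a leading batch axis without restating height, width or bands.
x_batched = data_x[np.newaxis, ...]
print(x_batched.shape)  # (1, 20, 27, 6)
```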

The y dataset has 8 bands, one per land-cover category, each in binary format (e.g. the 1st band is corn, with 1 for corn and 0 for not corn).
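
To make the band layout concrete, here is a small sketch of how a raster of integer class ids maps to 8 binary bands and back. The helper names `to_one_hot` and `from_one_hot` are hypothetical, and the tiny `labels` array is made-up illustration data, not taken from the Crop Data Layer:

```python
import numpy as np

NUM_CLASSES = 8  # eight land-cover categories, as described above

def to_one_hot(labels, num_classes=NUM_CLASSES):
    """Turn an (H, W) raster of integer class ids into (H, W, num_classes) binary bands."""
    return np.eye(num_classes, dtype=np.uint8)[labels]

def from_one_hot(bands):
    """Collapse (H, W, num_classes) binary bands back to an (H, W) raster of class ids."""
    return np.argmax(bands, axis=-1)

# Made-up 2x3 raster of class ids in the range 0..7
labels = np.array([[0, 1, 7], [3, 3, 2]])
bands = to_one_hot(labels)
print(bands.shape)  # (2, 3, 8)
```

Each pixel has exactly one band set to 1, which is the format `categorical_crossentropy` expects for the target.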

However, when I try to run the model, the expected shape does not match what is being passed to it. I am unsure if I am using the correct model structure or data structure - one idea would be to take the 8 bands of the y dataset

Based on some serious googling I have been learning how to get the data into the correct format with the right number of dimensions etc., but I feel I am falling at the last hurdle with regard to dimensions (and most likely the correct preparation of the x & y datasets).

Below is the CNN model

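from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Dense
from keras.optimizers import SGD

input_shape = (2004, 2753, 6)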

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),strides=(1, 1),activation='relu',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2, 2), padding="same"))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2), padding="same"))
model.add(Dropout(0.25))
#model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(8, activation='softmax'))
#model.add(Flatten())
model.summary()

Model Summary - expecting 500, 687, 8 out at the end

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_54 (Conv2D)           (None, 2002, 2751, 32)    1760      
_________________________________________________________________
max_pooling2d_52 (MaxPooling (None, 1001, 1376, 32)    0         
_________________________________________________________________
conv2d_55 (Conv2D)           (None, 999, 1374, 32)     9248      
_________________________________________________________________
max_pooling2d_53 (MaxPooling (None, 500, 687, 32)      0         
_________________________________________________________________
dropout_57 (Dropout)         (None, 500, 687, 32)      0         
_________________________________________________________________
dense_59 (Dense)             (None, 500, 687, 128)     4224      
_________________________________________________________________
dropout_58 (Dropout)         (None, 500, 687, 128)     0         
_________________________________________________________________
dense_60 (Dense)             (None, 500, 687, 8)       1032      
=================================================================
Total params: 16,264
Trainable params: 16,264
Non-trainable params: 0
_________________________________________________________________

compile

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

# Pass the configured optimizer object; the string 'sgd' would use default settings instead
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

fit - and where I get the error message

history = model.fit(x_train3d, y_train3d,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_split=0.2, validation_data=None)

shape of x_train3D = (1, 2004, 2753, 6) shape of y_train3D = (1, 2004, 2753, 8)

error message

ValueError: Error when checking target: expected dense_58 to have shape (500, 687, 8) but got array with shape (2004, 2753, 8)
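
The (500, 687) in the error can be reproduced by hand from the layer settings in the model above: each 3x3 convolution with the default "valid" padding removes 2 pixels, and each 2x2 pooling with "same" padding halves the size and rounds up. A minimal sketch of that arithmetic (the helper names are hypothetical, not Keras functions):

```python
import math

def conv_valid(size, kernel=3, stride=1):
    """Output length of a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1

def pool_same(size, stride=2):
    """Output length of a pooling layer with 'same' padding."""
    return math.ceil(size / stride)

def trace(size):
    """Follow one spatial dimension through conv -> pool -> conv -> pool."""
    size = conv_valid(size)
    size = pool_same(size)
    size = conv_valid(size)
    size = pool_same(size)
    return size

print(trace(2004), trace(2753))  # 500 687
```

So the network output is roughly a quarter of the input size in each direction, while the target array is still full size, which is what the ValueError reports.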

Again, I suspect this is down to needing to get the data in the right format for both the input and the output, but there is also likely something wrong in the specification of the model. I would appreciate some guidance as I'm new to Keras.

  • What does `y` represent? – gorjan Aug 25 '19 at 20:14
  • Y is the land cover data, it is a raster file which covers 8 categories of land cover (e.g. crops, urban, water etc) taken from the Crop Data Layer. This is the data I want to train the satellite image on. https://developers.google.com/earth-engine/datasets/catalog/USDA_NASS_CDL – Andrew Holden Aug 25 '19 at 20:46

2 Answers

Can you please explain what you are trying to classify and what your expected y_train3D is (is it an image, or some value for classification, e.g. 1/2/3 or x/y/z, etc.)?

  • I am trying to classify land cover using satellite data (6 band image from sentinel 2) - basically trying to differentiate between corn and soybeans (and 6 other categories of land cover, so 8 in total). y_train3D is a raster file of land cover for the USA - recategorised into 8 categories. so it is a raster image with 8 possible categories (hopefully that helps clarify) https://developers.google.com/earth-engine/datasets/catalog/USDA_NASS_CDL – Andrew Holden Aug 25 '19 at 20:50

Just for an update on this - I have managed to clear the error (and now onto a memory error but that's another question).

I solved the issue in 2 parts. Part 1 - added upsampling to the end of the model to bring the output back to the original size. New code below:

model = Sequential()
model.add(Conv2D(32, (3, 3), padding="same", activation="relu",input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2),strides=(2, 2)))

model.add(Conv2D(64, (3, 3), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), padding="same", activation="relu"))

#Upsampling
model.add(UpSampling2D(size=(2,2),interpolation='nearest'))
model.add(UpSampling2D(size=(2,2),interpolation='nearest'))

model.add(Dense(8, activation='relu'))
model.summary()

This gives me the summary below:


Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 2004, 2752, 32)    1760      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1002, 1376, 32)    0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 1002, 1376, 64)    18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 501, 688, 64)      0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 501, 688, 128)     73856     
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 1002, 1376, 128)   0         
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 2004, 2752, 128)   0         
_________________________________________________________________
dense_1 (Dense)              (None, 2004, 2752, 8)     1032      
=================================================================
Total params: 95,144
Trainable params: 95,144
Non-trainable params: 0
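
To illustrate what the two nearest-neighbour upsampling layers do to the shape, here is a NumPy-only sketch. It assumes that nearest-neighbour upsampling simply repeats each row and column, and `upsample_nearest` is a hypothetical helper, not a Keras function:

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Repeat rows and columns of a (batch, H, W, C) array `factor` times."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

# Tiny 2x2 single-channel "feature map" with values 0..3
small = np.arange(4, dtype=np.float32).reshape(1, 2, 2, 1)

# Two x2 upsamplings undo the size reduction of two 2x2 poolings
big = upsample_nearest(upsample_nearest(small))
print(big.shape)  # (1, 8, 8, 1)
```

The spatial size is restored, but each original value is just copied into a 4x4 block, so no detail lost in pooling is recovered.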

Part 2 - ensuring the x and y data arrays had dimensions divisible by 4; otherwise I was losing some of the data through rounding in the two pooling layers, so the upsampled output no longer matched the target size. The code below is specific to my data and not robust, but it worked:

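# Assumes x_train3d_adj / y_train3d_adj already hold copies of the training arrays.
# Drops the last column when the width is odd (2753 -> 2752 for my data).
if x_train3d.shape[2] % 2:
    x_train3d_adj = x_train3d_adj[:, :, :-1, :]
    y_train3d_adj = y_train3d_adj[:, :, :-1, :]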
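
A slightly more general version of the same trimming, as a sketch only: it crops both height and width down to the nearest multiple of 4 so that two 2x2 poolings followed by two x2 upsamplings return exactly the input size. The function name `crop_to_multiple` is hypothetical, and the zero-filled arrays just stand in for the real data:

```python
import numpy as np

def crop_to_multiple(arr, multiple=4):
    """Trim trailing rows/columns so H and W of a (batch, H, W, C) array divide evenly."""
    h, w = arr.shape[1], arr.shape[2]
    return arr[:, : h - h % multiple, : w - w % multiple, :]

# Stand-in arrays with the shapes from the question
x = np.zeros((1, 2004, 2753, 6), dtype=np.uint8)
y = np.zeros((1, 2004, 2753, 8), dtype=np.uint8)

x_adj, y_adj = crop_to_multiple(x), crop_to_multiple(y)
print(x_adj.shape, y_adj.shape)  # (1, 2004, 2752, 6) (1, 2004, 2752, 8)
```

Cropping discards up to 3 rows and 3 columns at the edge; padding up to the next multiple would be an alternative if those pixels matter.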

Not a complete solution yet, but it does get me closer to the end goal.