Hello, can somebody explain step by step what's happening in the following code? Especially the classes part and the reshape? Thanks.

import h5py
import numpy as np

def load_data():
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels

    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

    classes = np.array(test_dataset["list_classes"][:]) # the list of classes

    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
hpaulj
Gerrit

1 Answer

Most of the lines just load datasets from the h5 file. The np.array(...) wrapper isn't needed. test_dataset[name][:] is sufficient to load an array.

test_set_y_orig = test_dataset["test_set_y"][:]

test_dataset is the opened file. test_dataset["test_set_y"] is a dataset on that file. The [:] loads the dataset into a numpy array. Look up the h5py docs for more details on loading a dataset.
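To see the file / dataset / array distinction without the original data, here is a minimal sketch. It writes a tiny stand-in file first; the path demo.h5 and the label values are made up for illustration and are not the contents of test_catvnoncat.h5.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a small stand-in file (hypothetical data, not the real dataset).
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("test_set_y", data=np.array([1, 0, 1, 1]))

with h5py.File(path, "r") as test_dataset:
    dset = test_dataset["test_set_y"]   # an h5py Dataset, nothing read yet
    test_set_y_orig = dset[:]           # slicing reads the values into a numpy array
    wrapped = np.array(dset[:])         # same values; the np.array call only copies

print(type(test_set_y_orig).__name__, test_set_y_orig.shape)
print(np.array_equal(test_set_y_orig, wrapped))
```

Opening the file in a with block also closes it afterwards, which the load_data function in the question never does.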

I deduce from

train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))

that the array, as loaded, is 1d with shape (n,), and this reshape just adds a leading dimension, making it (1, n). I would have coded it as

train_set_y_orig = train_set_y_orig[None,:]

but the result is the same.
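A quick check of that equivalence, using a made-up label vector in place of the real one (the values are only placeholders):

```python
import numpy as np

y = np.array([1, 0, 1, 1, 0])        # stand-in labels, shape (n,)

a = y.reshape((1, y.shape[0]))       # the form used in load_data
b = y[None, :]                       # indexing with None (same as np.newaxis)
c = y.reshape(1, -1)                 # -1 lets numpy work out n

print(y.shape, a.shape, b.shape, c.shape)   # (5,) (1, 5) (1, 5) (1, 5)
print(np.array_equal(a, b) and np.array_equal(a, c))
```

All three give a single row holding the same values. Presumably the caller wants labels as a (1, n) row vector, but the function itself doesn't say why.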

There's nothing special about the classes array (though it might well be an array of strings).
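If classes does turn out to hold strings, a typical use is to map a 0/1 label back to a name. The sketch below assumes byte strings b"non-cat" and b"cat"; that is a guess at the contents, so print classes from your own file to confirm.

```python
import numpy as np

# Assumed contents; h5py often returns fixed-length strings as bytes.
classes = np.array([b"non-cat", b"cat"])
train_set_y = np.array([[1, 0, 1]])          # labels after the reshape, shape (1, n)

label = int(train_set_y[0, 2])               # label of the third example
name = classes[label].decode("utf-8")        # bytes -> str

print(label, name)
```

So the array would just be a lookup table from label number to label name.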

hpaulj