I'm following the autoencoder example at https://blog.keras.io/building-autoencoders-in-keras.html, but with my own data. I'm seeing very low GPU utilization and almost no GPU memory usage.
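For what it's worth, this is roughly how I check utilization while fit() is running in another terminal (a sketch; it just shells out to nvidia-smi, which I assume is on the PATH):

import subprocess

# Query the GPU for compute utilization and memory usage while training runs
# in a separate process/terminal.
print(subprocess.check_output(
    ["nvidia-smi",
     "--query-gpu=utilization.gpu,memory.used,memory.total",
     "--format=csv"]).decode())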
I'm wondering if it's just having trouble fitting batches onto the GPU. My input data is 5,000-dimensional, and I'm encoding it to a hidden representation of 250 dimensions. If I drop the batch size all the way down to one, GPU usage goes up, but training is obviously very slow (lots of shuffling of data back and forth). When I go higher, I get almost no GPU usage and it's still slow, in fact slower than running on the CPU: the fastest GPU run I've seen took about 3,500 seconds versus about 1,800 seconds on the CPU (timing code is shown after the model below). My GPU is a GTX 970, and it otherwise appears to be working fine.
from keras.layers import Input, Dense
from keras.models import Model

# input placeholder (my data is samples x 5000) and size of the encoded representation
input_dimensions = Input(shape=(5000,))
encoded_dimensions = 250

# autoencoder model: 5000 -> 250 -> 5000
encoded = Dense(encoded_dimensions, activation='relu')(input_dimensions)
decoded = Dense(5000, activation='sigmoid')(encoded)
autoencoder = Model(input=input_dimensions, output=decoded)

# standalone encoder model
encoder = Model(input=input_dimensions, output=encoded)

# standalone decoder model, reusing the last layer of the autoencoder
encoded_input = Input(shape=(encoded_dimensions,))
decoder_layer = autoencoder.layers[-1]
decoder = Model(input=encoded_input, output=decoder_layer(encoded_input))

autoencoder.compile(optimizer='adadelta', loss='mae')
autoencoder.fit(data, data, nb_epoch=10, batch_size=512, shuffle=True, validation_split=0.1)
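The 3,500 vs 1,800 second figures above are wall-clock times for the whole fit() call, rerun with the backend switched between GPU and CPU. Roughly how I time a run (a sketch, not my exact script):

import time

start = time.time()
autoencoder.fit(data, data, nb_epoch=10, batch_size=512,
                shuffle=True, validation_split=0.1)
print("training took %.1f seconds" % (time.time() - start))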
Is there a problem with my code that's causing it to run slowly, or perhaps some strange configuration issue (my .theanorc, for what it's worth, is configured for the GPU, and Theano reports that it's using the GPU), or is it a function of my data?
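For completeness, this is roughly how I confirm from Python that Theano is picking up the GPU (just reading theano.config; my exact .theanorc contents aren't reproduced here):

import theano

# When the device flag is set to the GPU, Theano prints a line like
# "Using gpu device 0: GeForce GTX 970" at import time.
print(theano.config.device)   # expect 'gpu' (or 'gpu0')
print(theano.config.floatX)   # expect 'float32'; float64 generally isn't GPU-accelerated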