I am writing an autoencoder for dimensionality reduction. I want to reduce 5,511,677 variables to 800,000 encoded variables. My sequential Keras model is:
import tensorflow as tf

autoencoder = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(5511677,)),
    tf.keras.layers.Dense(800000, activation='relu'),  # encoder
    tf.keras.layers.Dense(5511677)                      # decoder / reconstruction
])
When I run this code (before I ever interact with the data), Python throws an error:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[5511677,800000] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator mklcpu [Op:RandomUniform]
I'm running on a server with 128 GB of RAM, but it seems like Python runs out of memory just defining the model, since the weight matrices themselves are enormous.
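Here is my own rough back-of-the-envelope estimate of the first Dense layer's weight matrix alone, assuming float32 weights:

input_dim = 5_511_677        # number of input variables
encoding_dim = 800_000       # size of the encoded representation
weights = input_dim * encoding_dim    # ~4.4e12 parameters in a single layer
bytes_needed = weights * 4            # float32 = 4 bytes per weight
print(bytes_needed / 1024**4)         # ~16 TiB, versus 128 GB of RAM

If that arithmetic is right, a single weight matrix would need on the order of 16 TiB, which obviously cannot fit in 128 GB.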
Am I correct about this? If I am, does anyone know a workaround for this kind of situation, where the model itself is too large to fit in memory?
Note: This is not a duplicate of other questions I found on StackOverflow. In those questions, the full dataset was too large to fit in memory, and the recommendation was to use a generator (which I am already doing). This error is thrown before the program ever interacts with the data.