
I am writing an autoencoder for dimensionality reduction. I want to reduce 5,511,677 variables to 800,000 encoded variables. My sequential Keras model is:

import tensorflow as tf

autoencoder = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(5511677, )),
    tf.keras.layers.Dense(800000, activation='relu'),  # encoder: 5,511,677 -> 800,000
    tf.keras.layers.Dense(5511677)                      # decoder: 800,000 -> 5,511,677
])

When I run this code (before I ever interact with the data), Python throws an error:

tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[5511677,800000] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator mklcpu [Op:RandomUniform]

I'm running on a server with 128 GB of RAM, but it seems like Python runs out of memory just defining the model, since it is so large.

Am I correct about this? If I am, does anyone know a workaround for this kind of situation, where the model itself is too large to fit in memory?

Note: This is not a duplicate of other questions I found on Stack Overflow. In those questions, the full dataset was too large to fit in memory, and the recommendation was to use a generator (which I am already doing). This error is thrown before the program ever interacts with the data.
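
As the comments below work out, the first Dense layer's weight matrix alone accounts for the 16.04 TiB in the error message. A minimal back-of-the-envelope check (assuming the default float32 weights, 4 bytes each):

first_kernel = 5_511_677 * 800_000   # weights in the 5511677 -> 800000 Dense layer
bytes_needed = first_kernel * 4      # 4 bytes per float32 weight
print(bytes_needed / 2**40)          # ~16.04 TiB, matching the RandomUniform OOM
print(bytes_needed / 10**9)          # ~17,637 GB, far beyond 128 GB of RAM

The second Dense layer's kernel has the same number of elements (its shape is transposed), so the whole model needs roughly twice that before biases, gradients, or optimizer state are counted.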

  • `tf.keras.layers.Dense(5511677)` is pretty fat. Just loading the weights required for that layer is going to be massive, if I'm not wrong it should be at least the size of `800000 x 5511677`. – M Z Jul 29 '20 at 14:39
  • @M-Z That makes sense. However, when I scroll up in the error message, it says that the allocator (mklcpu) ran out of memory trying to allocate 16.04 TiB, and I would be surprised if this model took up that much memory. Also, do you know of any way to work around this problem and use the same autoencoder with less memory? – bfnext Jul 29 '20 at 14:44
  • 1
    Your model has roughly 2 x 800,000 x 5,511,677 ≈ 9e+12 parameters to store (2 Dense layers), I think that's where the problem lies : it's simply too big. Can you not reduce the input size of your autoencoder? – RandomGuy Jul 29 '20 at 14:45
  • make your second dense layer a lot smaller than 800000, and I actually wouldn't be surprised to hear 16.04 TiB – M Z Jul 29 '20 at 14:45
  • that's because it would be 17637.3664 GB of floats, which is more than your server's 128 GB – vlovero Jul 29 '20 at 14:46
  • @RandomGuy we are trying to reduce the dimensionality of every mutation that 2000 people have on chromosome 1. In total, the 2000 people have 5,511,677 mutations, and by reducing the dimensionality of this data, we can feed it into other algorithms efficiently. – bfnext Jul 29 '20 at 14:47
  • @vlovero how did you calculate that number? – bfnext Jul 29 '20 at 14:47
  • @M Z the program works when the middle layer is of size 600, but this will reduce the dimensionality too much (the loss will be too high) – bfnext Jul 29 '20 at 14:48
  • @bfnext two dense layers fully connected is `800000 * 5511677` connections; multiply that by `4` bytes for every 32-bit float – vlovero Jul 29 '20 at 14:49
  • Then try a different size for your hidden layer: there's still room between a 600-unit Dense layer and an 800,000-unit one! – RandomGuy Jul 29 '20 at 14:50
  • @vlovero thanks, that makes sense – bfnext Jul 29 '20 at 14:51
  • @RandomGuy 700 doesn't work; we tested dimensionality reduction on the first 700 mutations of this dataset and found that we can only do about an 85% dimensionality reduction before loss gets too high – bfnext Jul 29 '20 at 14:51
  • Does this answer your question? [How to fix "ResourceExhaustedError: OOM when allocating tensor"](https://stackoverflow.com/questions/59394947/how-to-fix-resourceexhaustederror-oom-when-allocating-tensor) – Nicolas Gervais Jul 29 '20 at 14:52
  • @bfnext Please clear my mind: why would you need to reduce all mutations at once? Why can't you work individual by individual, rather than considering all your 2000 people? – RandomGuy Jul 29 '20 at 14:55
  • @NicolasGervais This is a good resource, but I cannot reduce the number of inputs in this case. I don't know much about convolutional layers. Can you use a convolutional layer on input data which is a _5511677 x 1_ vector? – bfnext Jul 29 '20 at 14:56
  • That was one of the many suggestions, there are other things you can do that are listed in the linked post. I'm afraid this isn't really the place to ask successive questions in the comments section, so I'm voting to close this. – Nicolas Gervais Jul 29 '20 at 14:57
  • @RandomGuy we are trying to do a study which determines which mutations contribute to a particular disease. In order to do this, we need to be able to reduce the dimensionality of each individual's data in the same way (since only then will we have a consistent input into our later machine learning models). – bfnext Jul 29 '20 at 14:58
  • @NicolasGervais I have already done everything else that is relevant in that post. – bfnext Jul 29 '20 at 15:00
  • 1
    You decreased your neurons to 600 and it worked right? So that's the answer. There's no way around the fact that the model doesn't fit in memory. – Nicolas Gervais Jul 29 '20 at 15:03
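
Following up on the suggestions above to shrink the hidden layer: a rough sizing sketch (assuming float32 weights, counting only the two Dense kernels, and ignoring biases, activations, gradients, and optimizer state) for the largest bottleneck whose weights could even be stored in 128 GB:

input_dim = 5_511_677
ram_bytes = 128 * 10**9                 # the server's 128 GB of RAM
bytes_per_weight = 4                    # float32

# The two Dense kernels together hold 2 * input_dim * hidden weights.
max_hidden = ram_bytes // (2 * input_dim * bytes_per_weight)
print(max_hidden)                       # ~2900 units, with no headroom left for training

During training, gradients plus the state kept by an optimizer such as Adam roughly quadruple that footprint, pushing the practical ceiling down toward the few-hundred-unit range, which is consistent with the report above that a 600-unit bottleneck runs while 800,000 does not.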

0 Answers