
I am using the tf.estimator API and I have the following model_fn function:

def model_fn(features, labels, mode, params):
    labels = tf.reshape(labels, (-1, 1))

    model = word2vec.create_model(features, params)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=model)

    loss = sampled_softmax_loss.create_loss(model['softmax_w'],
                                            model['softmax_b'],
                                            model['relu_layer_1'],
                                            labels,
                                            params['softmax_sample'],
                                            params['vocabulary_size'])
    cost = tf.reduce_mean(loss)
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(mode=mode, loss=cost)

    optimizer = adam_optimizer.create_optimizer(params['learning_rate'])
    # Pass the global step so the Estimator's step counter (and stop hooks) advance
    train_operation = optimizer.minimize(cost, global_step=tf.train.get_global_step())
    if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode, loss=cost, train_op=train_operation)

    raise RuntimeError('Not a valid Mode value')
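
(`sampled_softmax_loss.create_loss` and `adam_optimizer.create_optimizer` are small helper modules; a rough sketch of what they presumably do, assuming they simply wrap `tf.nn.sampled_softmax_loss` and `tf.train.AdamOptimizer`:)

# sampled_softmax_loss.py (rough sketch, assuming a plain wrapper)
def create_loss(softmax_w, softmax_b, inputs, labels, num_sampled, vocabulary_size):
    # Sampled softmax: only `num_sampled` negative classes are evaluated per batch
    return tf.nn.sampled_softmax_loss(weights=softmax_w,
                                      biases=softmax_b,
                                      labels=labels,
                                      inputs=inputs,
                                      num_sampled=num_sampled,
                                      num_classes=vocabulary_size)

# adam_optimizer.py (rough sketch)
def create_optimizer(learning_rate):
    return tf.train.AdamOptimizer(learning_rate=learning_rate)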

The word2vec.create_model function is given below. It returns a Python dictionary with the interesting nodes of the network (e.g. the embedding matrix, and the softmax weights and bias used for training).

def create_model(features, hyper_params):
    model = {}
    vocabulary_size = hyper_params['vocabulary_size']
    hidden_size = hyper_params['hidden_size']
    feature_columns = hyper_params['feature_columns']

    with tf.variable_scope('word2vec'):
        # Creating the Embedding layer
        net = tf.feature_column.input_layer(features, feature_columns)
        # Creating the hidden layer
        net = dense_layer.create_layer(model, net, hidden_size)
        # Creating the SoftMax weight and bias variables to use in the sampled loss function 
        softmax_w = tf.Variable(tf.truncated_normal((vocabulary_size, hidden_size), stddev=0.1), name='softmax_weights')
        softmax_b = tf.Variable(tf.zeros(vocabulary_size), name='softmax_bias')

        model['softmax_w'] = softmax_w
        model['softmax_b'] = softmax_b

    return model
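
(Similarly, `dense_layer.create_layer` is a small helper; a rough sketch, assuming it just adds a fully connected ReLU layer and records it in the model dictionary as 'relu_layer_1':)

# dense_layer.py (rough sketch)
def create_layer(model, inputs, hidden_size):
    net = tf.layers.dense(inputs, hidden_size, activation=tf.nn.relu, name='relu_layer_1')
    model['relu_layer_1'] = net
    return net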

Last but not least, here is my main function, which I run with the tf.app.run(main) command:

def main():
    path = os.path.join('data', 'data.csv')
    (train_x, train_y), (test_x, test_y) = prepare_data.load_data(path, path, columns, columns[-1])

    vocabulary_size = len(train_x[columns[0]].unique())

    feature_columns = []
    for key in train_x.keys():
        item_id = tf.feature_column.categorical_column_with_identity(key=key, num_buckets=vocabulary_size)
        feature_columns.append(tf.feature_column.embedding_column(item_id, 512))

    classifier = tf.estimator.Estimator(
        model_fn=model_fn,
        params={
            'feature_columns': feature_columns,
            'vocabulary_size': vocabulary_size,
            'hidden_size': 256,
            'learning_rate': 0.001,
            'softmax_sample': 100,
        })

    print('Training the classifier...')
    classifier.train(input_fn=lambda: prepare_data.train_input_fn(train_x, train_y, 128), steps=2)

    print('Evaluating on test dataset...')
    eval_result = classifier.evaluate(input_fn=lambda: prepare_data.eval_input_fn(test_x, test_y, 128))

    print('Printing results...')
    print(eval_result)

When I run this, I get a `ValueError: GraphDef cannot be larger than 2GB` error. Why is that? What am I doing wrong?

Below is my train_input_fn:

def train_input_fn(features, labels, batch_size):
    def gen():
        for f, l in zip(features, labels):
            yield f, l

    ds = tf.data.Dataset.from_generator(gen, (tf.int64, tf.int64), (tf.TensorShape([None]), tf.TensorShape([None])))
    ds = ds.repeat().batch(batch_size)
    feature, label = ds.make_one_shot_iterator().get_next()

    return {"Input": feature}, label

The dataset is a simple csv like below:

    Input   Label
0   12600   838
1   12600   4558
2   12600   838
3   12600   4558
4   838     12600

1 Answer


`Dataset.from_tensor_slices` adds the whole dataset to the computational graph as constants (see details), which is why you hit the 2 GB `GraphDef` limit, so it is better to use `Dataset.from_generator`. I have shown an example of how to do this using MNIST: How to load MNIST via TensorFlow (including download)?
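
For illustration, the generator-based pattern looks roughly like this (a minimal sketch, assuming MNIST-style 28x28 images and labels already loaded as NumPy arrays; the function name and shapes are illustrative, not from the question's code):

def generator_input_fn(images, labels, batch_size):
    # The NumPy arrays stay in Python memory; only the generator hook-up
    # is added to the GraphDef, so the 2GB limit is never hit.
    def gen():
        for image, label in zip(images, labels):
            yield image, label

    ds = tf.data.Dataset.from_generator(
        gen,
        output_types=(tf.float32, tf.int64),
        output_shapes=(tf.TensorShape([28, 28]), tf.TensorShape([])))
    ds = ds.shuffle(1000).repeat().batch(batch_size)
    return ds.make_one_shot_iterator().get_next()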

  • I think you are right. I created a generator just like in your example, but now it doesn't play well with `tf.feature_column`. I get an `AttributeError: 'Tensor' object has no attribute 'values'` error, because it expects features to be `A mapping from key to tensors` and not a single Tensor. Check how I am creating the `tf.feature_column.embedding_column` in `main` – Dimitris Poulopoulos Jun 12 '18 at 16:29
  • I really appreciate your help. I do not use TFRecords, what do you mean? Just like in your example with mnist, I pass two numpy arrays (features, labels) to my `train_input_fn`, which returns a dataset from a generator. – Dimitris Poulopoulos Jun 12 '18 at 16:53
  • Oh sorry, that was an answer to some other question I am helping with. – Vijay Mariappan Jun 12 '18 at 17:02
  • You may want to debug the input_fn standalone (see the sketch after these comments); check: https://stackoverflow.com/questions/50789693/how-do-i-inspect-the-contents-of-tf-estimator-inputs-numpy-input-fn/50790054?noredirect=1#comment88590925_50790054 – Vijay Mariappan Jun 12 '18 at 18:11
  • My function's return type is `, ), types: (tf.int64, tf.int64)>`. I think this is what I want. But how can I read this in the input layer, which is `tf.feature_column.input_layer(features, feature_columns)`? I think that's where the problem is. – Dimitris Poulopoulos Jun 12 '18 at 18:24
  • The error indicates that your input function is not returning a feature dictionary, but tensors. Can you share your input_fn and some sample data? In the meanwhile, take a look at: https://www.tensorflow.org/api_docs/python/tf/estimator/inputs/numpy_input_fn – Vijay Mariappan Jun 12 '18 at 18:48
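
A minimal sketch of inspecting the input_fn standalone, as suggested in the comments (assuming TF 1.x graph mode and the train_input_fn and data from the question):

# Build the input pipeline outside of the Estimator and look at what it returns
features, labels = train_input_fn(train_x, train_y, batch_size=4)
print(features)   # should be a dict such as {'Input': <Tensor>}, not a bare Tensor
print(labels)

with tf.Session() as sess:
    f, l = sess.run([features, labels])
    print(f, l)   # one concrete batch to sanity-check shapes and dtypes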