
Since I need to preprocess the data before using TensorFlow to train models, I need to modify values inside tensors. However, I have no idea how to modify the values in a tensor the way I would with NumPy.

Ideally I would be able to modify the tensor directly, but that does not seem possible in the current version of TensorFlow. An alternative is to convert the tensor to an ndarray for the processing, and then use tf.convert_to_tensor to convert it back.

The key is how to convert a tensor to an ndarray.
1) tf.contrib.util.make_ndarray(tensor): https://www.tensorflow.org/versions/r0.8/api_docs/python/contrib.util.html#make_ndarray
This seems to be the easiest way according to the documentation, yet I cannot find this function in the current version of TensorFlow. Moreover, its input is a TensorProto rather than a tensor.
2) Use a.eval() to copy a into an ndarray.
Yet, it works only when using tf.InteractiveSession() in a notebook.

A simple case with code is shown below. The goal of this code is to make tfc have the same output as npc after the processing.

HINT
You should treat tfc and npc as independent of each other. This matches the situation where the retrieved training data initially arrives in tensor format via tf.placeholder().


Source code

import numpy as np
import tensorflow as tf
tf.InteractiveSession()

tfc = tf.constant([[1.,2.],[3.,4.]])
npc = np.array([[1.,2.],[3.,4.]])
row = np.array([[.1,.2]])
print('tfc:\n', tfc.eval())
print('npc:\n', npc)
# Element-wise updates work on the ndarray...
for i in range(2):
    for j in range(2):
        npc[i,j] += row[0,j]

# ...but tfc is a constant tensor and is not modified in place.
print('modified tfc:\n', tfc.eval())
print('modified npc:\n', npc)

Output:

tfc:
[[ 1. 2.]
[ 3. 4.]]
npc:
[[ 1. 2.]
[ 3. 4.]]
modified tfc:
[[ 1. 2.]
[ 3. 4.]]
modified npc:
[[ 1.1 2.2]
[ 3.1 4.2]]
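As an aside, the element-wise double loop above is equivalent to a single NumPy broadcast. This NumPy-only sketch (no TensorFlow needed) shows the same update in one step:

```python
import numpy as np

npc = np.array([[1., 2.], [3., 4.]])
row = np.array([[.1, .2]])

# Broadcasting adds `row` to every row of `npc` at once,
# replacing the nested i/j loop.
npc += row
print(npc)  # [[1.1 2.2]
            #  [3.1 4.2]]
```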

user3030046
2 Answers


Use assign, then eval (or sess.run) the assign op:

import numpy as np
import tensorflow as tf

npc = np.array([[1.,2.],[3.,4.]])
tfc = tf.Variable(npc) # Use variable 

row = np.array([[.1,.2]])

with tf.Session() as sess:   
    tf.initialize_all_variables().run() # need to initialize all variables

    print('tfc:\n', tfc.eval())
    print('npc:\n', npc)
    for i in range(2):
        for j in range(2):
            npc[i,j] += row[0,j]
    tfc.assign(npc).eval() # assign_sub/assign_add is also available.
    print('modified tfc:\n', tfc.eval())
    print('modified npc:\n', npc)

It outputs:

tfc:
 [[ 1.  2.]
 [ 3.  4.]]
npc:
 [[ 1.  2.]
 [ 3.  4.]]
modified tfc:
 [[ 1.1  2.2]
 [ 3.1  4.2]]
modified npc:
 [[ 1.1  2.2]
 [ 3.1  4.2]]
Sung Kim
  • Thanks! I think your approach assumes that `tfc` shares the same value as `npc` at the beginning, so that you can first process `npc` and then `assign` it to `tfc` for the same output. This is however not the case. In the practical setting, the only data you have is `tfc`, so you need to treat `npc` as if it does not exist at the beginning. So the key should be how to process the data in `tfc` itself. – user3030046 May 07 '16 at 00:20
  • @user3030046 Depends on the operations. If they are simple add/sub, use assign_sub/assign_add. For others, there are many other ways. Do you want to update elements in a tf.Tensor? If you give me a use case, I'll see what I can do. – Sung Kim May 07 '16 at 00:53
  • My case is that I am trying to implement CBOW (the last cell in [here](https://github.com/tensorflow/tensorflow/blob/e39d8feebb9666a331345cd8d960f5ade4652bba/tensorflow/examples/udacity/5_word2vec.ipynb)). It needs to sum over the vectors of all nearby skips for prediction. Specifically, it is like using a row-wise sum over a 10-by-3 matrix that outputs a 10-by-1 vector. This operation will be repeated over a 10-by-100 matrix, yielding an output matrix of identical size. Do you have any ideas? – user3030046 May 07 '16 at 02:57
  • @user3030046 Could you add another question with some sample code? It would be easier to understand your question and answer it. Thanks! – Sung Kim May 07 '16 at 04:57
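The row-wise sum described in the comment above can be sketched in plain NumPy (the shapes here are illustrative, matching the 10-by-3 example from the comment; the data is made up):

```python
import numpy as np

# Hypothetical batch: 10 rows, each holding 3 values (e.g. one value
# per nearby skip for each target word).
context = np.arange(30, dtype=float).reshape(10, 3)

# Row-wise sum: collapse the 3 entries of each row into one,
# producing a 10-by-1 output (keepdims preserves the column axis).
summed = context.sum(axis=1, keepdims=True)
print(summed.shape)  # (10, 1)
```

In TensorFlow the same reduction would be expressed with a reduce-sum over the appropriate axis rather than by mutating a tensor element by element.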

I struggled with this for a while. The answer given adds new assign operations to the graph every time it is called (and thus needlessly increases the size of the .meta file if you subsequently save a checkpoint). A better solution is to use tf.keras.backend.set_value(variable, numpy_array). You can emulate what it does in raw TensorFlow by caching the placeholder and assign op on the variable:

    import tensorflow as tf

    def set_variable_values(values_npfmt):
        for x, value in zip(tf.global_variables(), values_npfmt):
            if hasattr(x, '_assign_placeholder'):
                # Reuse the cached placeholder and assign op so the graph does not grow.
                assign_placeholder = x._assign_placeholder
                assign_op = x._assign_op
            else:
                assign_placeholder = tf.placeholder(x.dtype.base_dtype, shape=value.shape)
                assign_op = x.assign(assign_placeholder)
                x._assign_placeholder = assign_placeholder
                x._assign_op = assign_op
            tf.keras.backend.get_session().run(assign_op, feed_dict={assign_placeholder: value})
user5931