25

OK, I'd like to do a 1-dimensional convolution of time series data in TensorFlow. This is apparently supported using tf.nn.conv2d, according to these tickets and the manual; the only requirement is to set strides=[1,1,1,1]. Sounds simple!

However, I cannot work out how to do this in even a very minimal test case. What am I doing wrong?

Let's set this up.

import tensorflow as tf
import numpy as np
print(tf.__version__)
>>> 0.9.0

OK, now generate a basic convolution test on two small arrays. I will make it easy by using a batch size of 1, and since time series are 1-dimensional, I will have an "image height" of 1. And since it's a univariate time series, clearly the number of "channels" is also 1, so this will be simple, right?

g = tf.Graph()
with g.as_default():
    # data shape is "[batch, in_height, in_width, in_channels]",
    x = tf.Variable(np.array([0.0, 0.0, 0.0, 0.0, 1.0]).reshape(1,1,-1,1), name="x")
    # filter shape is "[filter_height, filter_width, in_channels, out_channels]"
    phi = tf.Variable(np.array([0.0, 0.5, 1.0]).reshape(1,-1,1,1), name="phi")
    conv = tf.nn.conv2d(
        phi,
        x,
        strides=[1, 1, 1, 1],
        padding="SAME",
        name="conv")

BOOM. Error.

ValueError: Dimensions 1 and 5 are not compatible

OK, for a start, I don't understand how any dimensions can be incompatible here at all, since I've specified padding in the convolution op.

But fine, maybe there are limits to that. I must have got the documentation confused and set up this convolution on the wrong axes of the tensor. I'll try all possible permutations:

for i in range(4):
    for j in range(4):
        shape1 = [1,1,1,1]
        shape1[i] = -1
        shape2 = [1,1,1,1]
        shape2[j] = -1
        x_array = np.array([0.0, 0.0, 0.0, 0.0, 1.0]).reshape(*shape1)
        phi_array = np.array([0.0, 0.5, 1.0]).reshape(*shape2)
        try:
            g = tf.Graph()
            with g.as_default():
                x = tf.Variable(x_array, name="x")
                phi = tf.Variable(phi_array, name="phi")
                conv = tf.nn.conv2d(
                    x,
                    phi,
                    strides=[1, 1, 1, 1],
                    padding="SAME",
                    name="conv")
                init_op = tf.initialize_all_variables()
            sess = tf.Session(graph=g)
            sess.run(init_op)
            print("SUCCEEDED!", x_array.shape, phi_array.shape, conv.eval(session=sess))
            sess.close()
        except Exception as e:
            print("FAILED!", x_array.shape, phi_array.shape, type(e), e.args or e._message)

Result:

FAILED! (5, 1, 1, 1) (3, 1, 1, 1) <class 'ValueError'> ('Filter must not be larger than the input: Filter: (3, 1) Input: (1, 1)',)
FAILED! (5, 1, 1, 1) (1, 3, 1, 1) <class 'ValueError'> ('Filter must not be larger than the input: Filter: (1, 3) Input: (1, 1)',)
FAILED! (5, 1, 1, 1) (1, 1, 3, 1) <class 'ValueError'> ('Dimensions 1 and 3 are not compatible',)
FAILED! (5, 1, 1, 1) (1, 1, 1, 3) <class 'tensorflow.python.framework.errors.InvalidArgumentError'> No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 5, 1, 1) (3, 1, 1, 1) <class 'tensorflow.python.framework.errors.InvalidArgumentError'> No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 5, 1, 1) (1, 3, 1, 1) <class 'ValueError'> ('Filter must not be larger than the input: Filter: (1, 3) Input: (5, 1)',)
FAILED! (1, 5, 1, 1) (1, 1, 3, 1) <class 'ValueError'> ('Dimensions 1 and 3 are not compatible',)
FAILED! (1, 5, 1, 1) (1, 1, 1, 3) <class 'tensorflow.python.framework.errors.InvalidArgumentError'> No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 1, 5, 1) (3, 1, 1, 1) <class 'ValueError'> ('Filter must not be larger than the input: Filter: (3, 1) Input: (1, 5)',)
FAILED! (1, 1, 5, 1) (1, 3, 1, 1) <class 'tensorflow.python.framework.errors.InvalidArgumentError'> No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 1, 5, 1) (1, 1, 3, 1) <class 'ValueError'> ('Dimensions 1 and 3 are not compatible',)
FAILED! (1, 1, 5, 1) (1, 1, 1, 3) <class 'tensorflow.python.framework.errors.InvalidArgumentError'> No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 1, 1, 5) (3, 1, 1, 1) <class 'ValueError'> ('Dimensions 5 and 1 are not compatible',)
FAILED! (1, 1, 1, 5) (1, 3, 1, 1) <class 'ValueError'> ('Dimensions 5 and 1 are not compatible',)
FAILED! (1, 1, 1, 5) (1, 1, 3, 1) <class 'ValueError'> ('Dimensions 5 and 3 are not compatible',)
FAILED! (1, 1, 1, 5) (1, 1, 1, 3) <class 'ValueError'> ('Dimensions 5 and 1 are not compatible',)

Hmm. OK, it looks like there are two problems now. Firstly, the ValueError is about applying the filter along the wrong axis, I guess, although it comes in two forms.

But then the axes along which I can apply the filter are confusing too - notice that it actually constructs the graph with input shape (5, 1, 1, 1) and filter shape (1, 1, 1, 3). AFAICT from the documentation, this should be a filter that looks at one example from the batch, one "pixel" and one "channel", and outputs 3 "channels". Why does that one get past graph construction, then, when others do not?

Anyway, sometimes it fails while constructing the graph, and sometimes it constructs the graph fine and only then raises tensorflow.python.framework.errors.InvalidArgumentError. From some confusing github tickets I gather this is probably due to the fact that I'm running on CPU instead of GPU, or else to the fact that the convolution op is only defined for 32-bit floats, not 64-bit floats. If anyone could throw some light on which axes I should be aligning what on, in order to convolve a time series with a kernel, I'd be very grateful.
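
Aside: if the dtype theory is right, the InvalidArgumentError half of this should go away just by building the arrays as 32-bit floats - a guess on my part, prompted by the T=DT_DOUBLE in the failing nodes above:

x_array = np.array([0.0, 0.0, 0.0, 0.0, 1.0], dtype=np.float32).reshape(*shape1)
phi_array = np.array([0.0, 0.5, 1.0], dtype=np.float32).reshape(*shape2)

That still leaves the axis question, though.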

dan mackinlay

3 Answers

36

I am sorry to say that, but your first code was almost right. You just inverted x and phi in tf.nn.conv2d: with the arguments swapped, x was treated as the filter, so its length-5 axis landed in the filter's in_channels slot and clashed with phi's single channel - hence "Dimensions 1 and 5 are not compatible". Swapped back, it works:

g = tf.Graph()
with g.as_default():
    # data shape is "[batch, in_height, in_width, in_channels]",
    x = tf.Variable(np.array([0.0, 0.0, 0.0, 0.0, 1.0]).reshape(1, 1, 5, 1), name="x")
    # filter shape is "[filter_height, filter_width, in_channels, out_channels]"
    phi = tf.Variable(np.array([0.0, 0.5, 1.0]).reshape(1, 3, 1, 1), name="phi")
    conv = tf.nn.conv2d(
        x,
        phi,
        strides=[1, 1, 1, 1],
        padding="SAME",
        name="conv")

Update: TensorFlow now supports 1D convolution since version r0.11, using tf.nn.conv1d. I previously made a guide to using it in the Stack Overflow documentation (now extinct) that I'm pasting here:


Guide to 1D convolution

Consider a basic example with an input of length 10, and dimension 16. The batch size is 32. We therefore have a placeholder with input shape [batch_size, 10, 16].

batch_size = 32
x = tf.placeholder(tf.float32, [batch_size, 10, 16])

We then create a filter of width 3, taking 16 channels as input and producing 16 output channels.

filter = tf.zeros([3, 16, 16])  # these should be real values, not 0

Finally we apply tf.nn.conv1d with a stride and a padding:

- stride: an integer s
- padding: this works like in 2D; you can choose between SAME and VALID. SAME will output the same input length, while VALID will not add zero padding.

For our example we take a stride of 2, and VALID padding.

output = tf.nn.conv1d(x, filter, stride=2, padding="VALID")

The output shape should be [batch_size, 4, 16].
With padding="SAME", we would have had an output shape of [batch_size, 5, 16].

Olivier Moindrot
  • *facepalm* Thanks! Good catch! That solves my immediate problem. Aside: I think the behavior of conv2d with padding='SAME' is weird - convolution in usual signal processing is a function on two vectors from the *same space*, so this asymmetry in kernel length is vexing. That and the fact that they reverse one argument leads to unnecessary confusion. Anyway, that's not the current issue... – dan mackinlay Jul 01 '16 at 01:31
  • why are you reshaping to 1,1,5,1 and then your phi variable is 1,3,1,1 and not, say, 1,1,3,1? I would have expected the reshapes to match in the same dimensions. – Charlie Parker Aug 09 '16 at 16:06
  • 1
    Read the meanings of each dimension in the code. The 2nd and 3rd dimensions of the input are its height and width. The 1st and 2nd dimensions of the filter are its height and width. Basically you have to think like you are working with images of height 1. – Olivier Moindrot Aug 09 '16 at 16:15
  • @OlivierMoindrot I had already read the comments. But you have a filter of height 3 and width 1. Why are you doing that? Are you applying a convolution of 1 single number across a 1D signal with 5 numbers? Does the height 3 act as if you had 3 different weight filters each of size 1? What if I wanted my filters to be of size 2 and wanted 12 of those filters. How would you do that? – Charlie Parker Aug 09 '16 at 16:47
  • 2
    I have a filter of height 1 and width 3, acting on an input of height 1 and width 5. If you want multiple filters you can modify out_channels to 12. If you want size 2, you can modify its width from 3 to 2. The shape would be [1, 2, 1, 12] – Olivier Moindrot Aug 09 '16 at 16:51
  • @OlivierMoindrot oh sorry, I misread the second comment. You are right, your filter is of size 3. Ok so I just need to change the out_channels to 12 (the in_channel dimension is still a mystery to me though, as is why we choose out_channels to be the number of filters). Anyway, (hopefully) last question and I think I might have a working copy of what I need. If I have a 60,000 x 784 matrix where 60,000 are the data set points and 784 are the 1D signal vectors, how do I reshape it to make it be in the correct format to do 1D convolution on it? – Charlie Parker Aug 09 '16 at 17:17
  • 1
    The in_channel matches the in_channel of the input. In your case you should reshape to [batch_size, 1, 784, 1]. The filter can be [1, 2, 1, 12] with width 2 and 12 filters. The output dim will be [batch_size, 1, 784, 12] with padding "SAME" – Olivier Moindrot Aug 09 '16 at 17:33
  • @OlivierMoindrot excellent, everything seems to be working out, now I only need to also control the stride. I want it to skip 1 number when applying the conv, i.e. rather than applying one patch then moving over 1 and applying another, I want to apply one patch, skip 1, then apply the next patch. I tried strides [1,2,1,1] (since the width is the size of the filter) and it returned no errors, so I assume that is correct (just double checking). – Charlie Parker Aug 09 '16 at 18:24
  • If the input is of size `[batch_size, 1, 784, 1]` and filter is `[1, 2, 1, 12]`, your stride should be matching the dimensions of the input: `[1, 1, 2, 1]`. Check the output shape with `output.get_shape()` to see if you are correct. – Olivier Moindrot Aug 09 '16 at 19:32
  • @OlivierMoindrot yes that worked (manually checked). How did you know it was [1,1,2,1] and not [1,2,1,1]? I thought we needed to match the dimensions of the filter where we are doing the conv. Clearly I was wrong, idk why. Thanks for the help so far though! – Charlie Parker Aug 09 '16 at 21:50
  • @Pinocchio the stride is over the input, not the filter. Maybe check CS231n from Stanford to read again the part about convolutions – Olivier Moindrot Aug 10 '16 at 08:21
  • @OlivierMoindrot nvm, it was just terminology, I understand the stride. Sorry for the ongoing questions, but for 1D convolution, how do you decide how many biases to add? I thought it would always be the number of filters, one for each filter. – Charlie Parker Aug 10 '16 at 08:58
  • nvm, I just realized that I wasn't adding a bias per filter because I was adding it after the output was flattened rather than before. Now the broadcasting works fine! So I assume that was the issue with the biases. – Charlie Parker Aug 10 '16 at 09:03
  • 2
    @Pinocchio: I have added a documentation on 1D convolution – Olivier Moindrot Aug 13 '16 at 18:50
  • @OlivierMoindrot sorry for extending the conversation further, but a few months later I decided to look at your example again and I saw that your first tensor has "a length `10`, and dimension `16`". If the input tensor is 2D, how is that an example of a 1D convolution? Shouldn't the input just be of shape `[batch_size, 1, D, 1]`? That's what I would have expected for a 1D conv. Do you mind clarifying that? – Charlie Parker Sep 19 '16 at 23:17
  • In your example D is the length of the input, 10. But in my example, each of those 10 inputs contains 16 numbers. For instance, if you have a sentence of length 10 with word embeddings (of size 16) as input, you have a 10*16-dimensional input. – Olivier Moindrot Sep 19 '16 at 23:22
5

In the newer versions of TF (starting from 0.11) you have conv1d, so there is no need to use a 2D convolution to do a 1D convolution. Here is a simple example of how to use conv1d:

import tensorflow as tf
i = tf.constant([1, 0, 2, 3, 0, 1, 1], dtype=tf.float32, name='i')
k = tf.constant([2, 1, 3], dtype=tf.float32, name='k')

data   = tf.reshape(i, [1, int(i.shape[0]), 1], name='data')
kernel = tf.reshape(k, [int(k.shape[0]), 1, 1], name='kernel')

res = tf.squeeze(tf.nn.conv1d(data, kernel, stride=1, padding='VALID'))
with tf.Session() as sess:
    print(sess.run(res))
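
With padding='VALID' the kernel visits the five valid positions, computing 2*i[j] + 1*i[j+1] + 3*i[j+2] at each one (a cross-correlation, with no kernel flip), so this should print:

[  8.  11.   7.   9.   4.]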

To understand how conv1d is calculated, take a look at various examples.

Salvador Dali
  • Hi Salvador, for conv2d and conv3d there are corresponding conv2d_transpose and conv3d_transpose. How about conv1d_transpose? Is there any implementation of conv1d_transpose? Thanks. – user288609 Jan 17 '18 at 23:09
2

I think I got it to work with the requirements that I needed. The comments/details of how it works are in the code:

import numpy as np

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

task_name = 'task_MNIST_flat_auto_encoder'
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
X_train, Y_train = mnist.train.images, mnist.train.labels # N x D
X_cv, Y_cv = mnist.validation.images, mnist.validation.labels
X_test, Y_test = mnist.test.images, mnist.test.labels

# data shape is "[batch, in_height, in_width, in_channels]",
# X_train = N x D
N, D = X_train.shape
# think of it as N images with height 1 and width D.
X_train = X_train.reshape(N,1,D,1)
x = tf.placeholder(tf.float32, shape=[None,1,D,1], name='x-input')
#x = tf.Variable( X_train , name='x-input')
# filter shape is "[filter_height, filter_width, in_channels, out_channels]"
filter_size, nb_filters = 10, 12 # filter width, number of filters (hidden units)
# think of it as having nb_filters number of filters, each of size filter_size
W = tf.Variable( tf.truncated_normal(shape=[1, filter_size, 1,nb_filters], stddev=0.1) )
stride_convd1 = 2 # controls the stride for 1D convolution
conv = tf.nn.conv2d(input=x, filter=W, strides=[1, 1, stride_convd1, 1], padding="SAME", name="conv")

with tf.Session() as sess:
    sess.run( tf.initialize_all_variables() )
    sess.run(fetches=conv, feed_dict={x:X_train})
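
With padding="SAME" and a stride of 2 along the width, the output shape should be [N, 1, ceil(D/2), nb_filters], i.e. (?, 1, 392, 12) for MNIST's D = 784. A cheap way to check before running anything (my addition):

print(conv.get_shape())  # expected: (?, 1, 392, 12)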

Thanks to Olivier for the help (see the discussion in his comments for further clarification).


Manually check it:

X_train_org = np.array([[0,1,2,3]])
N, D = X_train_org.shape
X_train_1d = X_train_org.reshape(N,1,D,1)
#X_train = tf.constant( X_train_org )
# think of it as N images with height 1 and width D.
xx = tf.placeholder(tf.float32, shape=[None,1,D,1], name='xx-input')
#x = tf.Variable( X_train , name='x-input')
# filter shape is "[filter_height, filter_width, in_channels, out_channels]"
filter_size, nb_filters = 2, 2 # filter_size , number of hidden units/units
# think of it as having nb_filters number of filters, each of size filter_size
filter_w = np.array([[1,3],[2,4]]).reshape(1,filter_size,1,nb_filters)
#W = tf.Variable( tf.truncated_normal(shape=[1,filter_size,1,nb_filters], stddev=0.1) )
W = tf.Variable( tf.constant(filter_w, dtype=tf.float32) )
stride_convd1 = 2 # controls the stride for 1D convolution
conv = tf.nn.conv2d(input=xx, filter=W, strides=[1, 1, stride_convd1, 1], padding="SAME", name="conv")

#C = tf.constant( (np.array([[4,3,2,1]]).T).reshape(1,1,1,4) , dtype=tf.float32 ) #
#tf.reshape( conv , [])
#y_tf = tf.matmul(conv, C)


##
x = tf.placeholder(tf.float32, shape=[None,D], name='x-input') # N x 4
W1 = tf.Variable( tf.constant( np.array([[1,2,0,0],[3,4,0,0]]).T, dtype=tf.float32 ) ) # 2 x 4
y1 = tf.matmul(x,W1) # N x 2 = (N x 4) times (4 x 2)
W2 = tf.Variable( tf.constant( np.array([[0,0,1,2],[0,0,3,4]]).T, dtype=tf.float32 ))
y2 = tf.matmul(x,W2) # N x 2 = (N x 4) times (4 x 2)
C1 = tf.constant( np.array([[4,3]]).T, dtype=tf.float32 ) # 1 x 2
C2 = tf.constant( np.array([[2,1]]).T, dtype=tf.float32 )

p1 = tf.matmul(y1,C1)
p2 = tf.matmul(y2,C2)
y = p1 + p2
with tf.Session() as sess:
    sess.run( tf.initialize_all_variables() )
    print('manual conv')
    print(sess.run(fetches=y1, feed_dict={x:X_train_org}))
    print(sess.run(fetches=y2, feed_dict={x:X_train_org}))
    #print(sess.run(fetches=y, feed_dict={x:X_train_org}))
    print('tf conv')
    print(sess.run(fetches=conv, feed_dict={xx:X_train_1d}))
    #print(sess.run(fetches=y_tf, feed_dict={xx:X_train_1d}))

outputs:

manual conv
[[ 2.  4.]]
[[  8.  18.]]
tf conv
[[[[  2.   4.]
   [  8.  18.]]]]
Charlie Parker
  • note that to add the biases, just create a bias variable with the same number of entries as nb_filters. Then just add it and broadcasting should do the work for you. Note, don't add the bias after flattening, do it before, or you won't have a bias per filter. – Charlie Parker Aug 10 '16 at 09:04
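
A minimal sketch of that bias step, using the same shapes as the answer above (the variable name b is my own):

b = tf.Variable(tf.zeros([nb_filters]))  # one bias per filter
conv_with_bias = conv + b                # broadcasts over [batch, 1, width, nb_filters]
# flatten (e.g. for a dense layer) only after the bias has been added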