How can I extract the cosine transform formula used for 2D by scipy.fft.dct?

Question

For the 1D cosine transform the documentation is clear in here, and I can reproduce it easily:

The formula is:

and here it is reproduced manually, for example for the third harmonic:

import numpy as np
from scipy.fft import fft, dct

y = np.array([10,3,5])

# dct() call:
print(dct(y, 2))

# manual reproduction for k = 2:
N = 3
k = 2
n = np.array([0,1,2])
2 * (
    y[0] * np.cos((np.pi*k*(2*n[0]+1))/(2*N)) + 
    y[1] * np.cos((np.pi*k*(2*n[1]+1))/(2*N)) + 
    y[2] * np.cos((np.pi*k*(2*n[2]+1))/(2*N)))

in both cases I get 9.

But there is no documentation for the 2D DCT, and my attempts at hacking the formula with a toy matrix are not working out:

Compare:

z = np.array([[ 2,  3       ],
             [ 10, 15]])
dct(z, axis=0)  # dct() call

to for instance:

N = 2
M = 2
k = 0
l = 0
n = np.array([0,1])
m = np.array([0,1])
M*N * (
    z[0,0] * np.cos((np.pi*k*(2*n[0]+1))/(2*N)) * np.cos((np.pi*l*(2*m[0]+1))/(2*M)) + 
    z[0,1] * np.cos((np.pi*k*(2*n[0]+1))/(2*N)) * np.cos((np.pi*l*(2*m[1]+1))/(2*M)) +
    z[1,0] * np.cos((np.pi*k*(2*n[1]+1))/(2*N)) * np.cos((np.pi*l*(2*m[0]+1))/(2*M)) +
    z[1,1] * np.cos((np.pi*k*(2*n[1]+1))/(2*N)) * np.cos((np.pi*l*(2*m[1]+1))/(2*M))
)

for the first coefficient.

Can anyone help me reconcile the output of dct() with my attempted manual calculation?

I guess the formula is not...

but it would be really easy to correct if I could get the same output manually for one of the coefficients on the example matrix above.

score 1 · Accepted Answer · answered May 05 '21 at 19:00

1

You can't find the formula for the multidimensional mode because the function doesn't do multidimensional cosine transforms. The axis keyword should be suspicious: in NumPy, SciPy it typically determines the direction along which lower-dimensional operations should be performed.

In other words, dct(z, axis=0) is just a columnwise 1d cosine transform:

import numpy as np
from scipy.fft import dct

z = np.array([[ 2,  3],
             [ 10, 15]])
print(dct(z, axis=0))  # dct() call
print(np.array([dct(column) for column in z.T]).T)
# both outputs
# [[ 24.          36.        ]
#  [-11.3137085  -16.97056275]]

Note the two transposes on the last line: all that it did was first loop over the array to slice it according to columns, then join them again columnwise. The latter would probably be better spelled out as

res = np.stack([dct(column) for column in z.T], axis=1)

answered May 05 '21 at 19:00

Andras Deak -- Слава Україні

33,737
11
83
111

Thank you very much. My motivation for the question is to eventually understand what is being done in [this answer](https://stats.stackexchange.com/a/311357/311006) with the inverse of the cosine transform using this function. I don't know if you could shed some light on it... – Antoni Parellada May 05 '21 at 19:06
If I understand your answer properly, the dct() call just does a 1D cosine transform for each column independently, which explains the output in matrix form. – Antoni Parellada May 05 '21 at 19:07
1

@Mathtourist Yes, `dct` just does a 1d cosine transform over columns (or rows, depending on `axis`) for a 2d array. The linked answer calls `idct` twice! That's how it implements a 2d `idct`. – Andras Deak -- Слава Україні May 05 '21 at 19:13
One last question, if I may, would the equation in the last part of the original post be correct, then (doing it twice)? – Antoni Parellada May 05 '21 at 19:16
@Mathtourist No, I think the correct formula has to use a 2d array. So in the sum you have `x_{nm}` (instead of `x_n y_m`), but the rest of the formula is probably correct. If you pull out one of the factors of 2 and one of the sums, you should see the formula reflecting `dct(dct(z, axis=0), axis=1)` or something along those lines. The formula should be very transparent in this regard. Let me know if I didn't explain it clearly enough. – Andras Deak -- Слава Україні May 05 '21 at 19:21
I got it working with 4 as the "normalizing" factor, using `dct(dct(z.T, axis=0).T, axis=0)`. Do you think it is correct, in light of this discussion, to apply `dct()` only once to the identity matrix and get the discrete cosine transform matrix as in [here](https://stackoverflow.com/a/53876150/15183594)? – Antoni Parellada May 05 '21 at 19:54
I would, in particular, expect identical results applied to the identity if I did apply it twice, as though it was a regular matrix as in the example above, but this is not the case: `dct(dct(np.eye(4).T, axis=0).T, axis=0)` is different to `dct(np.eye(4), axis=0)`. So in the answer to my prior comment the DCT matrix is different than if calculated applying `dct()` twice as in the post in crossvalidated... Isn't that surprising? I presume then that something is wrong with [this answer](https://stackoverflow.com/a/53876150/15183594). They probably used the function incorrectly? – Antoni Parellada May 05 '21 at 20:08
@Mathtourist OK, but that's just a verbose way of doing `dct(dct(z, axis=1), axis=0)` ;) If I understand correctly the idea behind the matrix is to turn the `dct` call into a matrix multiplication: from `x_j` you go to `y_i` with `M_{ij} x_j`. This doesn't generalize to the 2d case: you can't (easily?) transform both indices with a single matrix product. You could, however, multiply by the same matrix twice: take `M = dct(np.eye(z.shape[0]), axis=0)` and then `dct(dct(z, axis=1), axis=0)` should be the same as `M @ z @ M.T`. – Andras Deak -- Слава Україні May 05 '21 at 20:12
1

In other words, to define a "matrix" (tensor, linear transformation) that does the 2d DCT for you you'd need a _four-index_ quantity that can be applied to the indices of the original array. – Andras Deak -- Слава Україні May 05 '21 at 20:16

How can I extract the cosine transform formula used for 2D by scipy.fft.dct?

1 Answers1