
If I have a tensor contraction A[a,b] * B[b,c,d] = C[a,c,d] with the properties B[b,c,d] = B[b,d,c] and C[a,c,d] = C[a,d,c], how do I set up BLAS to exploit this symmetry?

Einstein summation notation is assumed here, i.e., repeated indices are summed over.

sgemm http://www.netlib.org/lapack/explore-html/db/dc9/group__single__blas__level3_gafe51bacb54592ff5de056acabd83c260.html#gafe51bacb54592ff5de056acabd83c260 seems to concern the symmetry of a matrix, rather than of a rank-3 tensor.

I could try to flatten/reshape tensor B into a lower-dimensional array, but reshaping a tensor also takes time, at least in Fortran. How can I speed up the reshape in a higher-rank tensor contraction with BLAS in Fortran?

AlphaF20
  • Why do you need to exploit the symmetry? The performance of DSYMM is often little different from DGEMM, so why not just use DGEMM directly as in the second linked answer? – Ian Bush Mar 15 '21 at 08:01
  • I thought if there is a symmetry, I can save a factor of 2 for the computational time. If it is not the case with BLAS, then fine. – AlphaF20 Mar 15 '21 at 13:13
  • Though there is symmetry in one of the input matrices, there is not necessarily any in the output (https://math.stackexchange.com/questions/3376666/multiplication-of-two-symmetric-matrices-may-not-be-symmetric/3376671) Thus you still need to generate the same number of results. Thus the number of operations is the same. Thus to zeroth order the time is the same. The main use of the symmetric routines is to avoid accessing one half of the matrix, possibly using that half for other reasons – Ian Bush Mar 15 '21 at 13:30
  • I think in the present case, the symmetry is preserved: ```C[a,d,c] = A[a,b]*B[b,d,c] = A[a,b]*B[b,c,d] = C[a,c,d]```, assuming Einstein summation notation. If I use nested loops, I can use ```for a = 1, na; for b=1, nb; for c=1, nc; for d=1,nd ``` or ```for a = 1, na; for b=1, nb; for c=1, nc; for d=1,nc ``` and then copy the other half of array ```C``` in indices ```c``` and ```d```. – AlphaF20 Mar 15 '21 at 16:02

1 Answer


The matrix operation C_{acd} = A_{ab} . B_{bcd} can be written programmatically as a double loop of matrix * vector operations (using matmul for clarity; replace with BLAS as desired):

n = size(B,3) ! = size(B,2)
do d=1,n
  do c=1,n
    C(:,c,d) = matmul(A(:,:), B(:,c,d))
  enddo
enddo
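
Incidentally, because Fortran arrays are column-major, this square loop is equivalent to a single GEMM call with B and C viewed as nb-by-(n*n) and na-by-(n*n) matrices; the memory layout is identical, so no explicit reshape or copy is needed. A sketch (the names na, nb, n and real(8) arrays A(na,nb), B(nb,n,n), C(na,n,n) are assumptions, and a BLAS library must be linked):

```fortran
! B(nb,n,n) occupies the same memory as an nb x (n*n) matrix, and
! C(na,n,n) the same as an na x (n*n) matrix, so DGEMM can consume
! them directly without any reshape.
call dgemm('N', 'N', na, n*n, nb, 1.0d0, A, na, B, nb, 0.0d0, C, na)
```

This does not exploit the symmetry, but it is the one-big-GEMM baseline any symmetry-aware variant has to beat.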

Since "C[a,d,c]=C[a,c,d]", the square loop of matmul can be replaced with a triangular loop of matmul and a triangular loop of just copying, as:

n = size(B,3) ! = size(B,2)
do d=1,n
  do c=1,d
    C(:,c,d) = matmul(A(:,:), B(:,c,d))
  enddo
  
  ! C(:,c,d) for c < d was computed in the loop above, so write its
  ! symmetric partner (copying in the other direction would read
  ! entries of C that have not been computed yet)
  do c=1,d-1
    C(:,d,c) = C(:,c,d)
  enddo
enddo

Exploiting the symmetry roughly halves the number of BLAS operations, which helps performance, but replacing one big matrix * matrix multiplication with many matrix * vector multiplications hurts it. Whether the net effect is a speedup or a slowdown is best determined by trying both and measuring.
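
A possible middle ground is to batch the triangular part into one matrix * matrix product per d: the columns B(:,1,d) .. B(:,d,d) are contiguous in memory, so each slab is a valid nb-by-d matrix. A sketch (the names na, nb, n and real(8) arrays A(na,nb), B(nb,n,n), C(na,n,n) are assumptions, and a BLAS library must be linked):

```fortran
! One DGEMM per d over the triangular slab B(:,1:d,d), computing
! C(:,c,d) for c = 1..d. The slab starts at B(1,1,d) and has leading
! dimension nb, so it can be passed to DGEMM without a copy.
do d = 1, n
  call dgemm('N', 'N', na, d, nb, 1.0d0, A, na, B(1,1,d), nb, &
             0.0d0, C(1,1,d), na)
end do

! Fill the remaining half of C by symmetry: C(:,d,c) = C(:,c,d) for c < d.
! All of these entries exist after the loop above, so a second pass is safe.
do d = 1, n
  do c = 1, d - 1
    C(:,d,c) = C(:,c,d)
  end do
end do
```

This keeps the factor-of-two saving in floating-point work while keeping the BLAS calls reasonably coarse-grained; it is still worth benchmarking against the single big GEMM over all n*n columns.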

veryreverie
  • Thanks again for your answer. In using DGEMM, is there any way to specify the range of indices being used in the array ```B```? By the size of ```K``` in https://stackoverflow.com/questions/66296334/openmp-with-blas? Should I count from the first index of ```B```? If the multiplication is ```B(c,:,d)```, is there any way to specify that in DGEMM? I tried introducing smaller arrays, e.g., ```B_small(:) = B(:,c,d)```, ```C_small(:) = C(:,c,d)```, but manipulating the new arrays takes quite some time – AlphaF20 Apr 01 '21 at 08:01
  • @AlphaF20 can you ask this question as a new question please – veryreverie Apr 01 '21 at 17:25
  • 1
    Yes! I will do it! – AlphaF20 Apr 01 '21 at 17:47
  • Posted in https://stackoverflow.com/questions/66909766/blas-with-a-selected-dimension-in-array, not sure if I have expressed my question clearly – AlphaF20 Apr 01 '21 at 18:27