Implementing matrix multiplication and addition with AVX

Question

I want to implement multiplication and addition of two matrices in Visual C++ 2012 using AVX. I enabled Advanced Vector Extensions (/arch:AVX) in Visual Studio, but for matrix addition the running time is the same whether this option is enabled or disabled. For 100000 iterations on 4*4 matrices, the time measured with clock() is 9 in both cases. My second question: how can I implement multiplication of two matrices with AVX?

void AVXadd(
          double* pArray1,                   // [in] first source array
          double* pArray2,                   // [in] second source array
          double* pResult,                   // [out] result array
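          int nSize)                        // [in] matrix dimension (each array holds nSize*nSize doubles)
{
    // each __m256d holds 4 doubles; assumes nSize*nSize is a multiple of 4
    // and that all three arrays are 32-byte aligned
    int nLoop = (nSize*nSize)/ 4;
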

    __m256d* pSrc1 = (__m256d*) pArray1;
    __m256d* pSrc2 = (__m256d*) pArray2;
    __m256d* pDest = (__m256d*) pResult;
    for ( int i = 0; i < nLoop; i++ )
    {   
        *pDest = _mm256_add_pd(*pSrc1,*pSrc2); //add input arrays        
        pSrc1++;
        pSrc2++;
        pDest++;
    }
}


//copy data from a random matrix into aligned arrays and time the addition
void AVX(vector<vector<double>> randn,int size,int itt){
    double *A,*B,*C;
    int ARRAY_SIZE=size*size;
    //allocate 32-byte-aligned arrays:
    A = (double*) _aligned_malloc(ARRAY_SIZE * sizeof(double), 32);
    B = (double*) _aligned_malloc(ARRAY_SIZE * sizeof(double), 32);
    C = (double*) _aligned_malloc(ARRAY_SIZE * sizeof(double), 32);
    //fill A and B with the random values
    for(int i=0;i<size;i++)
        for(int j=0;j<size;j++)
            A[i*size+j]=B[i*size+j]=randn[i][j];//matrix to array

    clock_t t1, t2;
    t1=clock();
    //add
    for(int i=0;i<itt;i++)
        AVXadd(A,B,C,size);
    t2=clock();

    vector<vector<double>> c(size,vector<double>(size));
    for(int i=0;i<size;i++)
        for(int j=0;j<size;j++)
            c[i][j]=C[i*size+j];//C is result
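    cout<<"\t\t"<<t2-t1;//elapsed clock ticks

    //release the aligned allocations
    _aligned_free(A);
    _aligned_free(B);
    _aligned_free(C);
}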

You might want to read up on what exactly `/arch:AVX` does. It doesn't enable/disable AVX. Nor does it attempt to emulate AVX when it's disabled. All it really does is force VEX-encoding to all the 128-bit vector operations. — Mysticial, Dec 03 '13 at 16:35
And once you have that out of the way, you'll probably find that you've fallen into the same trap that 90% of people fall into when they attempt SIMD: Too much memory access and too little computational work. — Mysticial, Dec 03 '13 at 16:36
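
On the second part of the question (multiplication), here is a minimal sketch for the 4*4 case only, not a definitive implementation. The function name `mul4x4` is hypothetical, and the sketch assumes row-major storage and that the result does not alias either input. It uses the fact that row i of the product is a sum of the rows of the second matrix, each scaled by one entry of row i of the first. The `__AVX__` check is an assumption about the build: MSVC defines that macro under /arch:AVX and GCC/Clang under -mavx, so without either flag this sketch falls back to plain scalar code.

```cpp
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: C = A * B for 4x4 row-major matrices of doubles.
// Assumes C does not overlap A or B. Unaligned loads/stores are used,
// so the arrays do not have to be 32-byte aligned.
void mul4x4(const double* A, const double* B, double* C)
{
#if defined(__AVX__)
    // Load the four rows of B once; each register holds one full row.
    const __m256d b0 = _mm256_loadu_pd(B + 0);
    const __m256d b1 = _mm256_loadu_pd(B + 4);
    const __m256d b2 = _mm256_loadu_pd(B + 8);
    const __m256d b3 = _mm256_loadu_pd(B + 12);

    for (int i = 0; i < 4; ++i) {
        // Row i of C = A[i][0]*row0(B) + A[i][1]*row1(B) + ...
        __m256d r = _mm256_mul_pd(_mm256_set1_pd(A[4 * i + 0]), b0);
        r = _mm256_add_pd(r, _mm256_mul_pd(_mm256_set1_pd(A[4 * i + 1]), b1));
        r = _mm256_add_pd(r, _mm256_mul_pd(_mm256_set1_pd(A[4 * i + 2]), b2));
        r = _mm256_add_pd(r, _mm256_mul_pd(_mm256_set1_pd(A[4 * i + 3]), b3));
        _mm256_storeu_pd(C + 4 * i, r);
    }
#else
    // Scalar fallback used when the build does not define __AVX__.
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            double sum = 0.0;
            for (int k = 0; k < 4; ++k)
                sum += A[4 * i + k] * B[4 * k + j];
            C[4 * i + j] = sum;
        }
    }
#endif
}
```

Called as `mul4x4(A, B, C)` with three 16-element arrays, this does 16 multiplies and 12 adds on vectors per product, against 4 loads of B and 4 stores of C, so the ratio of arithmetic to memory traffic is better than in the addition loop from the question. Whether it is measurably faster on a given machine still has to be checked with a timer.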
