This is my first time using AVX. I'm trying to perform a simple multiplication on double-precision numbers, but not all of the results come out correct.

Only the first 4 results are correct; the rest are garbage.

#include <immintrin.h>
#include <iostream>
#include <math.h> 
#include <time.h>
using namespace std;

int main() {

    double *a, *b;                      // data pointers
    double *pA,*pB;                     // work pointer
    __m256d rA_AVX, rB_AVX;     // variables for AVX

    const int vector_size = 8;
    a = (double*) _mm_malloc (vector_size*sizeof(double),64);
    b = (double*) _mm_malloc (vector_size*sizeof(double),64);

    for(int i = 0; i < vector_size; i++) {
        a[i] = (rand() % 48);
        b[i] = 0.0f;
        cout << a[i] << endl;
    }

    for (int i = 0; i < vector_size; i += 8)
    {
        pA = a;
        pB = b;
        rA_AVX = _mm256_load_pd(pA);
        rB_AVX = _mm256_mul_pd(rA_AVX,rA_AVX);
        _mm256_store_pd(pB,rB_AVX);
        pA += 8;
        pB += 8;
    }

    for (int i=0; i<vector_size; i++){
        cout << endl << b[i] << endl;
    }
    _mm_free(a);
    _mm_free(b);

    system("PAUSE");
    return 0;
}
Mysticial
FrancFine
  • Your incrementing is not correct. As soon as you increment `pA` and `pB`, you overwrite them again on the next iteration, so the increment has no effect. – Mysticial Mar 15 '13 at 05:49
  • Which of the increments, please? I tried commenting out the `pA += 8`, but it still doesn't work. – FrancFine Mar 15 '13 at 05:58
  • You should probably get a better grasp of basic C or C++ before you try to do anything with AVX and vectorization. – Mysticial Mar 15 '13 at 06:01
  • I know. Thanks for reminding me. But I need to use this ASAP, so can you just give me a hint on what to do to make it work? Thanks. – FrancFine Mar 15 '13 at 06:07
  • There are actually multiple problems with the code. You can fix the increment issue by getting rid of the `pA += 8` and replacing `pA = a;` with `pA = a + i;` (same with `pB`). But then you're skipping elements, which may or may not be intentional. If it isn't, also change `i += 8` to `i += 4`. – Mysticial Mar 15 '13 at 06:20
  • Out of curiosity: why do you need to use AVX? Especially with 256-bit double vectors, which have to be emulated unless you somehow acquired an AVX2-capable processor (read: Haswell). – Grizzly Mar 15 '13 at 08:07
  • @Grizzly: AVX arithmetic ops are not "emulated" on SNB or IVB; only loads and stores are cracked into 2 µops. (The questioner's example happens to be store-bound, but there is absolutely good reason to use AVX on current processors in general) – Stephen Canon Mar 15 '13 at 15:45
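
For readers who land here, the fix described in the comments (index from `a + i` instead of resetting the pointers, and step by 4 because a `__m256d` holds four doubles) might look roughly like the sketch below. The helper name `square_avx` and the `AVX_TARGET` macro are illustrative and not part of the original post. The sketch also uses the unaligned `_mm256_loadu_pd` / `_mm256_storeu_pd` intrinsics so that it works with any pointer, whereas the question uses the aligned variants with `_mm_malloc`. A scalar loop handles any leftover elements when the length is not a multiple of 4.

```cpp
#include <immintrin.h>
#include <cstddef>

// Assumption: on GCC/Clang this attribute enables AVX for one function, so no
// -mavx flag is needed. With MSVC, compile with /arch:AVX instead.
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Hypothetical helper (not from the original post): writes b[i] = a[i] * a[i].
AVX_TARGET
void square_avx(const double* a, double* b, std::size_t n) {
    std::size_t i = 0;

    // One __m256d holds 4 doubles, so advance by 4 and address the data
    // relative to i rather than re-assigning the pointers on every pass.
    for (; i + 4 <= n; i += 4) {
        __m256d v = _mm256_loadu_pd(a + i);            // load a[i..i+3]
        _mm256_storeu_pd(b + i, _mm256_mul_pd(v, v));  // store the squares
    }

    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < n; ++i) {
        b[i] = a[i] * a[i];
    }
}
```

With the arrays from the question, the call would be `square_avx(a, b, vector_size);`, and all 8 entries of `b` should then hold the squares of the corresponding entries of `a`.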