Ok, I took this code from one of Intel's MKL exercises. It creates two 4x4 matrices and multiplies them using "cblas_dgemm", then reports how long the multiplication took. I understand the code to some extent, but I don't know how to see the elements of the matrices A, B, and C (the result of the multiplication). Can you provide some code to print the elements of A, B, and C? Any hint would be appreciated, thanks.

This is the part in which the matrices are populated.

    printf("Initialize the matrices\n");
    init_matrix<double>(A, m, k);
    init_matrix<double>(B, k, n);
    printf("Initialize the results matrix to 0s\n");
    init_matrix<double>(C, m, n, 0);

This is the complete code. I would like to do something like cout << C[i][j] in a for loop to print the elements of the matrices before and after the multiplication (cblas_dgemm).

#define min(x,y) (((x) < (y)) ? (x) : (y))
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include "mkl.h"
#include <chrono>

using namespace std;
using namespace std::chrono;



template <class T>
void init_matrix(T* a, int m, int n, T value = 1) {
    for (int i = 0;i < m * n;i++) {
        if (value == 1) {
            a[i] = (T)(rand() % 100);
        }
        else {
            a[i] = 0;
        }
    }
}

auto get_time() {
    return std::chrono::high_resolution_clock::now();
}

double random(float i, float j) {
    return ((float(rand()) * j) / float((RAND_MAX)) * i);
    }



int main()
{
    int m, n, k;
    m = 4, k = 4, n = 4;
    
    double* A, * B, * C;
    
    double alpha, beta; 

    alpha = 1.0; beta = 0.0;
    printf(" Allocating memory for matrices aligned on 64-byte boundary for better \n"
        " performance \n\n");

    A = (double*)mkl_malloc(m * k * sizeof(double), 64);
    B = (double*)mkl_malloc(k * n * sizeof(double), 64);
    C = (double*)mkl_malloc(m * n * sizeof(double), 64);

    printf("Initialize the matrices\n");
    init_matrix<double>(A, m, k);  // A must be initialized before it is passed to cblas_dgemm
    init_matrix<double>(B, k,n);
    printf("Initialize the results matrix to 0s\n");
    init_matrix<double>(C, m,n,0);
    
    printf("\ncompute the execution time\n");
    auto mkl_t1 = get_time();
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, A, k, B, n, beta, C, n);
    auto mkl_t2 = get_time();

    printf("Execution time\n");
    auto mkl_time_span = duration_cast<duration<double>>(mkl_t2 - mkl_t1);
    cout << "Elapsed time MKL: " << mkl_time_span.count() << " s" << endl;

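    // Release the buffers allocated with mkl_malloc.
    mkl_free(A);
    mkl_free(B);
    mkl_free(C);

    system("pause");  // must come before return, otherwise it is never reached
    return 0;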
}


  • "I would like to do something like cout in a for loop..." Okay, try doing that? What obstacles did you encounter attempting this? – Nathan Pierson Jan 22 '23 at 02:42
  • https://stackoverflow.com/questions/37584648/accessing-array-elements-in-c – DYZ Jan 22 '23 at 02:42
  • @NathanPierson When I execute this, I get the memory address: `cout << "VALUE: " << C; // VALUE: 000001EF15478B4`. And when I do this, I get an error: `for (int i = 0; i <= 4; i++) { for (int j = 0; j <= 4; j++) { cout << C[i][j] << " "; } }`. It says: expression must have pointer-to-object type but it has type "double"... subscripted value is not an array, pointer, or vector – Anthony J. B. Jan 22 '23 at 02:47
  • Well, that's because `C` is a `double*`. Printing it to the console prints the address it points to. It looks like this code is following the approach a lot of people suggest for matrices: Instead of actually having an array of arrays, e.g. a `double[4][4]`, you have a single array storing all the elements of the matrix. Does the material you're using explain how to go from a `(row, col)` pair to an index within the array? – Nathan Pierson Jan 22 '23 at 02:50
  • @NathanPierson Yes, you are right: the matrices in this exercise are flat arrays, and each row follows the previous one in memory. Thanks for your comments. I was only able to see the values with one for loop per matrix. Now I need to change that somehow. Unfortunately, there is no precise documentation about the implementation. – Anthony J. B. Jan 22 '23 at 03:09
  • There is a lot of side information in your question, but it looks like it boils down to how can you see the 16 elements allocated by the statement `C = (double*)mkl_malloc(16 * sizeof(double), 64);`. (Did you try something similar to `init_matrix`, but instead of setting elements (`a[i]`) of the matrix, you stream them to `std::cout`?) Is there a reason to include the code for doing more than initializing then printing a single matrix (cf. [mre])? – JaMiT Jan 22 '23 at 05:45
  • `int main() { int i = 10; int* p = &i; void const* vp = p; cout << vp << "\n"; }` – Eljay Jan 22 '23 at 13:00
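
The row-major layout discussed in the comments above can be sketched as follows. This is a minimal sketch, not code from the Intel exercise: `print_matrix` is a hypothetical helper name, and it assumes the matrix is stored row by row in one flat `double` array with no padding, so element `(i, j)` sits at index `i * cols + j`.

```cpp
#include <iomanip>
#include <iostream>
#include <ostream>

// Hypothetical helper (not part of MKL): prints a rows x cols matrix that is
// stored row-major in one flat array, so element (i, j) is a[i * cols + j].
void print_matrix(std::ostream& os, const char* name,
                  const double* a, int rows, int cols) {
    os << name << " (" << rows << "x" << cols << ")\n";
    for (int i = 0; i < rows; ++i) {          // note: i < rows, not i <= rows
        for (int j = 0; j < cols; ++j) {      // note: j < cols, not j <= cols
            os << std::setw(8) << a[i * cols + j];
        }
        os << '\n';
    }
}
```

In the question's `main`, a call such as `print_matrix(std::cout, "C", C, m, n);` placed before and after the `cblas_dgemm` line would show the result matrix before and after the multiplication. The loop bounds use `<` rather than `<=`, since indices for a 4x4 matrix run from 0 to 3.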

1 Answer

You could refer to the CBLAS_DGEMM example at the path below, which also prints the output data: Intel\oneAPI\mkl\2023.0.0\examples\examples_core_c\c\blas\source\cblas_dgemmx.c. In addition, see the link below for more details on how to use the routine.

https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/blas-and-sparse-blas-routines/blas-routines/blas-level-3-routines/cblas-gemm.html

  • From [answer]: *"Always quote the most relevant part of an important link, in case the external resource is unreachable or goes permanently offline."* In other words, your answer should still have value if the links go dead. – JaMiT Jan 28 '23 at 06:27