First, it is worth considering how the Eigen library handles matrix multiplication.
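As a minimal sketch, assuming Eigen 3 and its dense module, a matrix-vector product in Eigen can be expressed directly with operator* (the matrix and vector sizes below are only illustrative):

#include <Eigen/Dense>
#include <iostream>

int main()
{
    // A is an m x n dense matrix, x is an n x 1 vector (here m = 4, n = 3).
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(4, 3);
    Eigen::VectorXd x = Eigen::VectorXd::Random(3);

    // Eigen evaluates the product with its own optimized kernels.
    Eigen::VectorXd y = A * x;

    std::cout << y << std::endl;
    return 0;
}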
Then, a matrix (m x n) by vector (n x 1) multiplication without Eigen could be written like this:
void mxv(int m, int n, double* a, double* b, double* c)
{
    // Computes a = b * c, where b is an m x n matrix stored row-major,
    // c is an n x 1 vector, and a is the resulting m x 1 vector.
    int i, j;

    for (i = 0; i < m; i++)
    {
        a[i] = 0.0;
        for (j = 0; j < n; j++)
            a[i] += b[i*n + j] * c[j];
    }
}
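For illustration, a small driver (with made-up values, not taken from the original text) shows how the flattened row-major layout b[i*n+j] is used when calling mxv:

#include <iostream>

// Assumes the mxv function above is visible in this translation unit.
int main()
{
    // 2 x 3 matrix stored row-major: row 0 is {1, 2, 3}, row 1 is {4, 5, 6}.
    double b[6] = {1, 2, 3, 4, 5, 6};
    double c[3] = {1, 1, 1};   // 3 x 1 vector
    double a[2];               // result, 2 x 1 vector

    mxv(2, 3, a, b, c);

    std::cout << "a = [" << a[0] << ", " << a[1] << "]" << std::endl;  // prints a = [6, 15]
    return 0;
}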
No two iterations compute the same element of the result vector a[], and the order in which the elements a[i] for i = 0, ..., m-1 are calculated does not affect the correctness of the result. These computations can therefore be carried out independently over the index i.
A loop like this is therefore entirely parallelizable, and parallelizing such loops with OpenMP is relatively straightforward.
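As a sketch, assuming a compiler with OpenMP enabled (e.g., built with -fopenmp), the outer loop can be annotated with a parallel-for directive; the name mxv_omp is only illustrative:

void mxv_omp(int m, int n, double* a, double* b, double* c)
{
    // a = b * c, with the iterations over i distributed across threads.
    // Each thread writes to a disjoint set of elements a[i], so no
    // synchronization is needed.
    #pragma omp parallel for
    for (int i = 0; i < m; i++)
    {
        a[i] = 0.0;
        for (int j = 0; j < n; j++)
            a[i] += b[i*n + j] * c[j];
    }
}

Declaring the loop indices inside the for statements keeps them private to each thread, which is exactly the independence argument made above.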