4

I have

vector < vector < int > > data_mat ( 3, vector < int > (4) );
vector          < int >   data_vec ( 3                     ); 

where data_mat can be thought of as a matrix and data_vec as a column vector, and I'm looking for a way to compute the inner product of every column of data_mat with data_vec, and store it in another vector < int > data_out (4).

The example http://liveworkspace.org/code/2bW3X5%241 using for_each and transform, can be used to compute column sums of a matrix:

sum=vector<int> (data_mat[0].size());
for_each(data_mat.begin(), data_mat.end(),
         [&](const std::vector<int>& c) {
            std::transform(c.begin(), c.end(), sum.begin(), sum.begin(),
                           [](int d1, double d2) 
                             { return d1 + d2; }
                          );
            }
        );

Is it possible, in a similar way (or in a slightly different way that uses STL functions), to compute column dot products of matrix columns with a vector?

The problem is that the 'd2 = d1 + d2' trick does not work here in the column inner product case -- if there is a way to include a d3 as well that would solve it ( d3 = d3 + d1 * d2 ) but ternary functions do not seem to exist in transform.

alle_meije
  • 2,424
  • 1
  • 19
  • 40
  • [`inner_product`](http://en.cppreference.com/w/cpp/algorithm/inner_product)? – Peter Wood Mar 04 '13 at 22:43
  • Npte, if you want to do serious math then it's probably time to dump the stl and go with a C library like the [GSL](http://www.gnu.org/software/gsl/). That the stl would even be getting in to this sort of thing surprises me, the speed difference is probably 2 full orders of magnitude... – Matt Phillips Mar 04 '13 at 23:07
  • I'm not sure I understand what you're trying to achieve. You talk about dot products, but I see no multiplication in here – Andy Prowl Mar 04 '13 at 23:07
  • I don't have the code for a dot product, as explained in the text below the code example for a column sum (see the multiplication in the text). AFAIK there is no way to do `{ d3 + d1 * d2; }` in transform. – alle_meije Mar 05 '13 at 15:02
  • 2
    Well, no need to go to any C library, no problem with the STL, but *Matt* is at least right in that if you want to do serious maths, then at least drop this awfully inappropriate vector of vectors and use a proper matrix data structure (that stores its internal data **contiguously**). Whoever told you that dynamic arrays of dynamic arrays are a good way to represent 2-dimensional arrays **is plain wrong**. Other than that still an interesting question, though. – Christian Rau Mar 05 '13 at 16:33
  • Point taken and understood @ChristianRau!To be honest it is part of a proof of concept to others about using the STL, and the fact that you can, if you absolutely insist, use those iterators for almost anything. I am stuck with this one but I'm interested whether it can be done. The simplest way to do matrices I guess is this http://stackoverflow.com/questions/12657811 -- which I will exclusively use after demonstrating the hopelessly inefficient / inappropriate / inert / etc way. – alle_meije Mar 06 '13 at 12:48
  • @alle_meije In fact you can use the standard library algorithms fine with any datastructure you like, not neccessarily with standard containers only, or let your storage/layout-optimized matrix datastructure provide its own iterators for use with the standard algorithms. It's just the `vector` that is rubbish, not the general `transform` or `` approach. And like said, the idea of achieving such tasks as the one in the question with STL algorithms only is a really nice one, even if just of theoretical relevance in this particular case. – Christian Rau Mar 06 '13 at 13:18
  • @alle_meije About your bounty: I only show the one that's on here now. You can award it soon if you wish. There is no prior bounty on this question. Thanks! – Andrew Barber Apr 24 '13 at 08:14

1 Answers1

5

In fact you can use your existing column sum approach nearly one to one. You don't need a ternary std::transform as inner loop because the factor you scale the matrix rows with before summing them up is constant for each row, since it is the row value from the column vector and that iterates together with the matrix rows and thus the outer std::for_each.

So what we need to do is iterate over the rows of the matrix and multiply each complete row by the corresponding value in the column vector and add that scaled row to the sum vector. But unfortunately for this we would need a std::for_each function that simultaneously iterates over two ranges, the rows of the matrix and the rows of the column vector. To achieve this, we could use the usual unary std::for_each and just do the iteration over the column vector manually, using an additional iterator:

std::vector<int> sum(data_mat[0].size());
auto vec_iter = data_vec.begin();
std::for_each(data_mat.begin(), data_mat.end(), 
              [&](const std::vector<int>& row) {
                 int vec_value = *vec_iter++;    //manually advance vector row
                 std::transform(row.begin(), row.end(), sum.begin(), sum.begin(),
                                [=](int a, int b) { return a*vec_value + b; });
              });

The additional manual iteration inside the std::for_each isn't really that idiomatic use of the standard library algorithms, but unfortunately there is no binary std::for_each we could use.


Another option would be to use std::transform as outer loop (which can iterate over two ranges), but we don't really compute a single value in each outer iteration to return, so we would have to just return some dummy value from the outer lambda and throw it away by using some kind of dummy output iterator. That wouldn't be the cleanest solution either:

//output iterator that just discards any output
struct discard_iterator : std::iterator<std::output_iterator_tag,
                                        void, void, void, void>
{
    discard_iterator& operator*() { return *this; }
    discard_iterator& operator++() { return *this; }
    discard_iterator& operator++(int) { return *this; }
    template<typename T> discard_iterator& operator=(T&&) { return *this; }
};

//iterate over rows of matrix and vector, misusing transform as binary for_each
std::vector<int> sum(data_mat[0].size());
std::transform(data_mat.begin(), data_mat.end(), 
               data_vec.begin(), discard_iterator(), 
               [&](const std::vector<int>& row, int vec_value) {
                   return std::transform(row.begin(), row.end(), 
                                         sum.begin(), sum.begin(),
                                         [=](int a, int b) { 
                                             return a*vec_value + b;
                                         });
               });

EDIT: Although this has already been discussed in comments and I understand (and appreciate) the theoretic nature of the question, I will still include the suggestion that in practice a dynamic array of dynamic arrays is an awfull way to represent such a structurally well-defined 2D array like a matrix. A proper matrix data structure (which stores its contents contigously) with the appropriate operators is nearly always a better choice. But nevertheless due to their genericity you can still use the standard library algorithms for working with such a custom datastructure (maybe even by letting the matrix type provide its own iterators).

Christian Rau
  • 45,360
  • 10
  • 108
  • 185
  • Thanks! I had not thought of the idea of creating an additional vector and accessing it from 'inside'. In the application I want to demonstrate, the rows are numbered images and the columns are pixel indices -- in that case things are not actually too bad because large blocks are still contiguous. Even then it is still true that if you know enough about your data to define inner products, data structures that benefit from such knowledge are always more efficient. – alle_meije Mar 06 '13 at 15:23
  • @alle_meije Yeah, it doesn't make sense. But wait, I'll rework it, in the end we don't even need a temporary vector, but a binary `for_each`. – Christian Rau Apr 23 '13 at 14:33
  • @alle_meije Ok, updated the answer. The fact that `std::for_each` cannot iterate over two ranges simultaneously doesn't allow for a completely clean and streamlined STL solution, but it frees us from the need for the additional temporary vector that my first (incorrect) solution used. – Christian Rau Apr 23 '13 at 15:04
  • This is brilliant! Thanks Christian, I secretly like the look of manual tampering with an 'official' iterator :). – alle_meije Apr 25 '13 at 07:31
  • this is the program in which I use the code: https://github.com/amwink/bias/blob/master/cpp/fastecm/fastecm.cpp – alle_meije Nov 18 '14 at 15:47