I've been skimming through the Armadillo documentation and examples, but there doesn't seem to be an efficient built-in way to subsample (or downsample) a large vector or matrix, so that if you start with N elements you end up with N / k of them. There are a few methods to shuffle and shift, but that's about it.
So for now I'm just looping over the elements sequentially, but surely there has to be a better way besides parallelizing over the available cores?
#include <armadillo>
#include <thread>
#include <vector>

bool subsample(arma::mat& data, int skipCount)
{
    const size_t processor_count = 1; // single-threaded for now: the in-place copy is not safe with more threads
    const size_t cols = data.n_cols;
    const size_t period = skipCount + 1;
    const size_t newCols = (cols + period - 1) / period;     // ceil(cols / period)
    const size_t blockSize = 256;
    const size_t numBlocks = (newCols + blockSize - 1) / blockSize; // include the partial tail block
    std::vector<std::thread> workers;
    for (size_t blockID = 0; blockID < numBlocks; ++blockID) {
        workers.push_back(std::thread([&data, blockID, newCols, period, blockSize]() {
            // copy up to blockSize columns in place (overwrites other entries)
            size_t c = blockID * blockSize;
            for (size_t b = 0; (c < newCols) && (b < blockSize); ++c, ++b) {
                arma::vec v = data.col(period * c); // temporary avoids aliasing between source and destination columns
                data.col(c) = v;
            }
        }));
        if (workers.size() == processor_count) {
            for (auto& t : workers) t.join();
            workers.clear();
        }
    }
    for (auto& t : workers) t.join(); // make sure all threads finish
    data.resize(data.n_rows, newCols); // resize() preserves the existing elements
    return true;
}
If you have any suggestions to improve on this, they would be greatly appreciated. It would also be nice to do this in place, to save on memory.