I'm currently experimenting with std::async, since I read that it performs better than std::thread. I wrote a simple program with a function ("waitsome") that takes roughly 500 ms to run on my computer. However, if I run it 16 times through std::async instead of once directly, the whole thing takes a whopping 50 s, far more than the roughly 8 s that 16 sequential calls would need.
I already found out that the destructor of a future returned by std::async may block, so you should make sure the future is assigned or moved if it lives in a limited scope. Hence I std::move the future into the vector holding the futures. Other than that I have no real idea. I used the "Very Sleepy" profiler to check which function wastes all the time and got this image:
Please find the source code below. The platform is Windows, and the compiler is VS2022 (invoked from VS Code). Do I have a general misconception about std::async? Essentially, I want to create worker threads and get the results as std::futures.
#include <iostream>
#include <future>
#include <thread>
#include <chrono>
#include <vector>

constexpr unsigned int highestSequence = 1000000;

void waitsome();

int main(int argc, char** argv)
{
    auto startTime = std::chrono::high_resolution_clock::now();

    // parallel portion: launch 16 tasks, then wait for all of them
    std::vector<std::future<void>> futVec;
    for(unsigned int i = 0; i < 16; i++)
    {
        // move the future into the vector so it outlives the loop body
        futVec.push_back(std::move(std::async(std::launch::async, &waitsome)));
    }
    for(unsigned int i = 0; i < futVec.size(); i++)
    {
        futVec.at(i).wait();
    }
    auto stopTime = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> computationTime = stopTime - startTime;
    std::cout << "async computation took " << computationTime.count() << " ms" << std::endl;

    // sequential portion: a single direct call for comparison
    startTime = std::chrono::high_resolution_clock::now();
    waitsome();
    stopTime = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> computationTimeSingle = stopTime - startTime;
    std::cout << "single computation took " << computationTimeSingle.count() << " ms" << std::endl;
}

void waitsome()
{
    unsigned int a = 1;
    for(unsigned int i = 0; i < highestSequence; i++)
    {
        // a fresh vector is constructed and destroyed on every iteration
        std::vector<unsigned int> myvec;
        a += 2;
        myvec.push_back(a);
    }
    return;
}
The output of three consecutive runs looks like this:
async computation took 52775.2 ms
single computation took 498.063 ms
async computation took 52890.9 ms
single computation took 502.281 ms
async computation took 52680.8 ms
single computation took 516.881 ms