
I want to hash a stream of input messages on multiple threads, so I am trying to use a std::vector<std::future<HashData>> futures; but I am not sure how many future objects can exist in a system at the same time.

std::vector<std::future<HashData>> futures;
std::vector<std::string> messages;

for (int i = 0; i < messages.size(); i++)
{
  std::promise<HashData> promiseHashData;
  std::future<HashData> futureHashData = promiseHashData.get_future();
  futures.emplace_back(std::move(futureHashData));
  std::async(std::launch::async, [&]() {PerformHash(std::move(promiseHashData), messages[i]);});
}

std::vector<HashData> vectorOfHashData;
// wait for  all async tasks to complete
for (auto& futureObj : futures)
{
  vectorOfHashData.push_back(futureObj.get());
}

Is there any limit on the number of future objects that can be created in a system (similar to how a system may reach thread saturation if new threads are created continuously and existing ones are never destroyed)? I will be calling PerformHash() asynchronously for a large number of messages.

I have recently been exploring concurrency in C++ and want to improve the performance of the hashing task. This idea came to mind, but I am not sure whether it will work. I would like to know if I am missing something here.

devilsEye
  • Technically yes, there is a limit, but the amount of RAM in your system is going to be the real limitation. – NathanOliver Feb 02 '23 at 13:01
  • Why do you expect there to be one? – Passer By Feb 02 '23 at 13:02
  • Why do you believe that there must be some kind of an explicit limit? – Sam Varshavchik Feb 02 '23 at 13:05
  • Questioner asks “ Is there any limit for future objects to be stored in futures vector in this case”. He’s asking, not stating a belief. – Jeremy Friesner Feb 02 '23 at 13:18
  • @SamVarshavchik If we create a lot of threads continuously and do not destroy them, the system may reach a thread saturation condition where it can't create new threads, depending on its hardware configuration. I was wondering whether this applies to future objects as well in the above case. – devilsEye Feb 02 '23 at 13:21
  • This is unspecified in the C++ standard. In general, a `std::future` does not represent a limited resource that has constraints on it, and a `std::vector` only cares whether there's enough memory for whatever's in the vector. – Sam Varshavchik Feb 02 '23 at 13:23
  • Offtopic: IMHO `std::async(std::launch::async, &PerformHash, std::move(promiseHashData), std::ref(messages[i]));` looks better – Marek R Feb 02 '23 at 13:40
  • I changed your tags, and I changed the title of your question. This question should not have been tagged with [tag:multithreading], or [tag:sha256], or [tag:std-future]. The answer depends only on how many objects of a certain size a `std::vector` can hold. The purpose of tags is to facilitate search. Knowing the type of object that you are trying to put into the vector might help somebody who has found your question and is trying to answer it for you, but it is no help for anybody who is searching for similar questions. – Solomon Slow Feb 02 '23 at 13:57
  • @SolomonSlow Well, actually my question is more about how many future objects can exist in a system at the same time than about the capacity of the vector in this case. I will rephrase the question. – devilsEye Feb 02 '23 at 14:05

1 Answer


The problem isn't going to be "how many futures can a vector hold"; futures (on most systems) are just a shared pointer to a block of memory with some cheap concurrency primitives in it.
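To illustrate that futures by themselves are not a scarce resource, here is a minimal sketch that creates many promise/future pairs without starting a single thread. SumThroughFutures is a hypothetical helper written for this illustration, not something from the question.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: creates `count` promise/future pairs without
// starting any thread, fulfills every promise on the calling thread,
// and returns the sum of the values read back through the futures.
std::size_t SumThroughFutures(std::size_t count)
{
  std::vector<std::promise<std::size_t>> promises(count);
  std::vector<std::future<std::size_t>> futures;
  futures.reserve(count);
  for (auto& p : promises)
    futures.push_back(p.get_future());

  // Each promise/future pair only costs a small heap-allocated shared state.
  for (std::size_t i = 0; i < count; ++i)
    promises[i].set_value(i);

  std::size_t sum = 0;
  for (auto& f : futures)
    sum += f.get();  // already ready, so this never blocks
  return sum;
}
```

Calling SumThroughFutures(10000) holds ten thousand futures at once; the only thing that grows with the count is memory.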

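The problem is that you create a thread per future and then block forward progress until that thread has finished. If you fix that problem, your code uses dangling references: the lambda captures promiseHashData and i by reference, and promiseHashData is destroyed at the end of each loop iteration while the task may still be running.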

std::vector<std::future<HashData>> futures;
std::vector<std::string> messages;

for (int i = 0; i < messages.size(); i++)
{
  std::promise<HashData> promiseHashData;
  std::future<HashData> futureHashData = promiseHashData.get_future();
  futures.emplace_back(std::move(futureHashData));
  // This captures promiseHashData (and i) by reference.
  // It also creates a thread, then blocks until the
  // thread finishes, because the returned future is discarded.
  std::async(std::launch::async, [&]() {PerformHash(std::move(promiseHashData), messages[i]);});
}
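A sketch of one way to remove both issues marked in the comments above: keep the future that std::async returns instead of pairing a separate promise with a discarded future, and only refer to data that outlives the tasks. HashData and HashMessage below are hypothetical stand-ins for the question's HashData and PerformHash (std::hash is used only as a placeholder hash). Note that this still launches one task per message.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-ins for the question's HashData and PerformHash.
struct HashData { std::size_t value; };

HashData HashMessage(const std::string& message)
{
  return HashData{std::hash<std::string>{}(message)};
}

std::vector<HashData> HashAllWithAsync(const std::vector<std::string>& messages)
{
  std::vector<std::future<HashData>> futures;
  futures.reserve(messages.size());
  for (const std::string& message : messages)
  {
    // Storing the returned future means nothing blocks inside this loop.
    // The lambda refers to an element of `messages`, which stays alive
    // until every future has been waited on below.
    futures.push_back(std::async(std::launch::async,
                                 [&message] { return HashMessage(message); }));
  }

  std::vector<HashData> results;
  results.reserve(futures.size());
  for (auto& f : futures)
    results.push_back(f.get());
  return results;
}
```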

So a few points:

  1. Unless the hash data is worth consuming in small pieces, a future<vector<HashData>> is going to be more efficient.

  2. If you want a vector<future>, you'll also want a vector<promise>. Then create a bounded number of threads (or get them from a pool you write) and fulfill those promises.

Creating an unbounded number of futures, then creating an unbounded number of threads to service those futures, is a bad plan.
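A minimal sketch of the vector<promise> plus bounded threads idea, under some assumptions: HashData and HashMessage are hypothetical stand-ins for the question's types (std::hash is only a placeholder), the workers share an atomic index instead of a real pool, and JoinAll is a small helper written here so the threads are always joined.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <exception>
#include <functional>
#include <future>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the question's HashData and PerformHash.
struct HashData { std::size_t value; };

HashData HashMessage(const std::string& message)
{
  return HashData{std::hash<std::string>{}(message)};
}

// Joins every thread on scope exit, including when an exception propagates.
struct JoinAll
{
  std::vector<std::thread>& threads;
  ~JoinAll() { for (auto& t : threads) if (t.joinable()) t.join(); }
};

std::vector<HashData> HashWithBoundedThreads(const std::vector<std::string>& messages,
                                             unsigned workerCount)
{
  const std::size_t count = messages.size();
  std::vector<std::promise<HashData>> promises(count);
  std::vector<std::future<HashData>> futures;
  futures.reserve(count);
  for (auto& p : promises)
    futures.push_back(p.get_future());

  // Each worker repeatedly claims the next unprocessed index.
  std::atomic<std::size_t> next{0};
  auto worker = [&] {
    for (std::size_t i = next.fetch_add(1); i < count; i = next.fetch_add(1))
    {
      try { promises[i].set_value(HashMessage(messages[i])); }
      catch (...) { promises[i].set_exception(std::current_exception()); }
    }
  };

  std::vector<std::thread> workers;
  JoinAll joiner{workers};
  workerCount = std::max(1u, workerCount);
  workers.reserve(workerCount);
  for (unsigned t = 0; t < workerCount; ++t)
    workers.emplace_back(worker);

  // Results are consumed in order while the workers are still running.
  std::vector<HashData> results;
  results.reserve(count);
  for (auto& f : futures)
    results.push_back(f.get());
  return results;
}
```

A typical call might pass std::thread::hardware_concurrency() as workerCount; the number of threads then stays fixed no matter how many messages (and futures) there are.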

Finally, std::async is funny in that it returns a std::future itself. When that future is destroyed, it blocks on the completion of the thread it creates. This is atypical behavior, but it prevents losing track of a thread of execution.
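The blocking destructor can be observed with a small experiment. MaxConcurrencyWhenDiscarding is a hypothetical helper written for this illustration: it launches tasks with std::launch::async, discards each returned future as the question's loop does, and records how many tasks were ever running at the same time.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical helper: launches `taskCount` tasks and discards every
// future returned by std::async. Returns the highest number of tasks
// that were observed running at the same time.
int MaxConcurrencyWhenDiscarding(int taskCount)
{
  std::atomic<int> running{0};
  std::atomic<int> maxRunning{0};
  for (int i = 0; i < taskCount; ++i)
  {
    // The temporary future dies at the end of this statement, and its
    // destructor waits for the task, so the next launch cannot start early.
    // The cast only silences "discarded result" warnings.
    static_cast<void>(std::async(std::launch::async, [&] {
      const int now = ++running;
      int seen = maxRunning.load();
      while (now > seen && !maxRunning.compare_exchange_weak(seen, now)) {}
      std::this_thread::sleep_for(std::chrono::milliseconds(5));
      --running;
    }));
  }
  return maxRunning.load();
}
```

Because every launch waits for the previous task, the function returns 1 for any positive taskCount: the work is effectively serialized even though each task runs on its own thread.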

Yakk - Adam Nevraumont