
There are a number of compressed data chunks that have to be decompressed asynchronously, without blocking or slowing down the main thread in any way.

Decompressed chunks will be used by the main thread as soon as they are decompressed.

Currently I do it like this:

foreach (var chunkPair in compressedChunkData)
{                        
    var task = Task.Factory.StartNew<Chunk>(() =>
    {
        var compressedBytes = Convert.FromBase64String(chunkPair.Value);
        var chunk = Decompress(compressedBytes);
        return chunk;
    }).ContinueWith((finishedTask) =>
    {
        var chunk = finishedTask.Result;
        TaskFinishActions.Enqueue(() =>
        {
            chunk.PostSerialize();
            document.Chunks.Add(chunkPair.Key, chunk);
        });
    });
}
// By the time we get here 20ms has passed!!!

The problem is that this seems to hijack the core the main thread is running on, which butchers performance.

Is there a way to make the TaskFactory use one thread per core and context-switch away from the main thread only in those brief moments when the main thread is blocked?

EDIT: the foreach loop is not the only part of the code which becomes slow; as long as there is a sizable number of decompression tasks running, the main thread slows down significantly.

EDIT2: New data to decompress arrives all the time; the loop is not run only once:

  • Let's say you have 250 items arriving in compressedChunkData first
  • Next frame you have 10 items, then 12, then 0, then 2, etc.

2 Answers


You could use a custom TaskScheduler that sets the thread priority to a low value. Windows always schedules higher priority threads first.
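A minimal sketch of such a scheduler (the class name `LowPriorityTaskScheduler` and the worker count are illustrative, not from any library):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Sketch: runs queued tasks on dedicated background threads whose
// priority is BelowNormal, so the OS prefers the main thread whenever
// both are runnable.
public sealed class LowPriorityTaskScheduler : TaskScheduler
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();

    public LowPriorityTaskScheduler(int workerCount)
    {
        for (int i = 0; i < workerCount; i++)
        {
            var thread = new Thread(Worker)
            {
                IsBackground = true,
                Priority = ThreadPriority.BelowNormal
            };
            thread.Start();
        }
    }

    private void Worker()
    {
        foreach (var task in _queue.GetConsumingEnumerable())
            TryExecuteTask(task);
    }

    protected override void QueueTask(Task task) => _queue.Add(task);

    // Inlining would run the task on the caller's (possibly main) thread; avoid it.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => false;

    protected override IEnumerable<Task> GetScheduledTasks() => _queue.ToArray();
}
```

You would then pass the scheduler to `Task.Factory.StartNew(work, CancellationToken.None, TaskCreationOptions.None, scheduler)` so decompression never competes at normal priority with the main thread.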

Maybe you need to put an expiration date on tasks so that they don't queue up too much. It sounds like you have a need for low-latency processing. Each task could check, as its first action, whether it was scheduled more than N seconds ago, and if so exit immediately.
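For example, the expiry check could be wrapped in a small helper (the name `StartWithExpiry` and the `maxAge` knob are my own, for illustration):

```csharp
using System;
using System.Threading.Tasks;

static class ExpiringTasks
{
    // Sketch: capture the enqueue time with each work item and skip
    // stale ones instead of doing work nobody needs anymore.
    public static Task<T> StartWithExpiry<T>(Func<T> work, TimeSpan maxAge) where T : class
    {
        var enqueuedAt = DateTime.UtcNow;
        return Task.Factory.StartNew(() =>
        {
            if (DateTime.UtcNow - enqueuedAt > maxAge)
                return null; // too old: the caller no longer needs this chunk
            return work();
        });
    }
}
```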

An alternative design would be a producer/consumer scenario with low-priority threads taking work. I see no need for this given your requirements, but it's a more flexible solution. Creating hundreds of tasks is not a problem; each task is just a small in-memory data structure. Tasks are not threads.
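A sketch of that producer/consumer variant (class and member names are illustrative): the main thread posts compressed payloads, and low-priority workers drain the queue into a results collection the main thread can poll.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Sketch: fixed pool of BelowNormal-priority worker threads consuming
// from a blocking queue, so decompression yields to the main thread.
public sealed class DecompressionPool<TIn, TOut>
{
    private readonly BlockingCollection<TIn> _work = new BlockingCollection<TIn>();
    public readonly ConcurrentQueue<TOut> Results = new ConcurrentQueue<TOut>();

    public DecompressionPool(int workers, Func<TIn, TOut> process)
    {
        for (int i = 0; i < workers; i++)
        {
            new Thread(() =>
            {
                foreach (var item in _work.GetConsumingEnumerable())
                    Results.Enqueue(process(item));
            })
            { IsBackground = true, Priority = ThreadPriority.BelowNormal }.Start();
        }
    }

    public void Post(TIn item) => _work.Add(item);     // producer side (main thread)
    public void Complete() => _work.CompleteAdding();  // no more work will arrive
}
```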


Are you worried about the foreach loop itself slowing down, or about the code following the loop running slowly?

If you are worried about the loop, then there is an easy solution, which you should follow in any case: use Parallel.ForEach and provide an instance of the ParallelOptions class to control the degree of parallelism.

var options = new ParallelOptions
{
    // Leave one core free for the main thread.
    MaxDegreeOfParallelism = Math.Max(1, Environment.ProcessorCount - 1)
};

Parallel.ForEach(compressedChunkData, options, chunkPair => {
    var compressedBytes = Convert.FromBase64String(chunkPair.Value);
    var chunk = Decompress(compressedBytes);
    TaskFinishActions.Enqueue(() => {
        chunk.PostSerialize();
        document.Chunks.Add(chunkPair.Key, chunk);
    });
});

If you are worried about slowing down the code after the loop, then look at this answer from Jon Skeet. In essence, you should use async and await, or start the Parallel.ForEach on a separate task.
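A sketch of that second option, reusing the question's Decompress, TaskFinishActions and document (so not self-contained here): the caller awaits the whole batch instead of blocking on it.

```csharp
// Sketch: wrap the blocking Parallel.ForEach in Task.Run and await it,
// so the calling (main) thread stays free while the batch runs.
async Task DecompressAllAsync(Dictionary<string, string> compressedChunkData)
{
    await Task.Run(() =>
        Parallel.ForEach(compressedChunkData, chunkPair =>
        {
            var compressedBytes = Convert.FromBase64String(chunkPair.Value);
            var chunk = Decompress(compressedBytes);
            TaskFinishActions.Enqueue(() =>
            {
                chunk.PostSerialize();
                document.Chunks.Add(chunkPair.Key, chunk);
            });
        }));
    // Runs only after all chunks are done, without having blocked the caller.
}
```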

EDIT:

Let's get this clear first: on an OS like Windows, there is no such thing as reserving a CPU for a thread or process; it works on time-sliced scheduling. So even if your decompression threads do not block your main thread, the main thread might still be slowed down by CPU-intensive activity in other processes. The way to convey your preference to the OS is by using priority and CPU affinity.

There are other ways which require more manual control and hence more work.

  1. Perhaps you should have a separate process for your decompression and use process priority and CPU affinity to tell the OS which cores it should work on.
  2. You can create a scheduler class which manages a RequestQueue (producer-consumer). Each decompress request should be handled on a single thread (which will be assigned to a single logical CPU). Make sure your scheduler uses no more than (TOTAL_CPUS - 1) threads, keeping one available for the main thread.
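For option 1, the priority/affinity part can be sketched like this (Windows-specific; the choice of which cores to mask off is an assumption for illustration):

```csharp
using System;
using System.Diagnostics;

// Sketch: lower this process's priority and, on Windows, restrict it
// to all cores except core 0, leaving that core freer for other work.
var proc = Process.GetCurrentProcess();
proc.PriorityClass = ProcessPriorityClass.BelowNormal;

// ProcessorAffinity is a bitmask of allowed cores; clear bit 0.
if (OperatingSystem.IsWindows())
    proc.ProcessorAffinity = (IntPtr)(((1L << Environment.ProcessorCount) - 1) & ~1L);
```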
  • Parallel.ForEach blocks. That is not acceptable. Even if you put it in Task.Run(() => Parallel.ForEach(...)) that only solves the problem if you have a fixed set of compressed data. What to do if new data to decompress arrives all the time - one by one - and you still have to create new Tasks for each of them? This approach simply doesn't work then. – JBeurer Jan 25 '16 at 12:37
  • I have to say that Task.Run(() => Parallel.ForEach(...)) does work much better if you have to go through the compressedChunkData only once, not every frame/iteration. – JBeurer Jan 25 '16 at 12:44
  • @JBeurer Did you try using async/await? I will update the answer with another way – Vikhram Jan 25 '16 at 13:43