I have entries like those in a phone book: name + address. The source is on a web site, and the count is over 1K records.

Question is:

How do I use/implement ConcurrentDictionary with Parallel.ForEach?

I might as well ask which will perform better:

ConcurrentDictionary & Parallel.ForEach

vs

Dictionary & foreach

The name is the key, so it is not allowed to have duplicates. If I understood correctly, ConcurrentDictionary has its own built-in method (TryAdd) that adds an entry only if the key does not already exist, so rejecting duplicate keys is already taken care of. From that point of view, the balance seems to tip towards ConcurrentDictionary rather than the standard sequential Dictionary.

So how do I take name & address pairs from any given data source and load them into a ConcurrentDictionary via Parallel.ForEach?

Dave New

2 Answers

"the count is over 1K records."

How much over 1K? Because 1K records would be added in the blink of an eye, without any need for parallelization.

Additionally, if you're fetching the data over the network, that cost will vastly dwarf the cost of adding to a dictionary. So unless you can parallelize fetching the data, there's going to be no point in making the code more complicated to add the data to the dictionary in parallel.
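As a rough sketch of the two options from the question, assuming the records have already been downloaded into memory as name/address pairs (the `PhoneBookLoader` class and its method names are made up for illustration):

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PhoneBookLoader
{
    // Option 1: Dictionary + foreach.
    // The first occurrence of a name wins, and the result is deterministic.
    public static Dictionary<string, string> LoadSequential(
        IEnumerable<KeyValuePair<string, string>> entries)
    {
        var book = new Dictionary<string, string>();
        foreach (var entry in entries)
        {
            if (!book.ContainsKey(entry.Key))
            {
                book.Add(entry.Key, entry.Value);
            }
        }
        return book;
    }

    // Option 2: ConcurrentDictionary + Parallel.ForEach.
    // TryAdd is atomic and returns false when the name is already present,
    // but which duplicate gets in first depends on thread scheduling.
    public static ConcurrentDictionary<string, string> LoadParallel(
        IEnumerable<KeyValuePair<string, string>> entries)
    {
        var book = new ConcurrentDictionary<string, string>();
        Parallel.ForEach(entries, entry =>
        {
            book.TryAdd(entry.Key, entry.Value);
        });
        return book;
    }
}
```

Both methods end up with one entry per distinct name. For a thousand or so records the sequential version is likely to be at least as fast, since the parallel loop adds scheduling overhead on top of very cheap inserts.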

Jon Skeet
  • @JonSkeet, hey Jon, I already fetch via an extended WebClient with some extra modifications for parallelism, invoked via parallel tasks to fetch multiple sources. Here, though, we're talking about only one source, so am I guessing right that there's actually nothing to parallelize on the *fetching* part? –  Nov 28 '12 at 07:46
  • @JbobJohan: Sounds like it, unless you start using Range headers to fetch parts of the data at a time (which will be insanely complicated). – Jon Skeet Nov 28 '12 at 09:16
  • I sometimes like insanely complicated, though. The keywords in your last sentence are already giving me ideas (I opened a new Google tab while writing this). For instance, it's used by browsers and file downloaders, so is it some kind of method for partitioning the data while keeping a map of the parts, like BitTorrent and similar programs do? So if the request is for a large amount of data, I could use it... going to read some more... –  Nov 28 '12 at 12:15

This is quite an old question, but this might help someone:

If you are trying to iterate over an existing ConcurrentDictionary and do some processing on each entry in parallel:

using System.Collections.Generic;
using System.Threading.Tasks;
using System.Collections.Concurrent;

namespace ConcurrenyTests
{
    public class ConcurrentExample
    {
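        public ConcurrentExample()
        {
            // Empty here; in real code this would already hold the loaded entries.
            ConcurrentDictionary<string, string> ConcurrentPairs = new ConcurrentDictionary<string, string>();

            // Enumerating a ConcurrentDictionary is thread-safe, so it can be
            // handed to Parallel.ForEach directly.
            Parallel.ForEach(ConcurrentPairs, (KeyValuePair<string, string> pair) =>
            {
                // Do stuff with the entry
                string key = pair.Key;
                string value = pair.Value;
            });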
        }
    }
}

I don't think you can use Parallel.ForEach to insert into a new dictionary unless you already have a collection to iterate over, with one item per entry to insert, e.g. a list of the URLs of text documents you want to download and insert into the dictionary. If that is the case, then you could use something along the lines of:

using System.Threading.Tasks;
using System.Collections.Concurrent;

namespace ConcurrenyTests
{
    public class ConcurrentExample
    {
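        public ConcurrentExample()
        {
            ConcurrentDictionary<string, string> ConcurrentPairs = new ConcurrentDictionary<string, string>();
            // Fill this with the URLs to download before starting the loop.
            ConcurrentBag<string> WebAddresses = new ConcurrentBag<string>();

            Parallel.ForEach(WebAddresses, new ParallelOptions { MaxDegreeOfParallelism = 4 }, (string webAddress) =>
            {
                // Fetch from webAddress. Placeholder value so the example compiles;
                // replace it with the downloaded text.
                string webText = string.Empty;

                // TryAdd: keeps the existing value if the key is already present
                ConcurrentPairs.TryAdd(webAddress, webText);

                // AddOrUpdate: overwrites the existing value if the key is already present
                // (use one of the two calls, not both)
                ConcurrentPairs.AddOrUpdate(webAddress, webText, (string key, string oldValue) => webText);
            });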
        }
    }
}
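To make the difference between the two calls concrete, here is a small sketch of what each one does when the key already exists. The `DuplicateKeyDemo` class, its method names and the example URL are made up for illustration:

```csharp
using System.Collections.Concurrent;

public static class DuplicateKeyDemo
{
    // TryAdd leaves an existing entry untouched and reports failure.
    public static string KeepFirst()
    {
        var pairs = new ConcurrentDictionary<string, string>();
        pairs.TryAdd("http://example.com", "first");
        bool added = pairs.TryAdd("http://example.com", "second"); // added == false
        return pairs["http://example.com"];
    }

    // AddOrUpdate adds the value when the key is missing and otherwise
    // replaces it with whatever the update delegate returns.
    public static string KeepLast()
    {
        var pairs = new ConcurrentDictionary<string, string>();
        pairs.AddOrUpdate("http://example.com", "first", (key, oldValue) => "first");
        pairs.AddOrUpdate("http://example.com", "second", (key, oldValue) => "second");
        return pairs["http://example.com"];
    }
}
```

So TryAdd fits the "no duplicate names" requirement from the question, while AddOrUpdate is the one to reach for if a later download should replace an earlier one.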

If you are fetching from a web server, you may want to increase or decrease MaxDegreeOfParallelism so that your bandwidth is not choked.
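If you want to see the limit in action without touching the network, something like the following sketch counts how many loop bodies run at the same time. `ThrottleDemo` and `PeakConcurrency` are made-up names, and the `Thread.Sleep` call is only a stand-in for a slow download:

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Returns the highest number of loop bodies observed running at once.
    public static int PeakConcurrency(int itemCount, int maxDegreeOfParallelism)
    {
        int running = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

        Parallel.ForEach(Enumerable.Range(0, itemCount), options, _ =>
        {
            int now = Interlocked.Increment(ref running);

            // Record a new peak with a compare-and-swap retry loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)))
            {
                Interlocked.CompareExchange(ref peak, now, seen);
            }

            Thread.Sleep(20); // stand-in for a slow download
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

Calling `ThrottleDemo.PeakConcurrency(16, 4)` should never return more than 4, whatever the number of cores. The actual peak can be lower, since the setting is an upper bound rather than a guarantee.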

Parallel.ForEach: https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.foreach?view=netcore-2.2

ParallelOptions: https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.paralleloptions?view=netcore-2.2

SkywalkerIsNull