I have a `HashMap` that I'd like to add elements to as fast as possible. I tried using `par_extend`, but it ended up being slower than the serial version. My guess is that it evaluates the iterator in parallel but extends the collection serially. Here's my code:

use std::collections::HashMap;
use rayon::prelude::*;
use time::Instant;

fn main() {
    let n = 1e7 as i64;

    // serial version
    let mut t = Instant::now();
    let mut m = HashMap::new();
    m.extend((1..n).map(|i| (i, i)));
    println!("Time in serial version: {}", t.elapsed().as_seconds_f64());

    // parallel version - slower
    t = Instant::now();
    let mut m2 = HashMap::new();
    m2.par_extend((1..n).into_par_iter().map(|i| (i, i)));
    println!("Time in parallel version: {}", t.elapsed().as_seconds_f64());
}

Is there a faster way to extend a `HashMap` that actually adds the elements in parallel? Or is there a similar data structure that can be extended in parallel? I know this would run faster with something like an `FnvHashMap`, but it seems like it should also be possible to speed this up with parallelism. (And yes, I'm compiling with `--release`.)

Daniel Giger
  • 2
    `par_extend` doesn’t extend the `HashMap` in parallel: that’s impossible, as mutating a `HashMap` requires exclusive access. Instead, each worker thread builds up the entries it wishes to insert and once they have all finished the main thread extends the map with the result. To parallelise insertion into the final structure, consider partitioning the map (so that you essentially have a collection of maps)… ideally on the same basis as the parallelism so that each partition is exclusively managed on a dedicated thread. But that of course comes with its own overhead and cognitive costs. – eggyal Jun 30 '21 at 03:15
  • 1
    You may want to look at implementations of concurrent maps like `dashmap`. – user2722968 Jun 30 '21 at 15:45
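
The partitioning idea from the first comment could look roughly like the sketch below. It uses only the standard library (`std::thread::scope`, Rust 1.63+) rather than rayon, and the names `shard_of`, `build_sharded` and `get` are hypothetical helpers made up for illustration. It assumes `shards > 0` and that keys are split by `key % shards`, so each thread owns one map exclusively and no locking is needed. Whether this beats the serial `extend` depends on the workload and hasher; it has not been benchmarked here.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical helper: index of the shard that owns `key` (assumes shards > 0).
fn shard_of(key: i64, shards: usize) -> usize {
    key.rem_euclid(shards as i64) as usize
}

/// Build `shards` independent maps covering the keys 1..n, one thread per shard.
fn build_sharded(n: i64, shards: usize) -> Vec<HashMap<i64, i64>> {
    thread::scope(|s| {
        // Spawn every worker first so they all run concurrently.
        let handles: Vec<_> = (0..shards)
            .map(|id| {
                s.spawn(move || {
                    // First positive key with key % shards == id.
                    let start = if id == 0 { shards as i64 } else { id as i64 };
                    let mut m = HashMap::new();
                    // This thread inserts only the keys its shard owns.
                    m.extend((start..n).step_by(shards).map(|k| (k, k)));
                    m
                })
            })
            .collect();
        // Collect the finished maps in shard order.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

/// Look a key up by going to the shard that owns it.
fn get(maps: &[HashMap<i64, i64>], key: i64) -> Option<&i64> {
    maps[shard_of(key, maps.len())].get(&key)
}

fn main() {
    let maps = build_sharded(1_000, 4);
    let total: usize = maps.iter().map(|m| m.len()).sum();
    println!("entries across all shards: {}", total);
    println!("lookup of 42: {:?}", get(&maps, 42));
}
```

The result is a collection of maps rather than a single `HashMap`, so every lookup has to go through something like `get`. Merging the shards back into one map would reintroduce the serial step that `par_extend` already performs.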

0 Answers