
I need to create a known number of dynamically growing vectors of integers. The number of vectors is known (N = 10^7), as is the maximum size of any of those vectors (M = 2*10^5).

My naive idea was:

let list: Vec<Vec<i32>> = vec![Vec::new(); N];

This works but takes ~3s on an i7 2.8 GHz. I've looked into arrays, which generally offer about twice the performance of vectors:

// Does not compile!
let list: [Vec<i32>; N] = [Vec::new(); N];

This does not compile because `Vec` does not implement `Copy`. I also wasn't able to wrap `Vec` in a struct and implement `Copy` on it.

I could use an array of arrays ([[i32; M]; N]), but that would allocate N * M * 4 bytes = 8 TB up front, which is far too much memory.
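A quick sanity check on that memory figure, using only N and M from the question (simple arithmetic, nothing assumed beyond 4 bytes per i32):

```rust
fn main() {
    const N: u64 = 10_000_000; // 10^7 outer arrays
    const M: u64 = 200_000;    // 2 * 10^5 elements each

    // A fully materialized [[i32; M]; N] would need N * M * 4 bytes.
    let bytes = N * M * 4;
    println!("{} bytes = {} TB", bytes, bytes as f64 / 1e12);
}
```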

How can I create a list of dynamically sized arrays that performs well?

For comparison, the equivalent allocation in C++ runs in under a second:

std::vector<std::vector<int32_t>> list(N, std::vector<int32_t>());

I was thinking of using linked lists instead of vectors, but I suspect there is a more elegant solution to this problem.

Update: as @Shepmaster correctly pointed out, the comparison should be made against code compiled with --release. With cargo build --release, the Rust version gets a ~45x speedup and ends up twice as fast as clang with -O3.
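For reference, the allocation can be timed with a minimal self-contained sketch using `std::time::Instant` (N is the value from the question; absolute timings will of course vary by machine and build profile):

```rust
use std::time::Instant;

fn main() {
    const N: usize = 10_000_000; // 10^7 outer vectors, as in the question

    let start = Instant::now();
    // N empty inner vectors; each inner Vec only allocates on its first push.
    let list: Vec<Vec<i32>> = vec![Vec::new(); N];
    let elapsed = start.elapsed();

    println!("allocated {} vectors in {:?}", list.len(), elapsed);
}
```

Run it once with a plain `cargo build` and once with `cargo build --release` to reproduce the gap described above.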

itarato
  • @Stargateur - Unfortunately that would exceed the memory limit. (That's why I was also considering linked lists.) – itarato Jul 08 '17 at 16:53
  • Same: `fatal runtime error: out of memory` – itarato Jul 08 '17 at 16:58
  • *performs under a second* — how much under a second? Instantly? I wonder if C++ is deferring some work until later when the vector is actually used, while Rust is being more proactive. If so, then the time difference might be moot as the C++ would take longer later. – Shepmaster Jul 08 '17 at 16:58
  • @Shepmaster - yes, I was aware of that effect, so in my example I filled the whole array and ran my algorithm; the timing quoted was just the allocation portion. The whole algorithm also finished in under 3s in C++. – itarato Jul 08 '17 at 17:00
  • @Stargateur note that a `Vec` with 2*10^12 entries is 8 TB (7.2 TiB). Most people do not have a chunk of RAM that big. – Shepmaster Jul 08 '17 at 17:02
  • The dumb question that has to be asked because you didn't state it explicitly: are you compiling and running the Rust code with `--release`? – Shepmaster Jul 08 '17 at 17:05
  • @Shepmaster - no, this needs to run on Hackerrank so I tried to keep everything "basic". – itarato Jul 08 '17 at 17:21
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148691/discussion-between-itarato-and-shepmaster). – itarato Jul 08 '17 at 17:24
  • Checking the assembly, there is no reason for Rust to be slower than C++. Actually, if anything, it seems to me Rust should be faster (at least with Clang, C++ seems to be calling `new` for each inner vector). In release mode of course, because there's little point using C++ or Rust in Debug mode. – Matthieu M. Jul 08 '17 at 17:33
  • I assume your matrix will be extremely sparse, since the full allocation of 10^7 * 2*10^5 * 4 bytes is probably not within your budget constraint. If adjacent element placement is not required, you may get away with simply using HashMaps of HashMaps, with the array positions as keys. You'll pay more per element in memory and get slower lookups for individual elements, but if your matrix is truly sparse it may not matter. – user2722968 Jul 08 '17 at 19:44
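The sparse-map idea from the last comment could be sketched like this. This is a hypothetical illustration, not code from the question; the key types and the sample indices are made up for the example:

```rust
use std::collections::HashMap;

fn main() {
    // Outer key = row index, inner key = column index.
    // Only cells that are actually written consume memory.
    let mut sparse: HashMap<usize, HashMap<usize, i32>> = HashMap::new();

    // Write a few scattered cells; rows are created on demand.
    sparse.entry(3).or_insert_with(HashMap::new).insert(199_999, 42);
    sparse.entry(9_999_999).or_insert_with(HashMap::new).insert(0, 7);

    // Look up a cell, falling back to a default for absent entries.
    let value = sparse
        .get(&3)
        .and_then(|row| row.get(&199_999))
        .copied()
        .unwrap_or(0);
    println!("cell (3, 199999) = {}", value); // prints 42
}
```

The trade-off is exactly the one the commenter describes: each stored cell carries HashMap overhead and lookups are hashed rather than indexed, which only pays off when the vast majority of the N x M cells stay empty.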

0 Answers