
I am trying to understand whether data of SharedArray type is being moved across processes and therefore causing overhead.

After defining the variables in my Main module (process 1), I called pmap on Array (im1) and SharedArray (im1_shared) versions of my data.

pmap(x -> someFunction(im1, im2, x), iterations)
pmap(x -> someFunction(im1_shared, im2_shared, x), iterations)

So im1, im2 and im1_shared, im2_shared act as fixed arguments captured by the closure, while the iterator x selects the slices that the workers process.

Using

@fetchfrom 2 varinfo()

I get:

im1         122.070 MiB  4000×4000 Array{Float64,2}
im2         122.070 MiB  4000×4000 Array{Float64,2}
im1_shared  122.071 MiB  4000×4000 SharedArray{Float64,2}
im2_shared  122.071 MiB  4000×4000 SharedArray{Float64,2}

So my thoughts and confusions on this:

  1. During the pmap function call on the Array types, im1 and im2 were copied to all 3 additional workers. In the end, I have 4 copies of im1 and im2 in my memory.
  2. The workers (here worker 2) also list the SharedArrays, so I think one of the following must be at least partially correct:

    2.1. varinfo() lists all variables in the local workspace of the workers, but the SharedArrays are not stored in the workers' local memory; they are only listed because the workers have access to them.

    In this case, if a SharedArray is stored only once and all workers have access to it, why isn't this type the default in Julia - to minimize overhead in the first place?

    2.2. The SharedArrays were also copied to each worker, hence the 122 MiB reported for each of them. In that case the only advantage of a SharedArray over an Array would be that every worker can access it; the data has to be copied to each worker either way.

    In this case, the only way to avoid the overhead would be something like a distributed array, letting each worker operate only on the chunks it already holds in its own memory, right? (A small sketch of that idea follows this list.)
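
A minimal sketch of that idea, using the external DistributedArrays.jl package (hypothetical code, assuming distribute and localpart behave as documented):

using Distributed
addprocs(3)
@everywhere using DistributedArrays # external package: ] add DistributedArrays

A = rand(1.0:2.0, 1000, 1000)
dA = distribute(A)                  # split A into chunks, one per worker
for p in workers()
    # each worker reports only the size of the chunk it stores locally
    println(p, " => ", remotecall_fetch(() -> size(localpart(dA)), p))
end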

Could you please help me sort out these two scenarios (2.1 and 2.2)?
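
One way to decide between 2.1 and 2.2 empirically is a rough check like the following sketch (assuming all workers run on the same host): write into the SharedArray from a worker and read the value back on process 1. If each worker held its own copy, the write would not be visible here:

using Distributed
addprocs(3)
@everywhere using SharedArrays

s = SharedArray{Float64}(4, 4)                                # one memory-mapped segment, zero-initialized
remotecall_fetch(a -> (a[1, 1] = 42.0), first(workers()), s)  # write on a worker
s[1, 1]                                                       # 42.0 on process 1, so the memory is shared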

UPDATE 1: Here is a working example:

@everywhere using InteractiveUtils # to call varinfo() on all workers

### FUNCTIONS
@everywhere function foo(x::Array{Float64, 2}, y::Array{Float64, 2}, t::Int64)
    # just take a slice of both arrays at the given step and sum the values
    x_slice = x[t:t+5, t:t+5]
    y_slice = y[t:t+5, t:t+5]
    return x_slice + y_slice   
end

@everywhere function fooShared(x::SharedArray{Float64, 2}, y::SharedArray{Float64, 2}, t::Int64)
    # just take a slice of both arrays at the given step and sum the values
    x_slice = x[t:t+5, t:t+5]
    y_slice = y[t:t+5, t:t+5]
    return x_slice + y_slice   
end

### DATA
n = 1000
#the two Arrays
im1 = rand(1.0:2.0, n, n)
im2 = copy(im1);

#The two shared arrays
im1_shared = SharedArray(im1)
im2_shared = SharedArray(im2);

@fetchfrom 2 varinfo() # im1_shared and im2_shared are not yet listed, of course not...

pmap(x -> foo(im1, im2, x), [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
pmap(x -> fooShared(im1_shared, im2_shared, x), [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

@fetchfrom 2 varinfo() # im1_shared and im2_shared are now listed
duls91

1 Answer


SharedArray is shared among many Julia processes via memory mapping (https://docs.julialang.org/en/v1/stdlib/Mmap/index.html). Data can be initialized in the following way:

using Distributed
Distributed.addprocs(2);
@everywhere using SharedArrays
@everywhere function ff(ss::SharedArray)
   println(myid()," ",localindices(ss))
   for ind in localindices(ss)
       ss[ind] = rand(1.0:2.0)    
   end   
end

And now let us perform the actual initialization:

julia> s = SharedArray{Float64}((1000,1000),init=ff)
      From worker 2:    2 1:500000
      From worker 3:    3 500001:1000000
1000×1000 SharedArray{Float64,2}:
 2.0  1.0  1.0  1.0  1.0  2.0  …  2.0  1.0  2.0  2.0  1.0
 2.0  2.0  2.0  2.0  2.0  2.0     2.0  1.0  2.0  1.0  2.0
 2.0  1.0  1.0  2.0  1.0  2.0     1.0  1.0  1.0  1.0  2.0
 ⋮                        ⋮    ⋱  ⋮
 1.0  1.0  1.0  1.0  1.0  2.0     2.0  2.0  1.0  1.0  1.0
 1.0  2.0  1.0  2.0  2.0  1.0     2.0  2.0  1.0  1.0  1.0
 2.0  2.0  1.0  2.0  1.0  2.0     2.0  1.0  1.0  2.0  2.0

You can see that each worker initialized a separate part of the array that it works on.
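
A few accessors from the SharedArrays standard library make this mapping visible (a quick sketch; the exact values depend on how many workers were added). Because the segment is memory-mapped, a write performed by any of these processes is immediately visible to the others:

procs(s)          # the processes that have the shared segment mapped
localindices(s)   # index range assigned to the calling process (may be empty on the master)
sdata(s)          # the underlying Array backed by the memory-mapped segment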

Przemyslaw Szufel
  • Thanks for the reply. I want to take smaller regions of 2 images and send them to a worker, where a cross-correlation between the small regions will be performed. So from your explanation, I have to keep in mind that one of the regions is already "stored" on the worker, while the other one is not, so it will be sent to the worker (cross-communication) and then the correlation is performed. I would need to keep track of which parts of the images im1 and im2 are on which worker and let the cross-correlation be performed there. Did I get this straight? – duls91 Dec 14 '18 at 23:54
  • Also, what is the distinction from a distributed array? I thought that one is stored in the way you described for shared arrays, i.e. every process has a chunk of the data? – duls91 Dec 14 '18 at 23:56
  • You are right, I was thinking about `DistributedArray`s and wrote about `SharedArray`s. A SharedArray accesses the same memory from multiple processes, so there is no cross-communication as long as you are on a single host. – Przemyslaw Szufel Dec 15 '18 at 00:52
  • Just to resolve my confusion: a SharedArray is stored only once and all processes have access to it. So if we want to use parallel processing on one machine, we store everything in SharedArray form and there is no overhead, and no need to worry about cross-communication anymore? Can everything be stored in shared memory, i.e. if I have 8 GB of RAM, do I also have 8 GB of capacity for shared objects like these SharedArrays? – duls91 Dec 15 '18 at 00:59
  • Yes - exactly as you wrote - the same memory will be accessible by many processes and only a single memory allocation will be done. I have just checked that `varinfo()` is misleading here (it reports the full size of the referenced memory on each process). Maybe you could edit your question to ask only about this and then I edit my answer? – Przemyslaw Szufel Dec 15 '18 at 01:47
  • I think your answer would still fit. So you are pointing to my 2.1. Maybe you can summarize the shared arrays and also add the comment on varinfo(). I am just wondering why this is not the default - always store in shared memory so all processes are fine and there are no overhead problems. But anyway, I can ignore varinfo() and continue with more certainty that the data is not stored multiple times across workers. – duls91 Dec 15 '18 at 02:01