
Background

I am trying to model a system that can change its configuration at each time step. The set of possible configurations is known in advance and does not depend on the time step. Transitions are allowed between certain configurations and forbidden between others. The objective is to build an adjacency (connectivity) matrix of allowed transitions that spans multiple time steps.

Setting

Let A be an s×s×k logical array representing allowed transitions, and let A1...Ak denote the pages/slices of A:

A1 = A(:,:,1); A2 = A(:,:,2); ... Ak = A(:,:,k);

The 3rd dimension encodes how many time steps a transition requires. For example, if A(1,3,2) is nonzero, state #1 can transition to state #3, and this transition takes 2 time steps.

Let B be the adjacency matrix we want to build, representing nt time steps. Schematically, B has the following shape (in block matrix notation):

     _                                        _
    | [0] [A1] [A2] ... [Ak] [0]  ...    [0]  |
B = | [0] [0]  [A1] [A2] ... [Ak] ...    [0]  |
    |  ⋮    ⋮    ⋱    ⋱     ⋱        ⋱    ⋮   |
    |_[0] [0]  ...  ...  ...  ...  ...  [0]  _|  ← "[A1] [A2] ... [Ak]" pushed outside B

where the main block diagonal consists of nt 0-blocks, and the slices of A are gradually "pushed" to the right until "time runs out": in the bottom rows the slices of A fall "outside" of B, indicating that no more transitions are possible. Since B consists of nt×nt blocks of size s×s, its overall size is (nt*s)×(nt*s).

Question: Given A and nt, how can we construct B in the most CPU- and memory-efficient way?

Notes

  • Since B is mostly filled with zeros, it probably makes sense for it to be sparse.
  • CPU efficiency (runtime) is more important in my application than memory efficiency.
  • In the real problem, s=250 and nt=6000.
  • External scripts/classes/tools are welcome.
  • An idea I had was not constructing the matrix staggered initially, but instead having a main diagonal of [A1] blocks and circshift-ing and masking when everything else is done.
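The staggered structure hinted at in the last note can also be written as a sum of block superdiagonals. Here is a sketch (untested for performance, and not the circshift variant itself) that places each slice A(:,:,j) on the j-th block superdiagonal using kron with a sparse shift matrix:

```matlab
% Sketch: build B as a sum of shifted sparse block diagonals.
% Uses the same small A as the demonstration below.
s = 3; k = 4; nt = 8;
A = logical(cat(3, triu(ones(s)), eye(s), zeros(s), [0 0 0; 0 0 0; 0 1 0]));
B = sparse(nt*s, nt*s);
for j = 1:k
    % nt-by-nt matrix with ones on the j-th superdiagonal:
    D = spdiags(ones(nt,1), j, nt, nt);
    % kron places A(:,:,j) at every block position (i, i+j):
    B = B + kron(D, sparse(double(A(:,:,j))));
end
spy(B)
```

Slices whose block column index exceeds nt are dropped automatically, since the corresponding superdiagonal entries of D fall outside the nt×nt grid.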

Demonstration + Naïve implementation

s = 3; k = 4; nt = 8;
A = logical(cat(3, triu(ones(s)), eye(s), zeros(s), [0 0 0; 0 0 0; 0 1 0]));
% Unwrap A (reshape into 2D):
Auw = reshape(A, s, []);
% Preallocate a somewhat larger B:
B = false(nt*s, (nt+k)*s);
% Assign Auw into B in a staggered fashion:
for it = 1:nt
  B( (it-1)*s+1:it*s, it*s+1:(it+k)*s ) = Auw;
end
% Truncate the extra elements of B (from the right)
B = B(1:nt*s, 1:nt*s);
spy(B);

Resulting in:

[spy plot of B showing the staggered band of nonzero blocks]

Dev-iL
    I have a vectorized solution that takes 10x more time :-D – Luis Mendo Jul 30 '20 at 12:26
  • It is often more efficient for graph algorithms with sparse graphs like this to use a representation where each node has an array of neighboring node IDs (indices into the array of nodes). It becomes much faster to visit all neighbors of a node. – Cris Luengo Jul 30 '20 at 14:02
  • @Cris We are planning to use `digraph` after the transition matrix is done. That function accepts both input types (i.e., either a list of transitions or an adjacency matrix), and it is quite likely that they are converted to the same representation internally, early on. I personally think that building a matrix is easier, though the conversion between these systems is quite straightforward. I am sure future readers could benefit from a solution like you suggest, should you decide to post one :) – Dev-iL Jul 30 '20 at 14:28

1 Answer


One solution could be to calculate all the indices using implicit expansion:

% Dev-iL minimal example
s = 3; k = 4; nt = 8;
A = logical(cat(3, triu(ones(s)), eye(s), zeros(s), [0 0 0; 0 0 0; 0 1 0]));
Auw = reshape(A, s, []);

% Compute the indices:
[x,y] = find(Auw);
x = reshape(x+[0:s:s*(nt-1)],[],1);
y = reshape(y+[s:s:s*nt],[],1);

% Mask out the nonzero elements that fall outside B:
ind = x<=s*nt & y<=s*nt;

% Sparse matrix creation:
S = sparse(x(ind),y(ind),1,s*nt,s*nt);

% Plot the results:
spy(S)

Here we compute only the positions of the nonzero values, avoiding the preallocation of a big dense matrix, which would slow down the computation.
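As a sanity check (not part of the original answer), the sparse result can be compared against the naive loop construction from the question; both should produce the same pattern:

```matlab
% Sketch: verify both constructions agree on the small example.
s = 3; k = 4; nt = 8;
A = logical(cat(3, triu(ones(s)), eye(s), zeros(s), [0 0 0; 0 0 0; 0 1 0]));
Auw = reshape(A, s, []);

% Implicit-expansion version:
[x, y] = find(Auw);
x = reshape(x + (0:s:s*(nt-1)), [], 1);
y = reshape(y + (s:s:s*nt), [], 1);
ind = x <= s*nt & y <= s*nt;
S = sparse(x(ind), y(ind), 1, s*nt, s*nt);

% Loop version:
B = false(nt*s, (nt+k)*s);
for it = 1:nt
    B((it-1)*s+1:it*s, it*s+1:(it+k)*s) = Auw;
end
B = B(:, 1:nt*s);

isequal(logical(full(S)), B)   % should return true
```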

Benchmark:

I used MATLAB Online to run the benchmark, so the available memory was limited. If someone can rerun the benchmark on a local computer with bigger values, feel free to do so.

[benchmark plot: runtime of both methods vs. s and nt]

With those configurations, it seems that using implicit expansion is indeed faster.

Benchmark code:

for ii = 1:100
    s   = ii; k = 4; nt = ii;
    Auw = rand(s,s*k)>0.75;

    f_expa = @() func_expansion(s,nt,Auw);
    f_loop = @() func_loop(s,k,nt,Auw);

    t_expa(ii) = timeit(f_expa);
    t_loop(ii) = timeit(f_loop);
end

plot(1:100,t_expa,1:100,t_loop)
legend('Implicit expansion','For loop')
ylabel('Runtime (s)')
xlabel('s and nt value')

% obchardon suggestion
function S = func_expansion(s,nt,Auw)
    [x,y] = find(Auw);
    x = reshape(x+[0:s:s*(nt-1)],[],1);
    y = reshape(y+[s:s:s*nt],[],1);
    ind = x<=s*nt & y<=s*nt;
    S = sparse(x(ind),y(ind),1,s*nt,s*nt);
end

% Dev-il suggestion
function B = func_loop(s,k,nt,Auw)
    B = false(nt*s, (nt+k)*s);
    for it = 1:nt
        B( (it-1)*s+1:it*s, it*s+1:(it+k)*s ) = Auw;
    end
    B = B(1:nt*s, 1:nt*s);
end
obchardon
    Thank you, I've edited my answer, I've also added a benchmark. – obchardon Aug 04 '20 at 08:52
    You're welcome :) There's a bit of an unfair advantage to the `for` approach, as it works with `logical` values (`B` is preallocated as `false(...)`), whereas your approach assigns `1` (`double`) instead of `true` (`logical`). This only means that the potential performance of your method is even better... – Dev-iL Aug 04 '20 at 08:55