I have a question about the time it takes to access or reassign values stored in a MATLAB struct versus in plain MATLAB variables (arrays of any kind):

Imagine that you have one function that creates ten variables (arrays of different dimensions and sizes). This function is called from within another function that needs those variables.

Now, because returning ten variables from a function looks messy, I thought about storing them in a struct instead and changing my initial function so that it outputs a single struct (with ten fields) instead of ten variables.

Because timing is crucial for me (it's code for an EEG experiment), I wanted to make sure that the struct approach is not slower, so I wrote the following test function.

function test_timingStructs

%% define struct
strct.a=1; strct.b=2; strct.c=3;

%% define "loose" variables
a = 1; b = 2; c = 3;

%% How many runs?
runs = 1000;

%% time access to struct
x = []; % empty variable
tic
for i=1:runs
    x = strct.a; x = strct.b; x = strct.c;
end
t_struct = toc;

%% time access to "loose variables"
x = []; % empty variable
tic
for i=1:runs
    x = a; x = b; x = c;
end
t_loose = toc;

%% Plot results
close all; figure;
bar([t_struct, t_loose] * 1000); % toc returns seconds; convert to ms to match the y-label
set(gca,'xticklabel', {'struct', 'loose variable'})
xlabel('variable type accessed', 'fontsize', 12)
ylabel(sprintf('time (ms) needed for %d accesses to 3 different variables', runs), 'fontsize', 12)
title('Access timing: struct vs "loose" variables', 'fontsize', 15)

end

According to the results, accessing a structure to get the values of a field is considerably slower than just accessing a variable. Can I make this assumption based on the test that I have done?

Is there another way to neatly "package" the ten variables without losing time when I want to access them?

Results

Suever
S.A.
  • While the `struct` reference may be slower relative to variables due to the extra `subsref` needed, the maximum absolute time of your test is around 1 microsecond per iteration (on my six-year-old machine with an R2016a install). Without knowing more about your needs, I would guess it doesn't matter, but without a full test case, uncertainty is non-zero. – TroyHaskin Dec 03 '16 at 15:02
  • A single iteration of `tic` and `toc` is not a good benchmarking method. Either use multiple runs and average the result or use [`timeit`](https://www.mathworks.com/help/matlab/ref/timeit.html). This is also heavily dependent on MATLAB version and hardware. As an aside, if microsecond timing really isn't sufficient then I would recommend a different language. – sco1 Dec 03 '16 at 15:06

1 Answer

In theory, yes, the access to data within a struct is going to be slower than access to data stored in a variable. This is just the overhead that the higher level datatype incurs.

BUT

In your tests, you are only measuring the access time of the data within the two data structures. When you are using variables, simply assigning one variable to another takes very little time because MATLAB uses copy-on-write and does not actually make a copy of the data in memory until it is modified.
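
To make the copy-on-write point concrete, here is a minimal sketch (not part of the original test; the array size, run count, and variable names are arbitrary choices for illustration). It times a plain assignment separately from the first write to the new variable, which is when the copy should actually happen. Exact numbers will depend on MATLAB version and hardware.

```matlab
% Sketch of copy-on-write: assignment shares data, the first write copies it.
% Array size and run count are arbitrary choices for illustration.
A = rand(2000);                 % 2000-by-2000 doubles, about 32 MB
nRuns = 20;
tAssign = zeros(1, nRuns);
tWrite  = zeros(1, nRuns);

for k = 1:nRuns
    tic; B = A;    tAssign(k) = toc;   % no data copied yet, B shares A's memory
    tic; B(1) = 0; tWrite(k)  = toc;   % first write to B forces the real copy
end

fprintf('assignment:  %.3g ms (median)\n', 1000 * median(tAssign));
fprintf('first write: %.3g ms (median)\n', 1000 * median(tWrite));
```

I would expect the second number to be clearly larger than the first on most setups, and `A` itself stays unchanged after the write to `B`.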

As a result, the test that you have written isn't very useful for determining the actual cost of using a struct because I'm sure your function does something with the data that it receives. As soon as you modify the data, MATLAB will make a copy of the data and perform the requested operation. So to determine what the performance penalty of a struct is, you should time your actual function rather than the no-op function that you're using.

A slightly more realistic test

I have written a test below which compares the struct and variable access where the called function does and does not modify the data.

function timeaccess

    sz = round(linspace(1, 200, 100));

    [times1, times2, times3, times4] = deal(zeros(size(sz)));

    for k = 1:numel(sz)

        n = sz(k);

        S = struct('a', rand(n), 'b', rand(n), 'c', rand(n));
        times1(k) = timeit(@()access_struct(S));
        S = struct('a', rand(n), 'b', rand(n), 'c', rand(n));
        times2(k) = timeit(@()access_struct2(S));
        a = rand(n); b = rand(n); c = rand(n);
        times3(k) = timeit(@()access_vars(a, b, c));
        a = rand(n); b = rand(n); c = rand(n);
        times4(k) = timeit(@()access_vars2(a, b, c));
    end

    figure

    hax1 = subplot(1,2,1);
    ylabel('Execution Time (ms)')
    xlabel('Size of Variables');

    hold on

    plot(sz, times2 * 1000, 'DisplayName', 'Struct w/o modification')
    plot(sz, times4 * 1000, 'DisplayName', 'Variables w/o modification')

    legend(findall(hax1, 'type', 'line'))

    hax2 = subplot(1,2,2);
    ylabel('Execution Time (ms)')
    xlabel('Size of Variables');
    hold on

    plot(sz, times1 * 1000, 'DisplayName', 'Struct w modification')
    plot(sz, times3 * 1000, 'DisplayName', 'Variables w modification')

    legend(findall(hax2, 'type', 'line'))

    saveas(gcf, 'data_manipulation.png')
    legend()
end

function [a, b, c] = access_struct(S)
    a = S.a + 1;
    b = S.b + 2;
    c = S.c + 3;
end

function [a, b, c] = access_struct2(S)
    a = S.a;
    b = S.b;
    c = S.c;
end

function [d, e, f] = access_vars(a, b, c)
    d = a + 1;
    e = b + 2;  % same constants as access_struct so both variants do identical work
    f = c + 3;
end

function [d, e, f] = access_vars2(a, b, c)
    d = a;
    e = b;
    f = c;
end

The results

As you can see, the struct is slower for just assigning a variable to another variable, but as soon as I perform an operation (here I have the very simple operation of adding a constant to each variable), the effect of the access time is negligible.

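[Figure: execution time (ms) versus size of variables, struct versus plain variables; left panel without modification, right panel with modification]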

Summary

Based on the test above, I would assume that the time difference between the two is going to be negligible for your use case. Even if the struct is a little slower, it may be a cleaner design and result in more readable / maintainable code and may be worth the difference in performance.
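
As a rough sketch of what the struct-based design could look like: the function handle `makeParams` and the field names `onsets`, `durations`, and `weights` are hypothetical and not taken from your code. If the field lookups ever did show up in a profile, one option is to unpack the fields into local variables once, before the time-critical loop. Given copy-on-write, that unpacking should not duplicate the data.

```matlab
% Hypothetical setup step that returns one struct instead of many outputs.
% Field names and sizes are made up for illustration.
makeParams = @(n) struct( ...
    'onsets',    cumsum(rand(1, n)), ...
    'durations', 0.5 * ones(1, n), ...
    'weights',   rand(n));

P = makeParams(5);

% Unpack once, outside the time-critical part, so the loop body only
% touches plain variables.
onsets    = P.onsets;
durations = P.durations;

offsets = zeros(size(onsets));
for k = 1:numel(onsets)
    offsets(k) = onsets(k) + durations(k);
end
```

The caller still receives a single, self-describing output, and the loop pays no per-iteration field lookup.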

If you're very concerned about performance, it may be worth looking into a C/C++ MEX function to do some of the heavy lifting for you, or switching to a more performant language than MATLAB.

Suever