Matlab: associating an ID with a dataset (e.g. struct)?

Question

I am developing a feature for a high-order finite element simulation algorithm in Matlab, and I am looking for a good way to implement a particular task. I believe this is a fairly common problem, but after some digging I have not found a good solution.

Basically, I have a long list of IDs (corresponding to certain nodes on my mesh), where each ID is associated with a small data set. While my solver is running, I need to access the data associated with these nodes and update it multiple times.

So, for example, let's say that this is my list of these specific nodes:

nodelist = [3 27 38] %(so these are my node ID's)

Then for each node I have the following associated data set:

a (scalar)
b (5x5 double matrix)
c (10x1 double vector)

(a total of 36 double values associated with each node ID)

In reality, I will of course have a much longer list of node IDs and a somewhat larger data set associated with each node, but still only double scalars, matrices and vectors (no characters, strings, etc.).

Approach 1

So one approach I cooked up is just to store everything in a 2D double matrix, and then do some relatively complex indexing to access my data when needed. For the example above, the size of my 2D matrix would be

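size(mat2D) = [length(nodelist), 36]

Say I wanted to access b(3,3) for node ID 27. I would then access mat2D(2,14): row 2 because 27 is the second entry of nodelist, and column 14 because column 1 holds a and b(3,3) is the 13th entry of b, so 1 + (3-1)*5 + 3 = 14.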

In principle, this works, but the code is just not very clean and readable because of this complex indexing (not to mention, when I change something in the way the data set is set up, I need to re-adjust the whole indexing code).
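The indexing in Approach 1 becomes more manageable if the layout is written down once as named offsets and every access goes through small helpers. The following is only a sketch in C++ (the eventual target language), assuming each row is packed as a, then b column by column, then c. All names here (`kOffsetB`, `colOfB`, `FlatNodeTable`, and so on) are hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout of one row: [ a | b (5x5, column by column) | c (10) ].
constexpr std::size_t kRowsB   = 5;
constexpr std::size_t kColsB   = 5;
constexpr std::size_t kLenC    = 10;
constexpr std::size_t kOffsetA = 0;
constexpr std::size_t kOffsetB = kOffsetA + 1;
constexpr std::size_t kOffsetC = kOffsetB + kRowsB * kColsB;
constexpr std::size_t kRowLen  = kOffsetC + kLenC;   // 36 doubles per node

// Zero-based column of b(i, j) inside a row; i and j are zero-based as well.
inline std::size_t colOfB(std::size_t i, std::size_t j) {
    assert(i < kRowsB && j < kColsB);
    return kOffsetB + j * kRowsB + i;
}

// Zero-based column of c(k) inside a row.
inline std::size_t colOfC(std::size_t k) {
    assert(k < kLenC);
    return kOffsetC + k;
}

// Flat storage for all nodes: row r occupies [r * kRowLen, (r + 1) * kRowLen).
struct FlatNodeTable {
    std::vector<double> values;

    explicit FlatNodeTable(std::size_t nodeCount)
        : values(nodeCount * kRowLen, 0.0) {}

    double& at(std::size_t row, std::size_t col) {
        return values[row * kRowLen + col];
    }
};
```

With this layout, `colOfB(2, 2)` evaluates to 13, which is the zero-based counterpart of column 14 in the one-based Matlab example. If the data set changes, only the constants at the top need to be adjusted.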

Approach 2

Another approach would be to use some sort of struct for each node in the node list:

a = 4.4;
b = rand(5,5);
c = rand(10,1);
s = struct('a',a,'b',b,'c',c)

And then I can access the data via, e.g., s.b(3,3) etc. But I don't know how to associate each struct with its node ID.

Approach 3

The last thing I could think of would be to set up some sort of SQL database, but this seems like overkill. Besides, I need my code to be as fast as possible, since I access the fields of these data sets many, many times, and I imagine that querying a database would slow things down.

Note that ultimately I will convert the code from Matlab to C/C++, so I would prefer to implement something that doesn't rely too heavily on Matlab-specific features.

So, any thoughts on how to implement this functionality in a clean way? I hope my question makes sense and thanks in advance!

Although now that I think about it, perhaps a Map with the ID as the key and the index into the struct array as the value would be better as it would allow faster access. — beaker, Apr 14 '19 at 19:12
Can you use a Map of Lists? The key is the list name: https://www.mathworks.com/help/matlab/map-containers.html — duffymo, Apr 16 '19 at 11:14
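The idea from the first comment, a map from node ID to an index into an array of structs, could look roughly like this in C++. It is a sketch under assumptions: `NodeRecord` and `NodeStore` are hypothetical names, and the record mirrors the a, b, c fields from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical per-node record with the fields from the question.
struct NodeRecord {
    double a;
    double b[5][5];
    double c[10];
};

// Keeps the records contiguous and uses a hash map only for the ID lookup.
class NodeStore {
public:
    // Adds a zero-initialised record for id, or returns the existing one.
    // The returned reference may be invalidated by a later call to add().
    NodeRecord& add(int id) {
        auto it = indexOfId_.find(id);
        if (it != indexOfId_.end()) {
            return records_[it->second];
        }
        indexOfId_[id] = records_.size();
        records_.push_back(NodeRecord());
        return records_.back();
    }

    // Returns the record for an ID that has already been added.
    NodeRecord& get(int id) {
        auto it = indexOfId_.find(id);
        assert(it != indexOfId_.end());
        return records_[it->second];
    }

    bool contains(int id) const { return indexOfId_.count(id) != 0; }
    std::size_t size() const { return records_.size(); }

private:
    std::unordered_map<int, std::size_t> indexOfId_;
    std::vector<NodeRecord> records_;
};
```

A call such as `store.get(27).b[2][2] = 0.0;` then pays for one hash lookup, after which the access is ordinary struct indexing.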

Cris Luengo · Accepted Answer · 2019-04-15T13:50:56.887

2

Approach 2 is the cleanest, and readily translates to C++. For each node you have a struct s, then:

data(nodeID) = s;

is what is called a struct array. You index as

data(id).b(3,3) = 0.0;

This assumes that the IDs are contiguous, or at least that there are no huge gaps in their values. This can always be ensured, because it is easy to renumber node IDs if necessary.
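Renumbering can be as simple as sorting the list of IDs once and using each ID's position in the sorted list as its new, consecutive index. Here is a minimal C++ sketch of that idea. The class name `NodeIndex` and its members are hypothetical, and the sketch assumes that IDs are integers and that only IDs from the original list are looked up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Maps arbitrary integer node IDs to consecutive zero-based indices.
class NodeIndex {
public:
    explicit NodeIndex(std::vector<int> ids) : sortedIds_(std::move(ids)) {
        // Sort and drop duplicates so that each ID has exactly one position.
        std::sort(sortedIds_.begin(), sortedIds_.end());
        sortedIds_.erase(std::unique(sortedIds_.begin(), sortedIds_.end()),
                         sortedIds_.end());
    }

    std::size_t size() const { return sortedIds_.size(); }

    // Returns the compact index of id via binary search.
    // Precondition: id was part of the list passed to the constructor.
    std::size_t indexOf(int id) const {
        auto it = std::lower_bound(sortedIds_.begin(), sortedIds_.end(), id);
        assert(it != sortedIds_.end() && *it == id);
        return static_cast<std::size_t>(it - sortedIds_.begin());
    }

private:
    std::vector<int> sortedIds_;
};
```

For the node list from the question, IDs 3, 27 and 38 map to indices 0, 1 and 2, so the data array needs three elements rather than 38.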

In C++, you’d have a vector of structs:

#include <vector>

struct Bla {
   double a;
   double b[5][5];   // 5x5 matrix, matching b in the question
   double c[10];
};

std::vector<Bla> data(N);   // N = number of nodes

Or in C:

struct Bla* data = malloc(sizeof(struct Bla) * N);   /* requires <stdlib.h> */

(and don’t forget free(data) when you’re done with it).

Then, in either C or C++, you access an element this way:

data[id].b[2][2] = 0.0;

The translation is straightforward, except that indexing starts at 0 in C and C++ and at 1 in MATLAB.

Note that this method has a larger memory overhead than Approach 1 in MATLAB, but not in C or C++.
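The claim about C and C++ can be checked directly: a struct that contains only doubles normally occupies exactly as many bytes as its values. The sketch below assumes a typical platform where such a struct has no padding. The names `NodeData`, `bAt` and `payloadBytes` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same fields as in the question: 1 + 25 + 10 = 36 doubles per node.
struct NodeData {
    double a;
    double b[5][5];
    double c[10];
};

// Assumption: no padding, which holds on common platforms but is not
// guaranteed by the language standard.
static_assert(sizeof(NodeData) == 36 * sizeof(double),
              "NodeData contains unexpected padding");

// Access to b(i, j) with zero-based indices, checked in debug builds.
inline double& bAt(NodeData& node, std::size_t i, std::size_t j) {
    assert(i < 5 && j < 5);
    return node.b[i][j];
}

// Bytes occupied by the contiguous element storage of the vector.
inline std::size_t payloadBytes(const std::vector<NodeData>& nodes) {
    return nodes.size() * sizeof(NodeData);
}
```

For three nodes and 8-byte doubles this gives 864 bytes, the same as the payload of a 3-by-36 double matrix in Approach 1.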

Approach 3 is a bad idea: it will just slow down your code without any benefit.

edited Apr 15 '19 at 13:50

answered Apr 14 '19 at 21:22

Cris Luengo


I was assuming that MATLAB would allocate memory for all 38 elements for IDs `[3 27 38]`, but that doesn't seem to be the case. – beaker Apr 14 '19 at 22:31
@beaker: it does allocate struct array elements for them, but if you don’t assign anything to a field, that field doesn’t occupy memory. Anyway, I assume node IDs are consecutive in any sensical application, it is trivial to renumber them if not. – Cris Luengo Apr 14 '19 at 23:47
Many thanks Cris, this works like a charm, exactly what I needed. Can I ask you, is it possible to do something similar in C, some sort of an array of structs? Up until now I have been working with Matlab and C, since I'm more familiar with C than C++. (I kind of always assumed that I would have to switch over from C to C++ at some point, but have managed to avoid that up until now...) – Finnur Pind Apr 15 '19 at 12:09
@FinnurPind: I’ve edited the answer with the C equivalent of `std::vector`. I’ve moved from C to C++ 4 years ago and don’t ever want to go back. C++ is so much easier and quicker to write in. I can not recommend it enough. But do learn C++11, forget about anything that came before. C++11 is a different language to what it was before. – Cris Luengo Apr 15 '19 at 13:52
Thank you @CrisLuengo for the edit and for your insights. I should clearly make the leap from C to C++! – Finnur Pind Apr 15 '19 at 18:08

score 2 · Answer 2 · answered Apr 15 '19 at 16:25

I think the cleanest solution, given a non-contiguous set of node IDs, would be approach 2 making use of a map container where your node ID is the key (i.e. index) into the map. This can be implemented in MATLAB using a containers.Map object, and in C++ using the std::map container. For example, here's how you can create and add values to a node map in MATLAB:

>> nodeMap = containers.Map('KeyType', 'double', 'ValueType', 'any');
>> nodelist = [3 27 38];
>> nodeMap(nodelist(1)) = struct('a', 4.4, 'b', rand(5, 5), 'c', rand(10, 1));
>> nodeMap(3)

ans = 

  struct with fields:

    a: 4.400000000000000
    b: [5×5 double]
    c: [10×1 double]

>> nodeMap(3).b(3,3)

ans =

   0.646313010111265

In C++, you would need to define a structure or class (e.g. Node) for the data type to be stored in the map. Here's an example (... denotes arguments passed to the Node constructor):

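#include <map>
#include <tuple>
#include <utility>

class Node {...};   // Define Node class (operator[] needs a default constructor)
typedef std::map<int, Node> NodeMap;  // Using int for key type

int main()
{
  NodeMap map1;

  map1[3] = Node(...);  // Default-construct the entry, then assign a Node object
  map1.emplace(std::piecewise_construct,
               std::forward_as_tuple(27),
               std::forward_as_tuple(...));  // Create Node object in-place
}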
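For reference, here is a version of the same idea with the placeholders filled in by a hypothetical, concrete `Node`. The fields mirror a, b and c from the question, and the single-argument constructor is an assumption made purely for illustration. Treat it as a sketch rather than a recommended class design.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <utility>

// Hypothetical concrete Node; the real class may look quite different.
struct Node {
    double a;
    double b[5][5];
    double c[10];

    Node() : a(0.0), b(), c() {}                    // zero-initialise all fields
    explicit Node(double a0) : a(a0), b(), c() {}   // set a, zero b and c
};

typedef std::map<int, Node> NodeMap;

// Builds a map for the node IDs 3, 27 and 38 from the question.
inline NodeMap makeExampleMap() {
    NodeMap nodes;
    nodes[3] = Node(4.4);                           // default-construct, then assign
    nodes.emplace(std::piecewise_construct,
                  std::forward_as_tuple(27),
                  std::forward_as_tuple(1.5));      // construct Node(1.5) in place
    nodes.emplace(38, Node());                      // insert a zero-initialised Node
    assert(nodes.size() == 3);
    return nodes;
}

// Returns a pointer to the stored node, or nullptr if id is not in the map.
inline Node* findNode(NodeMap& nodes, int id) {
    NodeMap::iterator it = nodes.find(id);
    return it == nodes.end() ? nullptr : &it->second;
}
```

Using `find` for lookups instead of `operator[]` avoids silently inserting a default-constructed entry for an unknown ID.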

Good solution if IDs have to be arbitrary numbers (don't even need to be integers in this case). But if it's possible to define IDs such that they are consecutive or nearly-consecutive, then you can avoid the overhead (time and space) of the map. — Cris Luengo, Apr 15 '19 at 16:43
