
I am fairly new to MATLAB and I am trying to figure out when it is best to use cell arrays, tables, or matrices to store sets of data and then work with the data.

I want to store data made up of multiple lines, each containing both strings and numbers, and then work with the numbers.

For example, a line would look like:

'string 1', time, number1, number2

I know a matrix works best if all elements are numbers, but when I use a cell array I keep having to convert the numbers or strings to a matrix in order to work with them. I am running MATLAB 2012, so maybe that is part of the problem. Any help is appreciated. Thanks!

Amro
Austen Novis
  • FYI, [`table`](http://www.mathworks.com/help/matlab/ref/table.html) (and its predecessor [`dataset`](http://www.mathworks.com/help/stats/dataset-class.html) from the Statistics toolbox) is a class implemented on top of cell arrays, adding easier indexing, convenient functions, and nicer output display. You can read the code if you want: `which table` – Amro Jul 17 '14 at 10:00
  • Do they explicitly need to be stored in the same variable? Could you work with a separate matrix for each of the 4 columns? You could also create a matrix for each of the 4 columns and group them in a cell array – Mauvai Jul 17 '14 at 10:05
  • If I did that, would I still have to use cell2mat to access the matrices in each of the cells? – Austen Novis Jul 17 '14 at 11:09
  • @AustenNovis you don't need to use cell2mat to address a matrix in a cell. `test{1} = [1 2; 3 4];` `test{1}(2,2)` returns `4`. – sco1 Jul 17 '14 at 11:26
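
To spell out the indexing from the last comment: curly braces return the contents of a cell, while parentheses return a smaller cell array. The following is a minimal sketch that reuses the `test` variable from the comment; the other variable names and the stored string are only illustrative.

```matlab
test = cell(1, 2);          % 1x2 cell array of empty cells
test{1} = [1 2; 3 4];       % store a 2x2 numeric matrix in the first cell
test{2} = 'string 1';       % a cell can hold a string next to it

m   = test{1};              % braces return the contents: a 2x2 double
sub = test(1);              % parentheses return a 1x1 cell, not the matrix
val = test{1}(2, 2);        % index into the stored matrix directly, no cell2mat

colSums = sum(test{1}, 1);  % numeric functions work on the contents as usual
```

Most "I have to convert everything" friction comes from using `( )` where `{ }` was intended, so checking the class of an intermediate result with `class(sub)` is often the quickest diagnosis.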

3 Answers

2

Use a matrix when all of the following hold:

  • the tabular data has a uniform type (all floating point, like double, or all integer, like int32);
  • the amount of data is either small, or large but of static (predefined) size;
  • you care about the speed of accessing the data, you need matrix operations performed on the data, or some function requires the data organized as a matrix.

Use a cell array when any of the following holds:

  • the tabular data has heterogeneous types (mixed element types, "jagged" arrays, etc.);
  • there is a lot of data and its size is dynamic;
  • you only need to index the data numerically (no algebraic operations);
  • a function requires the data as a cell array.
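
As a rough illustration of how the two containers can work together for rows like the one in the question, the sketch below keeps the mixed data in a cell array and converts the numeric columns to a matrix once, rather than at every use. All values and variable names here are made up for the example.

```matlab
% Mixed rows: string, time, number1, number2 (values are made up)
C = {'string 1', 0.5, 10, 20;
     'string 2', 1.0, 30, 40;
     'string 3', 1.5, 50, 60};

names = C(:, 1);              % 3x1 cell array of strings
nums  = cell2mat(C(:, 2:4));  % 3x3 double matrix, converted once

avgNumber1 = mean(nums(:, 2));           % column statistics on the matrix
total      = nums(:, 2) + nums(:, 3);    % element-wise arithmetic per row
isTwo      = strcmp(names, 'string 2');  % logical row index from the strings
rowTwo     = nums(isTwo, :);             % numeric values of the matching row
```

Because `names` and `nums` share the same row order, a logical index built from one can be applied to the other.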

Same argument for structs, only the indexing is by name, not by number.

Not sure about tables; I don't think they are offered by the language itself in your version (a comment on another answer dates `table` to R2013b), so they might be a UDT that I don't know of...

Later edit

These three types may be combined, in the sense that cell arrays and structs may have matrices, cell arrays, and structs as elements (because they're heterogeneous containers). In your case, there are two approaches, depending on how you need to access the data:

  • if you access the data mostly by row, then an array of N structs (one struct per row) with 4 fields (one field per column) would be the most effective in terms of performance;

  • if you access the data mostly by column, then a single struct with 4 fields (one field per column) would do; the first field would be a cell array of strings for the first column, the second field would be a cell array of strings or a 1D matrix of doubles depending on how you want to store your time values, and the remaining fields would be 1D matrices of doubles.
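
The two layouts can be sketched as follows. The field names (`name`, `time`, `number1`, `number2`) mirror the example line in the question, and the values are invented for illustration.

```matlab
% Layout 1: array of structs, one element per row (values are made up)
rows(1) = struct('name', 'string 1', 'time', 0.5, 'number1', 10, 'number2', 20);
rows(2) = struct('name', 'string 2', 'time', 1.0, 'number1', 30, 'number2', 40);
secondRow  = rows(2);          % a whole row in one step
allNumber1 = [rows.number1];   % gathering a column needs concatenation

% Layout 2: one struct, one field per column
data.name    = {'string 1'; 'string 2'};  % cell array of strings
data.time    = [0.5; 1.0];
data.number1 = [10; 30];
data.number2 = [20; 40];
ratio = data.number2 ./ data.number1;     % column arithmetic is direct
```

Both layouts use only structs, cell arrays, and matrices, so neither depends on `table` being available.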

1

Concerning tables: I always used matrices or cell arrays until I had to do database-related things such as joining datasets by a unique key; the only way I found to do this in MATLAB was by using tables. It takes a while to get used to them, and it's a bit annoying that some functions that work on cell arrays don't work on tables and vice versa. MATLAB could have done a better job explaining when to use one or the other, because it's not super clear from the documentation.
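
A minimal sketch of such a key-based join, assuming a release that has `table` (R2013b or later, so not MATLAB 2012). The key name `id` and all values are hypothetical.

```matlab
% Requires R2013b or later. Names and values are made up.
id      = [1; 2; 3];
number1 = [10; 30; 50];
A = table(id, number1);             % variable names come from the inputs

id    = [3; 1; 2];
label = {'c'; 'a'; 'b'};
B = table(id, label);

J = innerjoin(A, B, 'Keys', 'id');  % match rows on the shared key
```

`J` holds one row per key value present in both tables, with the `number1` and `label` columns side by side.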

Johan
0

The situation that you describe seems to be as follows:

You have several columns. Entire columns consist of 1 datatype each, and all columns have an equal number of rows.

This seems to match exactly with the recommended situation for using a [`table`](http://www.mathworks.com/help/matlab/ref/table.html):

`T = table(var1,...,varN)` creates a table from the input variables, `var1,...,varN`. Variables can be of different sizes and data types, but all variables must have the same number of rows.
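
For the kind of rows in the question, that could look roughly like the sketch below. It assumes R2013b or later; the column names follow the question's example line and the values are made up.

```matlab
% Requires R2013b or later. Values are made up.
name    = {'string 1'; 'string 2'; 'string 3'};
time    = [0.5; 1.0; 1.5];
number1 = [10; 30; 50];
number2 = [20; 40; 60];

T = table(name, time, number1, number2);

T.total = T.number1 + T.number2;  % dot indexing returns the numeric column
late    = T(T.time > 0.75, :);    % logical row selection returns a table
avg     = mean(T.number2);        % plain numeric functions work on a column
```

Dot indexing hands back an ordinary numeric vector, so no conversion step is needed before doing arithmetic on a column.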

Actually, I don't have much experience with tables, but if you can't figure them out you can always switch to using one cell array for the first column and a matrix for all the others (in your example).

Dennis Jaheruddin
  • I think tables were introduced in R2013b. I guess cells or structs are the best approaches. – Stewie Griffin Jul 17 '14 at 09:57
  • or `dataset` from the Statistics toolbox – Amro Jul 17 '14 at 10:02
  • As @Amro mentioned, the dataset can be used as well (sometimes you will even have to). But [it is recommended to use tables if you can](http://www.mathworks.com/matlabcentral/answers/103851-what-are-the-differences-between-dataset-and-table-in-matlab-8-2-r2013b). – Dennis Jaheruddin Jul 17 '14 at 10:20