I've just started with MATLAB, having mainly played around with Python previously. I've just made my first associative array, and am a little confused by how it deals with commas and spaces. My array is:

co_comma=containers.Map({'Open University','UCL',' University of Edinburgh','Birkbeck'},{193835,21210,24525,17822})

I also made a second associative array, splitting using spaces:

co_space=containers.Map({'Open University' 'UCL' ' University of Edinburgh' 'Birkbeck'},{193835 21210 24525 17822})

They both give the following:

Map with properties:

    Count: 4
  KeyType: char
ValueType: double

But co_comma==co_space gives False:

ans =

  logical

   0

Questions:

  1. How are these associative arrays different?
  2. What actually is a container? Although I've never thought of lists etc. in this way, Python seems to have containers in the form of lists/tuples/general iterables - https://stackoverflow.com/a/11576019/11357695. So is a MATLAB string container vs a MATLAB char array different in the same way that (for example) Python lists and Python tuples are different?

Thanks :D

Tim Kirkwood

2 Answers

4

Many things mixed in here!

Clarifications:

  • A MATLAB containers.Map is equivalent only to a Python dictionary, not to a list, tuple, or general iterable.

  • Both of your containers are created identically. You named them comma and space, but that distinction never reaches the definition of the container: inside braces, commas and spaces both separate the elements of a row. So {'Open University','UCL',' University of Edinburgh','Birkbeck'} and {'Open University' 'UCL' ' University of Edinburgh' 'Birkbeck'} create exactly the same cell array, and the input to containers.Map is the same. As a MATLAB-valid analogy, you are comparing a=[5]; with b=5; they are the same.

  • For objects, most programming languages (including Python, by default for user-defined classes) give you false when you compare two objects that contain the same values yet are different objects. Only basic variables tend to compare by value rather than by some sort of object ID. In your case, isequal(co_comma,co_space) returns true, so their values are the same (as we already know from the previous point).

  • Containers are not generally used in MATLAB, unless you specifically want a dictionary.
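
The points above can be checked with a minimal sketch that reuses the question's data. The names `keys_comma`, `keys_space`, `vals`, and `alias` are introduced here only for illustration, and the contents are compared explicitly through `keys` and `values` rather than relying on how `isequal` treats Map objects.

```matlab
% Commas and spaces both separate the elements of a row inside {} and [].
keys_comma = {'Open University','UCL',' University of Edinburgh','Birkbeck'};
keys_space = {'Open University' 'UCL' ' University of Edinburgh' 'Birkbeck'};
vals = {193835, 21210, 24525, 17822};

same_keys = isequal(keys_comma, keys_space);   % the two cell arrays are identical

co_comma = containers.Map(keys_comma, vals);
co_space = containers.Map(keys_space, vals);

% containers.Map is a handle class, so == asks "is this the same object?"
same_object = (co_comma == co_space);          % two separately created objects
alias = co_comma;                              % a second reference to the same Map
same_alias = (co_comma == alias);

% Compare the contents explicitly through the keys and the values.
same_contents = isequal(keys(co_comma), keys(co_space)) && ...
                isequal(values(co_comma), values(co_space));

disp([same_keys, same_object, same_alias, same_contents])
```

Only `same_object` should come out false here: the two maps hold the same data but are distinct handle objects.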

Ander Biguri
  • Thanks for your response! To clarify the second bullet point, in Python if I compared `[5]` with `5` this would be False, because one is a `list` and one is an `int` - the values are the same but the object type is different, so they are not the same. I assume the same thing is not happening here though, because you say the arrays are the same, not just the values? – Tim Kirkwood May 23 '21 at 15:04
  • For your second point, I may be misunderstanding, but if I say `a=3` and `b=3`, `a==b` gives `True`. Are these not different objects that should thus return `False` as they "contain the same values, yet are different objects"? They are stored in separate bits of memory if I understand variable assignment correctly, so are separate objects. Thanks for reading, I've probably managed to mix up more things X) – Tim Kirkwood May 23 '21 at 15:04
  • As in any other languages, there are some native types that are not objects. This is all covered in a good basic OOP tutorial; I suggest one of those. I am not using "object" as a general arbitrary word; objects are a very specific thing in programming. Native types are often not objects. – Ander Biguri May 23 '21 at 19:01
  • In Matlab, the "object equality" situation is a little different: It is only `handle` objects that use `==` to test for equality by object identity. For regular (non-handle) objects, `==` is just another overrideable operator (implemented by the `eq` method). It's common for objects to define `==` to do equality by value, but the default case is that `==` is not defined at all for objects, and attempting to do it will just give you an "Operator '==' is not supported" error. `isequal()` is the operation that's defined by default, and it compares by value. – Andrew Janke May 23 '21 at 19:22
  • "As in any other languages, there are some native types that are not objects." <- This is no longer the case in Matlab. In recent versions, the type system has been unified, and primitives like `double` are considered to be classes. You can even inherit from them. They're just not _user-defined_ "classdef" or "MCOS" classes. Same with many other languages where "everything is an object". Only some programming languages have a notion of non-object primitives as a distinct thing. – Andrew Janke May 23 '21 at 19:25
  • @AndrewJanke agreed, I was doing a bit of hand waving here, but your descriptions (and the answer itself) are much more accurate :) Thanks for that. – Ander Biguri May 23 '21 at 20:39
4

So here's the deal with Matlab: Matlab is an oddball. In Matlab, everything is an array, and there is no real distinction between regular "values" and "containers" or "collections" like there is in many programming languages. In Matlab, any numeric value is a numeric array. Any string value is actually an array of strings. Every value is iterable and can be used in a for loop or other "list"-like context. And every type or class in Matlab must implement collection/container behaviors as well as its "plain" value semantics.

Scalar values in Matlab are actually degenerate cases of two-dimensional arrays that are 1-by-1 in size. The number 42? Actually a 1-long array of doubles. The string "foo"? Actually a 1-long array of strings. Everything in Matlab is actually like a list in Python (or, more accurately, a NumPy array).
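
A minimal sketch of the "scalars are 1-by-1 arrays" idea. The variable names are illustrative only, and the double-quoted string literal assumes a release that supports them (R2017a or later).

```matlab
x = 42;
size_x = size(x);             % [1 1]: a 1-by-1 double array
first = x(1);                 % indexing works on a "scalar"
last = x(end);

s = "foo";                    % double quotes: a 1-by-1 string array
size_s = size(s);             % [1 1]

% A for loop iterates over the columns of its array; a scalar has one column.
n_iter = 0;
for v = x
    n_iter = n_iter + 1;
end
```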

A cell array is an array of things, which can contain any other type of array in each of its elements. This is used for heterogeneous collections.
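
As a small illustration of a heterogeneous cell array (the contents below are arbitrary examples, not taken from the question):

```matlab
% Each element of a cell array can hold a different type and size.
c = {42, 'UCL', [1 2 3], {'nested', 'cell'}};

n = numel(c);                                          % 4 elements
classes = cellfun(@class, c, 'UniformOutput', false);  % class name of each element

inner = c{3};                 % braces return the contents: the 1-by-3 double
sub = c(3);                   % parentheses return a 1-by-1 cell holding it
```

The brace/parenthesis distinction is the usual stumbling block: `c{3}` unwraps the element, while `c(3)` is still a cell array.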

A char array is a bit weird, because it is used to represent strings, but its elements are not themselves strings, but rather characters. Plain char arrays are used in a weird way to represent lists of strings: you make a 2-D char array, and read each row as a string that is padded with spaces on the right. Lists of strings that are different lengths used to be represented as "cellstrs", which are cell arrays that contain char row vectors in each element. It's weird; see my blog post about it. The new string array makes most uses of other string types obsolete. You might want to use strings here. In Matlab, string literals are double-quoted, and char literals are single-quoted.
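
The different string representations described above can be compared side by side. This is a sketch with illustrative variable names; it assumes a release with string arrays and double-quoted literals (R2017a or later).

```matlab
c = 'UCL';                          % single quotes: 1-by-3 char row vector
s = "UCL";                          % double quotes: 1-by-1 string array

padded = char('UCL', 'Birkbeck');   % 2-by-8 char matrix; 'UCL' is padded with spaces
as_cellstr = cellstr(padded);       % 2-by-1 cellstr; the padding is removed
as_string = ["UCL"; "Birkbeck"];    % 2-by-1 string array

n_char = numel(c);                  % 3: counts characters
n_str = numel(s);                   % 1: counts strings
len_s = strlength(s);               % 3: characters in the one string
n_padded = numel(padded);           % 16: characters, not strings
n_cellstr = numel(as_cellstr);      % 2
n_string = numel(as_string);        % 2
```

Only the cellstr and the string array answer "how many strings are in here?" through `numel`; the padded char matrix needs `size(padded, 1)` instead.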

The word "container" doesn't have a specific technical meaning in Matlab, and is not really a thing. There's the containers package, but the only thing in it is containers.Map; there are no other types of containers and no generalization of what it means to be a container. One of the Matlab developers had an idea for making containers like this, but it never really went anywhere. And as far as I can tell, containers.Map hardly gets used at all: containers.Map is a pass-by-reference "handle" object, whereas most Matlab types are pass-by-value, so Map can't be used easily with most Matlab code.
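
The handle-versus-value difference can be seen in a short sketch. The variable names are illustrative, and the data is borrowed from the question.

```matlab
% Handle semantics: assignment copies the reference, not the data.
m1 = containers.Map({'UCL'}, {21210});
m2 = m1;
m2('Birkbeck') = 17822;       % adds a key through the second handle
count_m1 = m1.Count;          % 2: m1 sees the change

% Value semantics: assignment gives an independent copy.
a1 = [1 2 3];
a2 = a1;
a2(1) = 99;
first_a1 = a1(1);             % still 1
```

The same thing happens across function calls: a function that modifies a Map argument changes the caller's Map, while a function that modifies an ordinary array argument only changes its own copy.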

So, putting aside the weirdness of chars, everything in Matlab has array semantics, and is effectively an iterable container like Python lists or tuples. And in Matlab, most values and objects are immutable and pass-by-value, so they are more like Python tuples than Python lists.

Andrew Janke
  • Nice addition. Except for the “weirdness of the char array” comment. A char array is an array of chars. It works just like any other array. Not sure why you think it’s weird. – Cris Luengo May 23 '21 at 14:47
  • In and of itself it is not weird. But its conventional _usage_ to represent lists of strings is weird: An m-by-n char array doesn't represent m*n strings. Rather, it is m strings, each of which may be n chars or less. (Plus the special exception that if m=0, then it represents 1 string and not 0 strings.) So use of numel(), length(), etc do not give you the right answers for "how many strings are in here?", ==, <, and > can't be used, sort() and unique() need special calling forms, etc. And while Matlab arrays are column-major, the layout of strings in a 2-D char array is row-major. Etc etc. – Andrew Janke May 23 '21 at 19:14
  • The upshot is that char arrays cannot be used in a polymorphic or generic manner; you always have to add special case code to handle them differently from just about any other Matlab type. – Andrew Janke May 23 '21 at 19:26
  • And the trailing-space situation. Consider the char arrays `a = 'foo'; b = 'foo ';`. `isequal(a, b)` is false. `strcmp(a, b)` is false. `strcmp(strcat(a,'x'), strcat(b,'x'))` is _true_. `isequal(cellstr(a), cellstr(b))` is true. `isequal(cellstr({a}), cellstr({b}))` is false. `isequal(b, char(cellstr(b)))` is false. `b == b` is true. `a == b` is an error. It's confusing! – Andrew Janke May 23 '21 at 19:51
  • I agree that using a char array to store strings in its rows is weird. But this hasn’t been the norm since the introduction of cell arrays in MATLAB 5 (or maybe earlier?), somewhere in the mid 1990s. I will admit I don’t ever use char arrays for anything other than storing a single string. But if you were to use them as 2D or higher-D arrays of chars, you can certainly apply arithmetic and comparison operators to them. Just not to compare strings, to compare chars. – Cris Luengo May 23 '21 at 20:23
  • Yeah, 2D chars is a little archaic, though I was seeing them as late as like 2008, and they'll still show up in a few data-import circumstances I think. Stick with 1-row chars, cellstrs, and the new `string` array and you mostly won't have to worry about it. And the new `string` array type really takes care of all these issues. – Andrew Janke May 24 '21 at 14:25