2

Let's say I have a table like this:

post   user         date
____   ____   ________________
 1      A     12.01.2014 13:05
 2      B     15.01.2014 20:17
 3      A     16.01.2014 05:22

I want to create a smaller table (but not delete the original one!) containing all posts of - for example - user A including the dates that those were posted on.

When looking at MATLAB's documentation (see the very last part for deleting rows) I discovered that MATLAB allows you to create a mask for a table based on some criterion. So in my case if I do something like this:

postsA = myTable.user == 'A'

I get a nice mask vector as follows:

>> postsA = 
       1
       0
       1

where the 1s are obviously those rows in myTable, which satisfy the rule I have given.

In the documention I have pointed at above rows are deleted from the original table:

postsNotA = myTable.user ~= 'A' % note that I have to reverse the criterion since I'm choosing stuff that will be removed
myTable(postsNotA,:) = [];

I would however - as stated above - like to not touch my original table. One possible solution here is to create an empty table with two columns:

post  date
____  ____

interate through all rows of my original table, while also looking at the current value of my mask vector postsA and if it's equal to 1, copy the two of the columns in that row that I'm interested in and concatenate this shrunk row to my smaller table. What I'd like to know is if there is a more or less 1-2 lines long solution for this problem?

rbaleksandar
  • 8,713
  • 7
  • 76
  • 161
  • 1
    How about `table(t1.post(strcmp(t1.user,'A')),t1.date(strcmp(t1.user,'A')))`, where `t1` is your original table? – Divakar Dec 04 '14 at 17:35
  • This returns an empty table. Since my dataset is huge I re-created the same table as in my original post in order to test your way. Have you tested it? If so, can you please post your code so that I can compare. – rbaleksandar Dec 04 '14 at 18:08

1 Answers1

6

Assuming myTable is your original table.

You can just do

myTable(myTable.user == 'A',:)

Sample Code:

user = ['A';'B';'A';'C';'B'];
Age = [38;43;38;40;49];
Height = [71;69;64;67;64];
Weight = [176;163;131;133;119];
BloodPressure = [124 93; 109 77; 125 83; 117 75; 122 80];
T = table(user,Age,Height,Weight,BloodPressure)
T(T.user=='A',:)

Gives:

T =

user    Age    Height    Weight          BloodPressure      
____    ___    ______    ______    _________________________

A       38     71        176       124                    93
B       43     69        163       109                    77
A       38     64        131       125                    83
C       40     67        133       117                    75
B       49     64        119       122                    80

ans =

user    Age    Height    Weight          BloodPressure      
____    ___    ______    ______    _________________________

A       38     71        176       124                    93
A       38     64        131       125                    83
Nitish
  • 6,358
  • 1
  • 15
  • 15
  • This is a better (and in my case also working) solution however it contains the "user" column, which I do not want to store, since - as I've mentioned in my original post - the table is going to be stored in a structure for that specific user (in your case - patient), which means that all the data in the extracted table is about him/her anyway. I wanted to do this in order to save resources (talking about approx. 1 000 000 entries in my data set). I presume you can use a range in the column-parameter of the indexing of the table.Is it possible to use variable names of columns for this purpose? – rbaleksandar Dec 04 '14 at 18:09
  • For example T(T.user=='A',:2) will return all the Age-entries for patient A. One can give a range with numbers but I think in my case it would be easier to read and handle if I simply list, which columns I want to extract (similar to what @Divakar has posted as comment) – rbaleksandar Dec 04 '14 at 18:16