I think my question is somewhat similar to this one. cbind is changing the values of the vector I am using (or is using references to the values). I am getting data from a data frame and then organizing it in columns according to a certain factor (interface type). I think it has something to do with the levels, but I am not sure what those even mean right now. Here is what I am doing and the results I am getting:
#Grouping subjects number of collisions data according to the interface they used
> ui1NumCollisions = dout$numCollisions[ dout$Interface=="0"]
> ui2NumCollisions = dout$numCollisions[ dout$Interface=="1"]
> ui3NumCollisions = dout$numCollisions[ dout$Interface=="2"]
> ui4NumCollisions = dout$numCollisions[ dout$Interface=="3"]
#checking data
> ui1NumCollisions
[1] 43, 30, 37, 6, 22, 9, 19, 9, 14, 106, 50, 53,
33 Levels: -1, 10, 106, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 21, 22, ... 9,
> ui2NumCollisions
[1] 17, 16, 23, 12, 15, -1, 11, 26, 19, 32, 36, 13,
33 Levels: -1, 10, 106, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 21, 22, ... 9,
> ui3NumCollisions
[1] 17, 38, 16, 13, 42, 50, 10, 17, 2, 28, 14, 30,
33 Levels: -1, 10, 106, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 21, 22, ... 9,
> ui4NumCollisions
[1] 42, 28, 22, 36, 10, 25, 45, 48, 18, 11, 21, 7,
33 Levels: -1, 10, 106, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 21, 22, ... 9,
#Creates matrix with each column containing collision data for each interface
#(I think)
> uiNumCollisions = cbind( '1' = ui1NumCollisions
+ , '2' = ui2NumCollisions
+ , '3' = ui3NumCollisions
+ , '4' = ui4NumCollisions)
#checking matrix values
> uiNumCollisions
1 2 3 4
[1,] 26 10 10 25
[2,] 20 9 24 19
[3,] 23 16 9 15
[4,] 31 5 6 22
[5,] 15 8 25 2
[6,] 33 1 29 17
[7,] 12 4 2 27
[8,] 33 18 10 28
[9,] 7 12 13 11
[10,] 3 21 19 4
[11,] 29 22 7 14
[12,] 30 6 20 32
> uiNumCollisionsSummary = summary(uiNumCollisions)
> uiNumCollisionsSummary
1 2 3 4
Min. : 3.00 Min. : 1.00 Min. : 2.0 Min. : 2.00
1st Qu.:14.25 1st Qu.: 5.75 1st Qu.: 8.5 1st Qu.:13.25
Median :24.50 Median : 9.50 Median :11.5 Median :18.00
Mean :21.83 Mean :11.00 Mean :14.5 Mean :18.00
3rd Qu.:30.25 3rd Qu.:16.50 3rd Qu.:21.0 3rd Qu.:25.50
Max. :33.00 Max. :22.00 Max. :29.0 Max. :32.00
Notice that 106 is not part of column 1, nor is it the maximum value there; the maximum is 33 instead. So, why are the values in uiNumCollisions different from the individual columns (ui1NumCollisions, ui2NumCollisions, etc.)? It seems like I am getting the indices of the values from the levels table. What I really wanted were the values themselves. I assume this has a simple answer. I looked at a bunch of problems related to data binding, but could not figure out a solution using what I found. What am I missing here?
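A minimal sketch of what seems to be happening, using three made-up values with the same trailing commas as in the output above (the vector f is hypothetical, not my real data). cbind drops the factor class and keeps only the integer codes, and those codes index the levels, which are sorted as strings:

```r
# Hypothetical stand-in for one of the ui*NumCollisions vectors
f <- factor(c("43,", "30,", "106,"))

levels(f)      # levels are sorted as strings: "106," "30," "43,"
as.integer(f)  # integer codes pointing into levels(f): 3 2 1

# cbind() keeps the codes, not the labels
m <- cbind("1" = f)
m[, "1"]       # 3 2 1
```

This would explain why 106 never shows up in the matrix: it is stored as the code of its level, not as the number itself.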
Thanks in advance for the help. Sincerely,
Paulo.
/-------FOLLOW - UP based on reply from DWin-------
Thanks for the reply. The solution of applying the data.frame to uiNumCollisions worked in getting the right data in there. However, when I apply the summary function:
uiNumCollisionsSummary = summary(uiNumCollisions)
I no longer get the statistics I used to (mean, median, etc.). Why is that?
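A small sketch of the likely cause, assuming the columns are still factors after the data.frame step (the data frame d and its values are made up for illustration). summary counts the occurrences of each level for a factor and only reports quartiles and the mean for numeric columns:

```r
# Hypothetical two-column example: one factor column, one numeric column
d <- data.frame(asFactor  = factor(c("43,", "30,", "30,")),
                asNumeric = c(43, 30, 30))

summary(d$asFactor)   # counts per level: "30," appears twice, "43," once
summary(d$asNumeric)  # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
```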
In addition, after that, I want to apply a boxplot to uiNumCollisions and then an ANOVA. For the boxplot, I use the following:
par( fig=c(0.0,1.0,0.0,1.0))
temp = boxplot( uiNumCollisions)
The result I get for the boxplot is
"Error in oldClass(stats) <- cl : adding class "factor" to an invalid object"
For the ANOVA I was using the following code:
temp = c(ui1NumCollisions, ui2NumCollisions, ui3NumCollisions, ui4NumCollisions)
temp.type = rep(c("1", "2", "3", "4"), c(12,12,12,12))
temp.type = factor(temp.type)
options(contrasts = c("contr.helmert", "contr.poly"))
uiNumCollisionsAOV = aov(temp ~ temp.type)
summary(uiNumCollisionsAOV)
However, this obviously will not work unless I convert each column to something else. I tried different fixes, like reapplying factor to each column (e.g. ui1NumCollisions = factor(ui1NumCollisions)). That fixed the factor levels, but when I tried to convert back to numeric values using something like as.numeric(levels(ui1NumCollisions)[ui1NumCollisions]), I only got NAs. So your solution worked and I really appreciate it, but it does not completely resolve my problem. Is there an easier way around this? Perhaps I could import the dout table in a way that gives me all the data without factors, which would resolve all the factor issues I am having?
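A hedged sketch of why the conversion returns NAs, assuming the level labels carry a trailing comma as in the printed output (the vector f and its three values are hypothetical). The label "43," is not a valid number, so as.numeric cannot parse it until the comma is removed:

```r
# Hypothetical factor whose labels carry the stray trailing commas
f <- factor(c("43,", "30,", "106,"))

# The usual idiom yields NA (with a coercion warning), since "43," is not a number
suppressWarnings(as.numeric(levels(f))[f])   # NA NA NA

# Stripping the comma from the labels first makes the same idiom work
clean <- sub(",$", "", levels(f))
as.numeric(clean)[f]                         # 43 30 106
```

This only patches the symptom; the cleaner route is to get numeric columns at import time.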
/-------FOLLOW - UP #2-------
I finally found what the problem was. The values in the file were separated by commas instead of just spaces. The file, data.out, looked like this:
Subject, uiType, numCollisions, startTimeTraining, startTime, endTime, detlaTraining, deltaTask
0, 0, 43, 0, 510.261, 1743.75, 510.261, 1233.49
1, 1, 17, 0, 1198.65, 2044.62, 1198.65, 845.965
2, 2, 17, 0, 445.788, 1622.83, 445.788, 1177.04
3, 3, 42, 0, 254.793, 1196.93, 254.793, 942.132
4, 1, 16, 0, 1583.5, 2887.39, 1583.5, 1303.9
5, 2, 38, 0, 79.095, 886.533, 79.095, 1287.438
6, 3, 28, 0, 866.75, 1617.48, 866.75, 750.73
7, 1, 23, 0, 565.575, 1361.79, 565.575, 796.216
8, 2, 16, 0, 1211.99, 2538.37, 1211.99, 1326.38
...
And it was supposed to look like this:
Subject uiType numCollisions startTimeTraining startTime endTime detlaTraining deltaTask
0 0 43 0 510.261 1743.75 510.261 1233.49
1 1 17 0 1198.65 2044.62 1198.65 845.965
2 2 17 0 445.788 1622.83 445.788 1177.04
3 3 42 0 254.793 1196.93 254.793 942.132
4 1 16 0 1583.5 2887.39 1583.5 1303.9
5 2 38 0 79.095 886.533 79.095 1287.438
6 3 28 0 866.75 1617.48 866.75 750.73
7 1 23 0 565.575 1361.79 565.575 796.216
8 2 16 0 1211.99 2538.37 1211.99 1326.38
...
When I loaded the data table using these lines:
numSamples = 8  # or more
dout = read.table("data.out", header = TRUE)
dout = dout[1:numSamples,]
dout
I would get a weird table filled with integers attached to commas, which broke the conversion of my data to numbers and gave me those factors.
After I fixed that, the original code worked like a charm.
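For anyone who would rather not edit the file, a minimal sketch of an alternative, assuming the comma-separated layout shown above: tell read.table about the separator. The first three data rows are inlined through the text argument so the example runs without data.out on disk; the names txt, bad and good are just for illustration.

```r
# First rows of data.out in its comma-separated form, inlined for the sketch
txt <- "Subject, uiType, numCollisions, startTimeTraining, startTime, endTime, detlaTraining, deltaTask
0, 0, 43, 0, 510.261, 1743.75, 510.261, 1233.49
1, 1, 17, 0, 1198.65, 2044.62, 1198.65, 845.965
2, 2, 17, 0, 445.788, 1622.83, 445.788, 1177.04"

# Default separator is whitespace, so every token keeps its comma
bad <- read.table(text = txt, header = TRUE)
is.numeric(bad[[3]])             # FALSE: the values are strings such as "43,"

# Declaring the separator (and trimming the blanks after it) gives numbers
good <- read.table(text = txt, header = TRUE, sep = ",", strip.white = TRUE)
is.numeric(good$numCollisions)   # TRUE
good$numCollisions               # 43 17 17
```

Depending on the R version, the badly parsed columns come back as factors (the old default) or as character strings; either way they are not numeric.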
I appreciate the help from DWin and the opportunity to post this issue here, even though it was a rather silly mistake on my part.
Lesson learned: double-check your data after you wake up instead of before going to bed.
Thanks,
Paulo.