I have been working with a dataset (called CWNA_clim_vars) structured so that the variables associated with each data point are arranged in columns, like this:
dbsid  elevation  Tmax04  Tmax10  Tmin04  Tmin10  PPT04  PPT10
0001   1197       8.1     8.9     -5.2    -3.5    34     95
0002   1110       7.7     8       -2.9    -0.6    114    375
0003   1466       5.4     6.4     -4.7    -1.5    199    453
0004   1267       6.1     7.1     -3.6    -0.7    166    376
...    ...        ...     ...     ...     ...     ...    ...
1000   926        7.2     10.1    -0.8    2.7     245    351
I've been attempting to run boxplot.stats on each column, retrieve the values of the outliers within each column, and write them to a new data frame called summary_stats. The code I set up in an attempt to achieve this is as follows:
summary_stats <- data.frame()
for (i in names(CWNA_clim_vars)) {
  temp <- boxplot.stats(CWNA_clim_vars[, i])
  out <- as.list(temp$out)
  for (j in out) {
    summary_stats[i, j] <- out[j]
  }
}
Unfortunately, running this throws the following error:
Error in `[<-.data.frame`(`*tmp*`, i, j, value = list(6.65)) :
new columns would leave holes after existing columns
I am guessing that this error is thrown because the number of outliers varies between columns: if I instead replace temp$out with temp$n, which contains only one number per column, the result is a data frame with these numbers arranged in a single column.
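To illustrate the difference between temp$n and temp$out, here is a minimal sketch on two made-up vectors (a and b are hypothetical, not columns from my data). It only shows that $n is always a single number, while $out can have any length, including zero:

```r
# Made-up vectors for illustration only (not taken from CWNA_clim_vars)
a <- c(1, 2, 3, 4, 100)  # one value far from the rest
b <- c(1, 2, 3, 4, 5)    # no extreme values

stats_a <- boxplot.stats(a)
stats_b <- boxplot.stats(b)

# $n is the number of non-missing observations: always length 1
length(stats_a$n)
length(stats_b$n)

# $out holds the points lying beyond the whiskers: its length varies
length(stats_a$out)
length(stats_b$out)
```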
Is there an easy way to remedy this so that I end up with a data frame whose rows are not necessarily of the same length? Thanks for considering my question; any help would be greatly appreciated.
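One possible shape for such a result is a named list holding one vector of outliers per column, since list elements (unlike data frame rows) do not need equal lengths. The following is only a rough sketch under stated assumptions: toy is a small made-up data frame standing in for CWNA_clim_vars, and outliers_by_col, max_n and padded are hypothetical names. If a rectangular data frame is still wanted, the shorter vectors can be padded with NA:

```r
# Hypothetical stand-in for CWNA_clim_vars (all values are made up)
toy <- data.frame(
  x = c(1, 2, 3, 4, 100),
  y = c(1, 2, 3, 4, 5),
  z = c(-100, 10, 11, 12, 100)
)

# One vector of outliers per column; lengths may differ between columns
outliers_by_col <- lapply(toy, function(col) boxplot.stats(col)$out)
lengths(outliers_by_col)

# Optional: pad each vector with NA up to the longest one, then bind
# them row-wise so that each row corresponds to one original column
max_n <- max(lengths(outliers_by_col))
padded <- lapply(outliers_by_col, function(v) c(v, rep(NA, max_n - length(v))))
summary_stats <- as.data.frame(do.call(rbind, padded))
summary_stats
```

The list form keeps the ragged structure as it is; the padded data frame is rectangular, so the NA entries are placeholders rather than real outliers.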