I have a few data frames.
I need to display basic statistics together with the interquartile range (IQR) in one table for all of them.
Unfortunately, the summary() function does not return the IQR.
On the other hand, fivenum() returns the lower and upper hinges, from which the IQR can be derived, but it cannot (?) be applied to a list of data frames, and I don't need the median.
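To illustrate what the two functions return, here is a minimal sketch on a made-up vector (the values of x are an assumption for illustration, not my real data). As far as I can tell, the hinge spread from fivenum() can differ slightly from IQR(), which relies on quantile().

```r
# Made-up example vector (hypothetical, not from the real data frames)
x <- c(2, 4, 4, 5, 7, 9, 10, 12)

# summary() reports min, quartiles, median, mean and max, but no IQR entry
summ <- summary(x)
print(names(summ))

# fivenum() reports min, lower hinge, median, upper hinge, max
five <- fivenum(x)
print(five)

# The hinge spread is close to, but not always equal to, the IQR
hinge_spread <- five[4] - five[2]
iqr <- IQR(x)  # based on quantile(), type 7 by default
print(c(hinge_spread = hinge_spread, IQR = iqr))
```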
Since I was unable to find an appropriate function, I wrote one myself as follows:
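removeXYcol <- function(df)
{
  # Removes the coordinate columns X and Y.
  drops <- c("X", "Y")
  # drop = FALSE keeps a data frame even if only one column remains
  withoutXY <- df[, !(names(df) %in% drops), drop = FALSE]
  withoutXY
}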
getStatsTable <- function(listOfDataFrames, df_names = NULL, digits_no = 2)
{
  # Returns a table of statistics, including the IQR (which summary() does not return).
  # One row of statistics is built per data frame. Using lapply() avoids
  # exists("myStats"), which would also find a myStats defined outside the function.
  statsRows <- lapply(listOfDataFrames, function(df) {
    df_data <- unlist(removeXYcol(df))
    c(minimum = min(df_data, na.rm = TRUE),
      maximum = max(df_data, na.rm = TRUE),
      average = mean(df_data, na.rm = TRUE),
      IQR     = IQR(df_data, na.rm = TRUE))
  })
  # do.call(rbind, ...) yields a matrix even when the list holds a single data frame
  myStats <- round(do.call(rbind, statsRows), digits = digits_no)
  if (is.null(df_names)) {
    df_names <- seq_along(listOfDataFrames)
  }
  rownames(myStats) <- df_names
  return(myStats)
}
However, I wonder whether there is a simpler solution.