
I have a data set (called 'data' here) which contains three important kinds of columns: a 'label' column, which corresponds to a list of institutions; a 'group' column, which states the group each institution belongs to; and a series of 'measure' columns, which give a numerical score for each institution on different outcomes/measures.

My task is to write a function that takes a user-specified group and measure, and finds the institution within that group with the lowest score on that measure.

I wrote more or less the following, though it is pared down a little and the labels are generic:

func <- function(group, measure) {
  data <- read.csv("data.csv")
  dataSubset <- data[, c(1, 2, 3, 4, 5)]
  headings <- colnames(dataSubset)

  measureInputs <- as.character(c("m1", "m2", "m3")) 
    # A vector of accepted inputs for 'measure', corresponding 
    # roughly to column names in 'dataSubset'

  nameBinding <- as.list(mapply(assign, measureInputs, headings[3:5])) 
    # Assigns each accepted input to a cognate column name in 'dataSubset'

  groupWiselist <- split(dataSubset, dataSubset$Groupcolumn) 
    # Splits 'dataSubset' by individual groups in the group column (column 2) 
    # into distinct groupwise data frames

  inputGroupdata <- groupWiselist$group 
    # Creates a single data frame, corresponding to the subset of dataSubset 
    # picked out by the user specified group

  inputMeasurecolumn <- as.vector(inputGroupdata[[nameBinding[[as.character(measure)]]]]) 
    # Creates a vector of values contained in the user specified column
    # ('measure'), within the values containing the user specified group

  labelMin <- inputGroupdata$Labelcolumn[inputMeasurecolumn == min(inputMeasurecolumn)] 
    # Finds the label within 'Labelcolumn' on the same row as the minimum 
    # value of the user specified column

  return(as.character(labelMin))
}

When I execute this function inputting my own values I get back:

Warning message:
In min(inputMeasurecolumn) :
  no non-missing arguments to min; returning Inf

No such warning occurs when I run the code line by line. If I add a line such as return(inputMeasurecolumn) just after defining inputMeasurecolumn, the function returns NULL. When I run the code line by line, entering my own values as I go, inputMeasurecolumn is a sensible vector of exactly the kind I would expect, and min(inputMeasurecolumn) gives the minimum of that vector. The only difference I can see is that when running line by line I type the name of the measure directly, rather than passing it through the generic 'measure' variable used in the subsetting that forms inputMeasurecolumn. But since in both cases what goes in is a character string that refers to a column name (thanks to nameBinding), I really can't see what's up.

1 Answer

group <- "somegroup"
groupWiselist$group

is not the same as

groupWiselist$somegroup

You probably want to use groupWiselist[,group] instead.

I didn't take the time to fully debug to see if this was the issue but it certainly stuck out to me.
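
To make the distinction concrete, here is a minimal sketch using a small made-up list. The element names `somegroup` and `othergroup` and the toy values are illustrative assumptions, not the asker's data. `$` treats what follows it as a literal name, whereas `[[` evaluates the variable first:

```r
# Toy stand-in for the result of split(); values are made up for illustration
groupWiselist <- list(
  somegroup  = data.frame(Labelcolumn = c("A", "B"), m1 = c(3, 1)),
  othergroup = data.frame(Labelcolumn = c("C", "D"), m1 = c(5, 2))
)

group <- "somegroup"

# `$` looks for an element literally named "group"; there is none, so this is NULL
viaDollar <- groupWiselist$group

# `[[` evaluates `group` first, so this looks up the element named "somegroup"
viaBrackets <- groupWiselist[[group]]

is.null(viaDollar)                               # TRUE
identical(viaBrackets, groupWiselist$somegroup)  # TRUE

# Taking min() of a column of NULL reproduces the warning from the question
# (suppressWarnings() is used here only to keep the output quiet)
suppressWarnings(min(viaDollar$m1))              # Inf
```

Because every later step in the function operates on that NULL, the first visible symptom is the warning from min() rather than an error at the line that does the lookup.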

Dason
  • This seems to have been the problem, yes. Since groupWiselist is a list of data frames rather than a data frame itself, I used `groupWiselist[[group]]` instead, which seems to have resolved the issue. Thanks! – Jordan Taylor Jun 28 '15 at 09:49
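
For reference, the fix from the comment can be sketched as a compact, self-contained function. This is only one possible shape, not the asker's final code: the name `lowestScorer`, the toy data, and the choice to pass the data frame and the actual column name as arguments (rather than reading data.csv and mapping aliases through nameBinding) are all assumptions made for illustration.

```r
# Hypothetical helper: `lowestScorer` and the column names below are
# illustrative assumptions, following the generic labels in the question
lowestScorer <- function(data, group, measure) {
  # Keep only the rows belonging to the requested group;
  # `[[` evaluates `group`, unlike `$group`
  groupData <- split(data, data$Groupcolumn)[[group]]
  if (is.null(groupData)) stop("unknown group: ", group)

  # Look the measure column up by the value of `measure`
  scores <- groupData[[measure]]
  if (is.null(scores)) stop("unknown measure: ", measure)

  # Return every label tied for the minimum score, ignoring missing values
  isMin <- !is.na(scores) & scores == min(scores, na.rm = TRUE)
  as.character(groupData$Labelcolumn[isMin])
}

# Made-up data for illustration
toy <- data.frame(
  Labelcolumn = c("A", "B", "C", "D"),
  Groupcolumn = c("g1", "g1", "g2", "g2"),
  m1 = c(3, 1, 5, 2),
  m2 = c(9, 8, 7, 6)
)

lowestScorer(toy, "g1", "m1")  # "B"
lowestScorer(toy, "g2", "m2")  # "D"
```

The explicit NULL checks are a design choice: they turn a mistyped group or measure into an immediate error instead of the less obvious warning from min().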