I am taking the Coursera R Programming course and trying to write the function for the third assignment of week 2. I know a lot of people have already posted their answers to this assignment online, but I tried writing my own and can't figure out why I am getting an error. It would be great if someone could offer their opinion.
This function is supposed to "take a directory of data files and a threshold for complete cases and calculates the correlation between sulfate and nitrate (two different columns) for monitor locations (each monitor is coded in a separate csv file) where the number of completely observed cases (on all variables) is greater than the threshold. The function should return a vector of correlations for the monitors that meet the threshold requirement. If no monitors meet the threshold requirement, then the function should return a numeric vector of length 0."
I wrote the function as:
corr <- function(directory, threshold = 0) {
  directory <- getwd()
  fileList <- list.files(pattern = ".csv", full.names = TRUE)
  for (i in 1:332) {
    file <- read.csv(fileList[i])
    sulf <- file[["sulfate"]]
    nitr <- file[["nitrate"]]
    if (sum(complete.cases(nitr & sulf)) < threshold) {
      return(numeric())
    } else {
      return(cor(nitr, sulf))
    }
  }
}
When I try to source this, it says:
Error in source("corr.R") : corr.R:10:8: unexpected symbol
9:
10: if sum
^
So, I can't figure out what the problem is with using the sum function. Thanks a lot!
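
For reference, here is a minimal sketch of how the quoted specification could be implemented. It is one possible approach, not the course's official solution. Two things are worth noting. First, the parser message points at `if sum`, which is what R reports when the condition of an `if` is not wrapped in parentheses; the code as pasted does have them, so the copy of corr.R on disk may differ from what is shown here (that is a guess from the error text). Second, a `return()` inside the loop exits after the first file, so the sketch collects results in a vector and returns once at the end. The column names "sulfate" and "nitrate" come from the description above; the variable names are illustrative.

```r
# Sketch only: assumes every CSV in `directory` has "sulfate" and "nitrate" columns.
corr <- function(directory, threshold = 0) {
  # Use the directory argument instead of overwriting it with getwd();
  # "\\.csv$" matches only names that end in .csv
  file_list <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  correlations <- numeric(0)

  for (path in file_list) {
    monitor <- read.csv(path)
    # TRUE for rows with no missing value in any column
    ok <- complete.cases(monitor)
    # Keep this monitor only if it has more complete rows than the threshold
    if (sum(ok) > threshold) {
      correlations <- c(correlations,
                        cor(monitor$sulfate[ok], monitor$nitrate[ok]))
    }
  }

  # Still numeric(0) if no monitor met the threshold
  correlations
}
```

It could then be called as `corr("specdata", 150)`, where "specdata" is a placeholder for the folder that holds the monitor files.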