I am relatively new to R and I am attempting to write my first multi-step function. Essentially, I want to create a function that takes a directory and searches within that directory to find a certain column (in this case, pollutant). Then find the mean value of that column and remove the NAs. This is what I have so far:
pollutantmean <- function(directory , pollutant , min_id = 1, max_id = 332) {
setwd(directory)
dirdata <- list.files(path=getwd() , pattern='*.csv' , full.names = TRUE) %>% lapply(read_csv) %>% bind_rows
specdata <- dirdata %>% filter(between(ID,min_id,max_id))
polspecdata <- specdata %>% select(pollutant)
polspecdatamean <- polspecdata %>% summarize(mean_pollutant=mean(pollutant,na.rm=TRUE))
}
I feel that I am so close, but the result is an error: Warning message:In mean.default(pollutant, na.rm = TRUE) : argument is not numeric or logical: returning NA. I believe the error is due to the column class being col_double. This may be due to dirdata is created from multiple csv files. Any help would be greatly appreciated. Thank you!
This is the data: zipfile_data