
I have a bunch of .csv files that I want to read into a list and then plot. I've tried the code below, but I get an error when trying to `cbind`. Below is the `dput` output from 2 example files. Each file holds weather data from a separate station. Ideally I would plot the `prcp` column from each file in one plot window. I don't have much experience working with data in a list.

    file1 <- structure(list(mxtmp = c(18.974, 20.767, 21.326, 19.669, 18.609, 
    21.322), mntmp = c(4.026, 5.935, 8.671, 6.785, 3.493, 6.647), 
        prcp = c(0.009, 0.046, 0.193, 0.345, 0.113, 0.187)), .Names = c("mxtmp", 
    "mntmp", "prcp"), row.names = c(NA, 6L), class = "data.frame")

    file2 <- structure(list(mxtmp = c(18.974, 20.767, 21.326, 19.669, 18.609, 
    21.322), mntmp = c(4.026, 5.935, 8.671, 6.785, 3.493, 6.647), 
        prcp = c(0.009, 0.046, 0.193, 0.345, 0.113, 0.187)), .Names = c("mxtmp", 
    "mntmp", "prcp"), row.names = c(NA, 6L), class = "data.frame")

I read these files from a directory into a list:

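library(readr)

# List the .csv files in the working directory and read each one into a list element
myFiles <- list.files(full.names = FALSE, pattern = "\\.csv$")
my.data <- lapply(myFiles, read_csv)
my.data
# Name each element after its file, dropping the .csv extension
names(my.data) <- gsub("\\.csv$", "", myFiles)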

I get an error on the line below:

 my.data <- lapply(my.data, function(x) cbind(x = seq_along(x), y = x))

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 3, 34333

    list.names <- names(my.data)
    lns <- sapply(my.data, nrow)

    my.data <- as.data.frame(do.call("cbind", my.data))
    my.data$group <- rep(list.names, lns)

My plot code:

library(ggplot2)

ggplot(my.data, aes(x = x, y = y, colour = group)) +
  theme_bw() +
  geom_line(linetype = "dotted")
derelict
  • Do you mean you want to `rbind` them--that is, stack a bunch of data frames, each of which has the same columns? That would be `my.data = do.call(rbind, my.data)`. – eipi10 Dec 09 '15 at 18:40
  • FYI, it would be nice if your example (1) didn't have errors in the `dput` output for `file1`: `prpcp = (...)` is not a proper vector, and the names for your precip variable are not the same (i.e., `prpcp` vs `prcp`); (2) did have a call to `library(readr)`; and (3) had a definition for `test` in the `ggplot()` call at the end. – Mark S Dec 09 '15 at 19:00

1 Answer


If you don't need to keep the data frames around for anything else, you can read and plot all at once. Note that the column names in your plot code (`x`, `y`, `group`) don't match the column names in your data frames (`mxtmp`, `mntmp`, `prcp`), so treat this as a general approach that you'll need to tailor to your actual data. The code below reads each file, creates a plot from it, and returns a list containing the plots:

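library(readr)
library(ggplot2)

# Read each file and build one plot per file; the result is a list of ggplot objects.
# Replace x, y, and group with the actual column names in your data.
plot.list = lapply(myFiles, function(file) {
  df = read_csv(file)
  ggplot(df, aes(x = x, y = y, colour = group)) +
    theme_bw() +
    geom_line(linetype = "dotted")
})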

# Lay out all the plots together
library(gridExtra)
do.call(grid.arrange, plot.list) 
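
If you would rather have `prcp` from every file in a single plot (as the question asks), one option is to stack the data frames with `rbind`, as suggested in the comments, after tagging each one with its source. The following is only a rough sketch under some assumptions: the two data frames are built from the question's `dput` values rather than read from disk, the list names `station1` and `station2` are made up, and the `obs` column is just a row index that I'm assuming stands in for time order. Adjust these to your real files.

```r
library(ggplot2)

# Stand-ins for two files read with read_csv(); values copied from the question's dput output
file1 <- data.frame(
  mxtmp = c(18.974, 20.767, 21.326, 19.669, 18.609, 21.322),
  mntmp = c(4.026, 5.935, 8.671, 6.785, 3.493, 6.647),
  prcp  = c(0.009, 0.046, 0.193, 0.345, 0.113, 0.187)
)
file2 <- file1
my.data <- list(station1 = file1, station2 = file2)  # hypothetical station names

# Add a row index and a station label to each data frame
my.data <- lapply(names(my.data), function(nm) {
  df <- my.data[[nm]]
  df$obs <- seq_len(nrow(df))  # counts rows; seq_along(df) would count columns instead
  df$station <- nm
  df
})

# Stack the data frames row-wise (rbind, not cbind) into one long data frame
all.data <- do.call(rbind, my.data)

# One plot window, one line per station
p <- ggplot(all.data, aes(x = obs, y = prcp, colour = station)) +
  theme_bw() +
  geom_line(linetype = "dotted")
print(p)
```

Because the two example data frames are identical, the two lines will sit on top of each other here; with real station files they should separate. The `seq_len(nrow(df))` call is also the likely fix for the error in the question: `seq_along()` on a data frame returns one index per column (3 here), which cannot be bound to a column with thousands of rows.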
eipi10