
I have 35 files named PXX physiology.txt where XX is 1 to 35. E.g.

head(P1)
  Time SkinTemp HeartRate RespirationRate
1    0   27.412        70              10
2    0   25.608        70              10
3    4   25.609        70              10
4    5   25.619        70              15
5    8   25.629        76              14
6    9   25.659        78              14

To import one file I normally do:

P1 <- read.table("P1 physiology.txt", header = FALSE, skip=14, nrow = 
                   length(readLines("P1 physiology.txt")) - 16)
colnames(P1)<- c("Time","SkinTemp","HeartRate","RespirationRate")

I'd like to import all 35 into a single object in R in melted (long) format, i.e. the data from all 35 files stacked one on top of the next, with a column holding a label for each chunk. The reason I'd like it melted is so I can plot by label using ggplot2 or base graphics.

Edit: Code so far:

I've found this code elsewhere and have tried to alter it, but unsuccessfully:

z <- list.files(pattern = ".*\\.txt$")
z <- lapply(seq_along(z), function(x) {
  chars <- strsplit(z[x], "")
  # note: chars[[1]][1] is only the first character of the filename ("P" for every file)
  cbind(data.frame(read.table(z[x])), participant = chars[[1]][1])
})
z <- do.call(rbind, z)
    Put your read code into an `lapply`: `myFilesList <- lapply(list.files(), function(i) {temp <- your first line here, use i as filename; setNames(temp, c("Time","SkinTemp","HeartRate","RespirationRate"))})` That will be pretty close. This will give you a list of data.frames. Then use `do.call(rbind, myList)` to put them together. – lmo Jan 10 '17 at 16:38
  • 1
    My current way of doing it (using the data.table package, which includes `melt`): http://stackoverflow.com/documentation/data.table/4456/using-list-columns-to-store-data/15561/reading-in-many-related-files#t=201701101645225455209 – Frank Jan 10 '17 at 16:46
  • @Imo myFilesList <- lapply(list.files("*.txt"), function(i) {temp <- read.table(i, header = FALSE,skip=14, nrow = length(readLines(i)) - 16); setNames(temp, c("Time","SkinTemp","HeartRate","RespirationRate"))}) Is this what you mean? – HCAI Jan 10 '17 at 17:34
  • @Frank Thank you, I took patt="txt$" from your page and put it into myFilesList <- lapply(list.files(patt="txt$"), function(i) {temp <- read.table(i, header = FALSE,skip=14, nrow = length(readLines(i)) - 16); setNames(temp, c("Time","SkinTemp","HeartRate","RespirationRate"))}) Should a label column automatically be created? – HCAI Jan 10 '17 at 17:40
  • With `rbindlist` or `bind_rows`, you'll need to use the optional arguments to those functions to get an id/label column (`idcol` for `rbindlist` and `.id` for `bind_rows`, I think). – Frank Jan 10 '17 at 18:00
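Frank's `rbindlist` suggestion in the comments above can be sketched as follows. The two tiny sample files written to a temp directory are invented so the snippet runs standalone; the key point is that `idcol = "id"` creates the label column from the list's names:

```r
library(data.table)

# write two tiny mock participant files so the sketch is self-contained
dir <- file.path(tempdir(), "phys"); dir.create(dir, showWarnings = FALSE)
writeLines(c("0 27.4 70 10", "4 25.6 70 10"), file.path(dir, "P1 physiology.txt"))
writeLines(c("0 30.1 65 12", "5 29.8 66 12"), file.path(dir, "P2 physiology.txt"))

files <- list.files(dir, pattern = "physiology\\.txt$", full.names = TRUE)

# read each file; name the list elements so the id column is meaningful
dts <- lapply(files, fread,
              col.names = c("Time", "SkinTemp", "HeartRate", "RespirationRate"))
names(dts) <- basename(files)

# idcol = "id" adds a label column holding each chunk's list name
melted <- rbindlist(dts, idcol = "id")
head(melted)
```

With an unnamed list, `idcol` falls back to list positions (1, 2, ...), so naming the list with the file names is what makes the label useful for plotting.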

1 Answer

# 1. this returns the full paths of all your desired files
# replace .csv with .txt or whichever extension is yours
l = list.files(path = "path_where_all files_present", pattern = "\\.csv$", full.names = TRUE)

# 2. iterate over each path and read the data (you could use read.table
# instead of read.csv), then add the 'id' label column via `.id`
# you can add `setNames()` after reading
library(dplyr)
l2 = bind_rows(lapply(l, read.csv), .id = "id")
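A sketch of this pattern combined with the question's header handling (`skip = 14` plus dropping the last two junk lines). The mock files written below, 14 junk header lines, then data, then two junk trailing lines, only exist so the snippet is runnable, and `read_one` is a hypothetical helper name:

```r
library(dplyr)

# mock up two participant files in the question's layout
dir <- file.path(tempdir(), "phys2"); dir.create(dir, showWarnings = FALSE)
make_file <- function(path, rows) {
  writeLines(c(rep("header junk", 14), rows, "junk", "junk"), path)
}
make_file(file.path(dir, "P1 physiology.txt"), c("0 27.4 70 10", "4 25.6 70 10"))
make_file(file.path(dir, "P2 physiology.txt"), "0 30.1 65 12")

l <- list.files(path = dir, pattern = "physiology\\.txt$", full.names = TRUE)

# the asker's read.table call, applied to every path in l
read_one <- function(f) {
  d <- read.table(f, header = FALSE, skip = 14,
                  nrow = length(readLines(f)) - 16)
  setNames(d, c("Time", "SkinTemp", "HeartRate", "RespirationRate"))
}

# name the chunks so .id carries the file names rather than list positions
chunks <- setNames(lapply(l, read_one), basename(l))
l2 <- bind_rows(chunks, .id = "id")
```

`nrow = length(readLines(f)) - 16` leaves exactly the last two lines of each file unread, which matches the original single-file call.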
  • I'm looking for .txt files that have funny headers, so I need to skip at least 14 lines at the top and then the last line at the bottom (because they have some odd character at the end). I used the following: read.table("P1 physiology.txt", header = FALSE, skip=14, nrow = length(readLines("P1 physiology.txt")) - 16). How do I substitute "P1 physiology.txt" for the values that your list.files l finds? – HCAI Jan 10 '17 at 17:14
  • @HCAI `list.files()` returns the filename along with the path. I didn't understand the substitute thing that you are asking. Could you elaborate? Sorry for the trouble – joel.wilson Jan 10 '17 at 17:51
  • Hi Joel. My files have 14 lines of headers so I need to skip those. Then there's always a weird character at the end that also needs skipping. How do I incorporate that into your code? – HCAI Jan 10 '17 at 17:56
  • Also they are .txt files (in case this is a problem) – HCAI Jan 10 '17 at 17:57
  • okay! so `lapply(l, function(x) read.table(x, skip = 14, nrow = length(readLines(x)) - 16), ...)` should work as you suggested @HCAI – joel.wilson Jan 10 '17 at 18:08
  • do you understand how `lapply` works? try printing `lapply(l, function(x) x)` in your console. then keep adding a line of reading code and check – joel.wilson Jan 10 '17 at 18:09
  • Thank you. I've got as far as lapply and that works nicely. But then doing bind_rows() says: Error in bind_rows_(x, .id) : corrupt data frame – HCAI Jan 10 '17 at 18:14
  • if you save just the output of `lapply()` into an object and check if it read well? each dataframe? – joel.wilson Jan 10 '17 at 18:17
  • This must be the problem. How do I look at each data.frame individually? if i do dput(df) it shows me everything. – HCAI Jan 10 '17 at 18:22
  • so `l2 = lapply(l, read.csv)` should return a list of data.frames each with the individual data.frames – joel.wilson Jan 10 '17 at 18:29
  • Great thank you very much! I looked at my files individually and found that some had extra columns... so I've moved them into a different folder. Ideally I'd like to include the files but exclude the extra columns these ones have. Is that very difficult do you think? – HCAI Jan 10 '17 at 18:36
  • yes it's possible. are the unwanted columns always at the end after the columns you want? then just subset them right – joel.wilson Jan 10 '17 at 18:37
  • Ahh good idea! Having said that, the new column(s) have appeared in column 2 and pushed the old ones forward one place. I've read about subsetting today, but I think I would have to import the new files into a separate object, alter it and then append it to "l2" afterwards. Is that right? – HCAI Jan 10 '17 at 18:42
  • Thank you very much for your help! :-) – HCAI Jan 10 '17 at 18:53
  • glad to help you out @HCAI – joel.wilson Jan 10 '17 at 19:06
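Following the last few comments, one way to keep the wider files in the pipeline instead of moving them to another folder is to drop the surplus columns inside the reading function before renaming. That the extra column(s) sit at position 2 is taken from the comments above; the mock files are invented so the sketch runs standalone:

```r
dir <- file.path(tempdir(), "phys3"); dir.create(dir, showWarnings = FALSE)
# a normal 4-column file and a "wide" file with an extra column at position 2
writeLines(c(rep("junk", 14), "0 27.4 70 10", "junk", "junk"),
           file.path(dir, "P1 physiology.txt"))
writeLines(c(rep("junk", 14), "0 99 27.4 70 10", "junk", "junk"),
           file.path(dir, "P2 physiology.txt"))

read_one <- function(f) {
  d <- read.table(f, header = FALSE, skip = 14,
                  nrow = length(readLines(f)) - 16)
  # drop surplus columns; per the comments, any extras sit at position 2
  if (ncol(d) > 4) d <- d[, -(2:(ncol(d) - 3)), drop = FALSE]
  setNames(d, c("Time", "SkinTemp", "HeartRate", "RespirationRate"))
}

l <- list.files(dir, pattern = "physiology\\.txt$", full.names = TRUE)
# mirror the question's cbind approach to attach the label column
chunks <- lapply(l, function(f) cbind(id = basename(f), read_one(f)))
l2 <- do.call(rbind, chunks)
```

Because every chunk is normalised to the same four data columns before binding, there is no need for a separate object for the wider files.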