
I need to make a series of folders labelled ppt-01 to ppt-48. I then need to move all of the corresponding participant files into the new folders.

Currently all files (10 per ppt) are in one folder, and somewhere in each file name the ppt number is included, e.g. XX01_040_xxx6_9, where the 040 is the ppt number.

I first tried to create a list of folder names using a for loop, but I could not figure out how to save the output:

setwd("P:/data")

for (i in 1:48){
  print(paste0("ppt-0", i))
}

So I used lapply:

x = (1:48)
fun <- function(x){
  paste0("ppt-0", x)
}

output <- lapply(x, fun)
output

path <- "data"

dir.create(output)

I then intend to use list.files and a for loop, lapply, or maybe an if statement to move the files into their corresponding folders, but I'm not quite sure how to approach this.

The code above does not work and I'm not sure what else to try. Any help would be much appreciated.

ubik_01

1 Answer


You can use this to create the folders:

for (i in 1:48){
    # pad single-digit numbers with a leading zero: ppt-01 ... ppt-09
    if (i < 10) {
        dir.create(paste0("ppt-0", i))
    } else {
        dir.create(paste0("ppt-", i))
    }
}
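
As an alternative sketch, the zero-padding can be left to `sprintf()` instead of an `if`/`else`, which also yields a character vector of folder names that can be reused later. Note that `dir.create()` expects a single path, which is likely why passing the whole `lapply` output to it failed. Here `base_dir` is a hypothetical name, and a temporary directory is assumed so the example is safe to run:

```r
# Hypothetical base directory; replace with the real data folder (e.g. "P:/data").
base_dir <- file.path(tempdir(), "ppt-demo")
dir.create(base_dir, showWarnings = FALSE)

# "%02d" pads to two digits: 1 -> "01", 48 -> "48".
folders <- sprintf("ppt-%02d", 1:48)

# dir.create() takes one path at a time, so loop over the vector.
for (d in file.path(base_dir, folders)) {
  dir.create(d, showWarnings = FALSE)
}
```
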

Then, to move the files:

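for (f in list.files("path to your files")){
    # pull the participant number from between the underscores, e.g. "_040_" -> 40
    n <- as.integer(gsub('_', '', stringr::str_extract(f, '_\\d+_')))
    # zero-pad to two digits so that 4 -> "ppt-04" and 40 -> "ppt-40"
    to <- sprintf("ppt-%02d", n)
    # file.copy() leaves the original in place; file.path() adds the "/" separator
    file.copy(from = file.path('path to your files', f),
              to   = file.path('path to your folder data', to))
}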
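
If the files should be moved rather than copied, `file.rename()` is one option. The following is a minimal base-R sketch, not the code from the answer: it assumes the participant number is the first run of digits enclosed by underscores, and it runs in a temporary directory with made-up file names (`src` and `demo_files` are hypothetical):

```r
# Self-contained demo in a temporary directory; all names here are made up.
src <- file.path(tempdir(), "move-demo")
dir.create(src, showWarnings = FALSE)
demo_files <- c("XX01_004_xxx6_9.txt", "XX01_040_xxx6_9.txt")
file.create(file.path(src, demo_files))

# Keep plain files only, so existing ppt-* folders are not processed.
files <- list.files(src)
files <- files[!dir.exists(file.path(src, files))]

for (f in files) {
  # First "_digits_" run in the name, e.g. "_040_".
  m <- regmatches(f, regexpr("_[0-9]+_", f))
  if (length(m) == 0) next            # skip names without a participant number
  n <- as.integer(gsub("_", "", m))   # "_040_" -> 40
  dest <- file.path(src, sprintf("ppt-%02d", n))
  dir.create(dest, showWarnings = FALSE)
  # file.rename() moves the file and returns FALSE if the move fails.
  file.rename(file.path(src, f), file.path(dest, f))
}
```

Trying this on a copy of the data first is a reasonable precaution, since a move cannot be undone by simply rerunning the script.
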
Mohamed Desouky
  • Thanks a lot for your suggestion! Could you please explain what the line `n <- sub('^0', '', gsub('\\D+', '', f))` is doing and how it works? – ubik_01 Mar 31 '23 at 09:22
  • This line gets the number from the file name `XXXX_040_xxxx` and removes the leading zero, so `n` becomes `40` and the file can be moved to the correct folder. – Mohamed Desouky Mar 31 '23 at 10:52
  • Thanks for your reply. How does this relate to the parts of the code `n <- sub('^0', '', gsub('\\D+', '', f))`? I don't understand the `sub('^0', '', ...)` or the `'\\D+', ''` part. I know overall what it's doing, but not how. – ubik_01 Mar 31 '23 at 11:27
  • First, `gsub('\\D+', '', f)` uses a regular expression in which `\\D+` matches any run of non-digit characters; the second argument of `gsub` (the empty string) replaces each match, which removes it. The result, the number `040` in this case, is then passed to `sub`, where the regular expression `^0` matches a leading zero and removes it. – Mohamed Desouky Mar 31 '23 at 11:31
  • Thank you for explaining. I have just updated the question: the file names contain more numeric values (e.g. 'xyz02_040_cde1_2'), with the ppt number in the middle. Will this code not work because there are several numeric values in the name? – ubik_01 Mar 31 '23 at 11:55
  • Try the updated code, which extracts the number between the underscores (e.g. `_040_`) in the middle of the file name. – Mohamed Desouky Mar 31 '23 at 12:13
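
To make the difference discussed in the comments concrete, here is a small worked example on the file name from the updated question. It is a sketch in base R, using `regmatches()`/`regexpr()` as a stand-in for `stringr::str_extract()`; the variable names are illustrative only:

```r
f <- "xyz02_040_cde1_2"

# Earlier approach: delete every non-digit (\\D+), then drop one leading zero.
all_digits <- gsub("\\D+", "", f)      # "0204012"
old_n <- sub("^0", "", all_digits)     # "204012", not the participant number

# Updated approach: keep only the first run of digits enclosed by underscores.
m <- regmatches(f, regexpr("_\\d+_", f))  # "_040_"
new_n <- as.integer(gsub("_", "", m))     # 40
```

The earlier approach only works when the participant number is the sole run of digits in the name, as in `XXXX_040_xxxx`.
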