2

I have different data sets that all have several thousand rows and that all have the same structure with 5 columns. A subset of my data looks like this:

INr         FNr         TM          I1              I2
1           1           B2          1598,45         0,14
2           1           B2          930,40          0,11
3           1           B2          107,86          0,04
4           1           B2          881,09          0,11
7           1           B3          2201,98         0,15
8           1           B3          161,30          0,04
9           1           B3          1208,14         0,17
4           2           B3          831,75          0,12
5           2           B3          1027,41         0,14
7           2           B3          2052,16         0,15
8           2           B3          159,63          0,05
9           2           B4          1111,49         0,16
10          2           B4          1312,15         0,12
1           3           B4          863,79          0,10
2           3           B4          104,06          0,04
3           3           B4          816,02          0,11
4           3           B4          1053,02         0,14
5           3           B5          132,32          0,03
6           3           B5          2059,03         0,14
7           3           B5          153,49          0,04
8           3           B5          1118,69         0,15
9           3           B5          1632,66         0,18
10          3           B5          1302,15         0,12

I now have to filter this data frame for all the different values of TM whicha are always a combination of a letter and a number from 1 to 12 (e.g. A1, B2, B3, C4, D6, F8,...). However, not all the letters and not all the numbers are always present as also shown in the table above, where A1 to A12 and B1 are missing.

To filter and save the data, I have already coded a short script that works perfectly fine, but in this way, I have to adapt the parameter for the sorting and the resulting output file every time:

library(tidyverse)
df %>%
  filter(TM == "B2") %>%
  write.csv(file = "C:/Users/Desktop/B2.csv", row.names = FALSE)

I am sure that there would be a possibility to solve this problem with a for loop but as I am not yet that experienced with the use of loops, I would like to ask you, if you could help me with this issue.

Thank you very much!

Hotshot
  • 143
  • 1
  • 14
  • 4
    Possible duplicate of [Split dataframe into multiple output files](https://stackoverflow.com/questions/10002021/split-dataframe-into-multiple-output-files) – Niek Jan 03 '19 at 13:04

1 Answers1

2

You should split your data.frame in a list, each element of the list then is printed:

# here you split by the variable you want:
listed <- split(data, data$TM)
 listed
# here the partial result
$B2
  INr FNr TM      I1   I2
1   1   1 B2 1598,45 0,14
2   2   1 B2  930,40 0,11
3   3   1 B2  107,86 0,04
4   4   1 B2  881,09 0,11

$B3
   INr FNr TM      I1   I2
5    7   1 B3 2201,98 0,15
6    8   1 B3  161,30 0,04
7    9   1 B3 1208,14 0,17
8    4   2 B3  831,75 0,12
9    5   2 B3 1027,41 0,14
10   7   2 B3 2052,16 0,15
11   8   2 B3  159,63 0,05
...

Then R is a vectorized language, so you do not need in this case a loop, but you can use one of the apply functions, to apply to each of the list a function, in this case, write.table:

sapply(names(listed),  # here the object to sapply the function 
       # here the function: you have to create the path with paste0
       # adding the x to have the correct names
       function (x) write.table(listed[[x]], file=paste0("C:\\Users\\Desktop\\",x, ".csv")))

With data:

data <- read.table(text = "INr         FNr         TM          I1              I2
1           1           B2          1598,45         0,14
                   2           1           B2          930,40          0,11
                   3           1           B2          107,86          0,04
                   4           1           B2          881,09          0,11
                   7           1           B3          2201,98         0,15
                   8           1           B3          161,30          0,04
                   9           1           B3          1208,14         0,17
                   4           2           B3          831,75          0,12
                   5           2           B3          1027,41         0,14
                   7           2           B3          2052,16         0,15
                   8           2           B3          159,63          0,05
                   9           2           B4          1111,49         0,16
                   10          2           B4          1312,15         0,12
                   1           3           B4          863,79          0,10
                   2           3           B4          104,06          0,04
                   3           3           B4          816,02          0,11
                   4           3           B4          1053,02         0,14
                   5           3           B5          132,32          0,03
                   6           3           B5          2059,03         0,14
                   7           3           B5          153,49          0,04
                   8           3           B5          1118,69         0,15
                   9           3           B5          1632,66         0,18
                   10          3           B5          1302,15         0,12", header = T)
s__
  • 9,270
  • 3
  • 27
  • 45