
I just started using R. I merged 54 files (54 subjects), each with 7 variables (data from a behavioral experiment), into one R data frame.

I now have the variables trial (1 to 210), reaction time, choice, and others in one table, running from subject 1 to subject 54.

My problem is that I do not have a subject variable (subject ID).

Is there an easy way to add a subject variable to the data frame that already contains all subjects (subject 1 for the first 210 rows (trials), subject 2 for the next 210 rows, and so on)?

My plan is to use something like a loop that adds a variable with the value 1 to rows 1 to 210, then the value 2 to the next 210 rows, and so on until subject 54 (row 10920).

Thank you very much for your help, and best wishes.

Ben Jonathan
  • I suspect that you didn't `merge`, but `rbind`. There are easy and efficient solutions for creating this ID column during row binding of the data.frames. But you don't share any code ... – Roland Jun 02 '16 at 13:12
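
The approach Roland hints at could look roughly like this. It is a minimal base-R sketch, assuming the per-subject files have already been read into a list of data frames; the names `subject_list`, `with_id`, and `all_data` are hypothetical, and the toy data uses 3 subjects with 4 trials each instead of 54 with 210.

```r
# Toy stand-in for the per-subject files: 3 subjects with 4 trials each.
# In the real case each list element would come from reading one file.
set.seed(1)
subject_list <- lapply(1:3, function(i) {
  data.frame(trial  = 1:4,
             rt     = runif(4),
             choice = sample(0:1, 4, replace = TRUE))
})

# Attach the subject ID to each data frame *before* stacking them,
# so the ID does not depend on counting rows afterwards.
with_id <- Map(function(d, id) cbind(subject = id, d),
               subject_list, seq_along(subject_list))

# Stack all subjects into one data frame.
all_data <- do.call(rbind, with_id)

table(all_data$subject)  # 4 rows for each of the 3 subjects
```

Adding the ID at this stage also keeps working when subjects have different numbers of trials.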

2 Answers

You can create a vector of the IDs you need like this:

x <- rep(1:54, each = 210)

and then bind it to your data frame with cbind(x, your_data_frame).
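
To see what `each` does, here is a small sketch on toy data (3 subjects with 4 trials each instead of 54 with 210). The data frame `df` and the variables `n_subjects` and `n_trials` are made up for illustration; the only assumption is that the rows are stacked subject by subject.

```r
# Toy data: 3 subjects x 4 trials, stacked in subject order.
n_subjects <- 3
n_trials   <- 4
df <- data.frame(
  trial = rep(seq_len(n_trials), times = n_subjects),
  rt    = seq(0.30, by = 0.01, length.out = n_subjects * n_trials)
)

# each = n_trials repeats every ID n_trials times before moving on.
df$subject <- rep(seq_len(n_subjects), each = n_trials)

# Sanity check: the ID vector must cover every row exactly once.
stopifnot(nrow(df) == n_subjects * n_trials)

df$subject  # 1 1 1 1 2 2 2 2 3 3 3 3
```

Assigning with `df$subject <- ...` fails with an error if the vector length does not match the number of rows, which is a useful safeguard.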

Tomas H
  • Perhaps better: `df$id <- rep(1:54,each=210)` – rafa.pereira Jun 02 '16 at 13:19
  • Thank you! That was way easier than I thought! If you have the time: I now want to add another variable (1 or 0) in another column for the subjects, for example a 1 for subject 20 and a 0 for subject 21. That would help very much. I have something like an if function in mind: if subject == 20, 21, or 40, add the value 1? – Ben Jonathan Jun 02 '16 at 13:20
  • I don't know exactly what you mean. E.g. for files 1, 3, 5, 7, ... add 1 and for files 2, 4, 6, 8, ... add 0? Or do you basically want an empty column where you decide when to add 1 or 0 for some subjects? – Tomas H Jun 02 '16 at 13:25
  • Well, I now have a big data frame with all of the variables and the subject ID, and I want to add another variable (value 1 or 0 for group membership) for specific subject IDs. Right now the first 210 rows are subject 1, then comes subject 2, and so on. I want to add the group membership variable for half of the 54 subjects, and yes, I already know the group membership of each subject. So I need to add a 1 for, for example, subjects 1, 3, 5, 10, 11, 15, ... and a 0 for all of the others. So what you said first! – Ben Jonathan Jun 02 '16 at 13:52
  • If you create the vector `y<-rep(1:0,times=27,each=210)` you get 210 times 1, then 210 times 0, then 210 times 1 again, and so on; the pattern repeats 27 times because 54/2 = 27. You can then cbind it with your data frame. Another option, which you would need to adjust to your needs: say you have the data frame `df<-data.frame(x, empty=vector(length=length(x)))`, where x is the vector I created in the answer; then empty is a vector of the same length, and with `df[df$x %in% c(1,3,5,10,15,19),2]<-1` you can choose inside c() which ID values get a 1. – Tomas H Jun 02 '16 at 14:12
  • Hey Tomas, thank you very much. I was away from work due to holidays. I understand your first answer, but it doesn't work here because the subjects are not in the right order to assign 0 and 1 alternately. About the second option, I'm not sure how to do it. For Subjects <- c(1,2,7,11,13,14,16,,33...) I want to assign a 1 for each of the 210 trials, and for the other subjects c(3,4,5,6,8,9,...) I want to assign a 0. So it would work if I could code something like dataframe$drugcondition <- assign 0 to subject c(1,2,7,11... each 210 times; assign 1 to subjects drugPlacebo c(2,3,4...) each 210 times. – Ben Jonathan Jun 22 '16 at 14:48
  • I think that's what you meant by the second option. I'm still trying, but it doesn't work... – Ben Jonathan Jun 22 '16 at 14:48
  • Okay, I managed to do it. Thank you! But I don't understand why it works... – Ben Jonathan Jun 22 '16 at 15:43
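
As for why the `%in%` approach works, a small sketch may help. It uses toy data (6 subjects with 2 trials each) and a hypothetical vector `drug_subjects`; the column name `drugcondition` is taken from the comment above, and the actual group lists are not known here.

```r
# Toy data: 6 subjects with 2 trials each.
df <- data.frame(subject = rep(1:6, each = 2))

# Hypothetical list of subjects that should get a 1.
drug_subjects <- c(1, 2, 5)

# %in% checks every row's subject ID against the list and returns
# TRUE or FALSE per row, regardless of the order of the subjects.
is_drug <- df$subject %in% drug_subjects

# as.integer() converts TRUE/FALSE to 1/0.
df$drugcondition <- as.integer(is_drug)

df$drugcondition  # 1 1 1 1 0 0 0 0 1 1 0 0
```

The indexing variant from the comment, `df[df$x %in% c(...), 2] <- 1`, relies on the same logical vector: it selects only the rows where the test is TRUE and overwrites column 2 in those rows.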

You can use the paste() and rep() functions to add a new column named Subject to your data frame (here named your_data):

your_data$Subject <- paste("Subject_", rep(1:54, each = 210), sep = "")

This adds the Subject variable as the last column. If you want it as the first column instead, do it in two steps and use cbind() in the second:

Subject <- paste("Subject_", rep(1:54, each = 210), sep = "")
your_data <- cbind(Subject, your_data)
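
A possible variation, shown here on toy data (3 subjects with 2 trials each): `paste0()` is shorthand for `paste(..., sep = "")`, and an existing column can be moved to the front by reordering the column names. The contents of `your_data` below are made up for illustration.

```r
# Toy data: 3 subjects with 2 trials each.
your_data <- data.frame(trial = rep(1:2, times = 3),
                        rt    = c(0.4, 0.5, 0.3, 0.6, 0.5, 0.4))

# paste0() concatenates without a separator.
your_data$Subject <- paste0("Subject_", rep(1:3, each = 2))

# Move Subject to the front by listing it first, then all other columns.
other_cols <- setdiff(names(your_data), "Subject")
your_data  <- your_data[, c("Subject", other_cols)]

names(your_data)  # "Subject" "trial" "rt"
```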
Sowmya S. Manian
  • Thank you very much for your help so far. I now have a subject variable and a drugcondition variable. What I want to do now is add the real subject ID from the questionnaire, because subjects 20 and 30 are missing from the data. That means I have 52 subjects numbered 1 to 52, but their real IDs run from 1 to 54 with 20 and 30 missing. My plan is to add another variable: dataset$realsubjectID <- rep (1:19,each=210; 21:29, each=210, 31:54, each = 210), but that does not work. Is there a function that lets me add subject IDs from 1 to 54 while skipping 20 and 30? Thanks. – Ben Jonathan Jun 22 '16 at 16:11
  • `dataset$realSubjectID <- paste("Subject_",rep(c(1:19,"ID_Missing",21:29,"ID_Missing",31:54),each=210),sep="")` And if the previous answer worked for you, please accept it. Thank you. – Sowmya S. Manian Jun 22 '16 at 16:20
  • Thank you! Because subject 20 was completely missing, I had to leave out the "ID_Missing" entries; otherwise there would have been more rows than there are in my data. I just deleted the two "ID_Missing" entries and it worked out well! – Ben Jonathan Jun 28 '16 at 13:14
  • Oh, I thought you needed them too, to align it with your original data. If that matches your original data, you can skip them entirely. Great!! – Sowmya S. Manian Jun 28 '16 at 15:19
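
The skipped IDs can also be built without typing the ranges by hand. This is a minimal sketch, assuming 210 trials per subject and that exactly subjects 20 and 30 are absent, as described in the comments; `real_ids` and `n_trials` are names chosen for illustration.

```r
n_trials <- 210

# All IDs from 1 to 54 except the two missing subjects.
# Equivalent to c(1:19, 21:29, 31:54).
real_ids <- setdiff(1:54, c(20, 30))

# Repeat each remaining ID once per trial.
realsubjectID <- rep(real_ids, each = n_trials)

length(real_ids)       # 52
length(realsubjectID)  # 10920, i.e. 52 * 210

# With a data frame of 10920 rows this could then be assigned as:
# dataset$realsubjectID <- realsubjectID
```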