I have 7500 messages, each with a corresponding unique ID number. I have split the messages up by season, so I have a block of messages for each of the seasons from winter 2013 to spring 2014. I wish to create a sample of 1000 messages that is representative of the entire period, so I will take 200 messages from each of the 5 seasons.

I have sampled the unique IDs with the following code:

s1 <- sample(data$id[w13], size = 200, replace = FALSE)
s2 <- sample(data$id[sp13], size = 200, replace = FALSE)
s3 <- sample(data$id[su14], size = 200, replace = FALSE)
s4 <- sample(data$id[a14], size = 200, replace = FALSE)
s5 <- sample(data$id[w14], size = 200, replace = FALSE)

I then append these into one factor of length 1000 with the following code:

id.sample <- unlist(list(s1, s2, s3, s4, s5))

Now I would like to retrieve the messages corresponding to those IDs. I am using the following code, but it doesn't work:

message.sample <- data$text[data$id == id.sample]

What am I doing wrong?

WeakLearner

1 Answer

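Use %in% instead of ==, and keep the trailing comma so that rows (not columns) are selected:

message.sample <- data[data$id %in% id.sample, ]

This returns all the columns for the sampled rows. From that you can select the columns you want, for example message.sample$text.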
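
To see why == fails here, a minimal sketch may help. It uses a made-up ten-row data frame in place of the real data (the ids, the text values, and the three-element id.sample are assumptions for illustration only). The == operator compares the two vectors position by position and recycles the shorter one, so it is not a membership test, whereas %in% checks each id against the whole sample.

```r
# Toy stand-in for the real `data`; ids and text are invented for illustration.
data <- data.frame(id = 1:10,
                   text = paste("message", 1:10),
                   stringsAsFactors = FALSE)
id.sample <- c(7, 9, 2)

# `==` lines the vectors up position by position and recycles id.sample
# along data$id, so only ids that happen to sit at a matching position are
# kept. R also warns here because 10 is not a multiple of 3.
wrong <- suppressWarnings(data$text[data$id == id.sample])

# `%in%` asks, for every id in data, whether it occurs anywhere in id.sample.
# The result follows the row order of `data`.
message.sample <- data$text[data$id %in% id.sample]

# match() looks up each sampled id in data$id, so the result follows the
# order of id.sample instead.
ordered.sample <- data$text[match(id.sample, data$id)]
```

In this toy case wrong holds a single message, while message.sample holds all three. The match() variant is only worth using if the order of the sampled IDs matters.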

Arun Raja