Take a sample without grouping in dplyr, R

Question

I know how to take a random sample from each group of a data frame using sample_n or sample_frac in dplyr, like this:

dataset %>%
  group_by(user_id) %>%
  sample_n(10)

However, I have a slightly different question. I want to take a random sample from the whole dataset. It should be as simple as this:

sample_n(dataset,10)

But because I applied group_by to the dataset earlier, the grouping still seems to take effect here, so the second command behaves the same as the first.

How can I remove the effect of group_by and get a random sample from the whole dataset?

score 2 · Accepted Answer · answered Aug 18 '16 at 06:27

We can use ungroup() to remove the grouping and then apply sample_n:

dataset %>%
    group_by(user_id)  %>%
    ungroup() %>%
    sample_n(10)
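
To make the difference concrete, here is a minimal sketch on made-up data. The toy dataset (3 users with 20 rows each) and the names grouped, per_group, and overall are assumptions for illustration, not part of the original question.

```r
library(dplyr)

# Hypothetical toy data: 3 users with 20 rows each (not from the question).
set.seed(42)
dataset <- tibble(
  user_id = rep(1:3, each = 20),
  value   = rnorm(60)
)

# A grouped data frame keeps its grouping until it is explicitly removed.
grouped <- dataset %>% group_by(user_id)

# On grouped data, sample_n() draws 10 rows from each group.
per_group <- grouped %>% sample_n(10)

# After ungroup(), sample_n() draws 10 rows from the whole data frame.
overall <- grouped %>% ungroup() %>% sample_n(10)

nrow(per_group)  # 30 (10 rows for each of the 3 users)
nrow(overall)    # 10
```

In dplyr 1.0.0 and later, sample_n() is superseded by slice_sample(n = 10), which should treat grouped data the same way, so ungroup() is still the step that matters.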

akrun
