Here is a base R way with ave. The random numbers are drawn from 1 to nrow(dat). Calling sample with size = 1 returns a single number per id group, and ave recycles that number over every row of the group, so all rows with the same id get the same random number.
set.seed(2022)
dat$random <- with(dat, ave(id, id, FUN = \(x) sample(nrow(dat), size = 1)))
Created on 2022-03-01 by the reprex package (v2.0.1)
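One detail of the call above: id is a character column here (because of the "8v" entry, assuming R 4.0 or later where read.table does not create factors), and ave returns a vector of the same type as its first argument, so the random numbers come back as strings. The sketch below is a possible variant, not part of the original answer. It passes seq_along(id) as the first argument so the result stays integer. It uses function(x) instead of \(x) so it also runs on R older than 4.1, and it rebuilds only the id column of dat.

```r
# Only the id column of the example data is needed for this sketch.
dat <- data.frame(
  id = c("1", "5", "1", "2", "9", "2", "3", "3", "5", "6", "8v", "9"),
  stringsAsFactors = FALSE
)

# Original form: the first argument is character, so the result is character.
set.seed(2022)
random_chr <- with(dat, ave(id, id, FUN = function(x) sample(nrow(dat), size = 1)))

# Variant: an integer first argument keeps the result integer.
# FUN is still called once per id, and its length-1 result is recycled
# over all rows of that id.
set.seed(2022)
random_int <- with(dat, ave(seq_along(id), id, FUN = function(x) sample(nrow(dat), size = 1)))

class(random_chr)  # "character"
class(random_int)  # "integer"

# Same seed and same group order, so the draws themselves are the same.
identical(as.integer(random_chr), random_int)  # TRUE
```

Either form gives one number per id. The integer variant just avoids a later as.integer conversion.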
Each id has only one random number.
split(data.frame(id = dat$id, random = dat$random), dat$id)
#> $`1`
#>   id random
#> 1  1      4
#> 3  1      4
#>
#> $`2`
#>   id random
#> 4  2      3
#> 6  2      3
#>
#> $`3`
#>   id random
#> 7  3      7
#> 8  3      7
#>
#> $`5`
#>   id random
#> 2  5     11
#> 9  5     11
#>
#> $`6`
#>    id random
#> 10  6      4
#>
#> $`8v`
#>    id random
#> 11 8v      6
#>
#> $`9`
#>    id random
#> 5   9     12
#> 12  9     12
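Rather than reading the split output by eye, the same property can be checked in one line. This is a minimal sketch, assuming the id column shown in the Data section. The name n_distinct is only illustrative.

```r
# Only the id column of the example data is needed for this check.
dat <- data.frame(
  id = c("1", "5", "1", "2", "9", "2", "3", "3", "5", "6", "8v", "9"),
  stringsAsFactors = FALSE
)

set.seed(2022)
dat$random <- with(dat, ave(id, id, FUN = function(x) sample(nrow(dat), size = 1)))

# Count the distinct random values within each id.
n_distinct <- tapply(dat$random, dat$id, function(x) length(unique(x)))

# Every id should have exactly one distinct value.
all(n_distinct == 1)  # TRUE
```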
The random numbers are also uniformly distributed over 1 to nrow(dat). To see this, repeat the process above 10000 times, tabulate the results and draw a bar plot.
zz <- replicate(10000,
with(dat, ave(id, id, FUN = \(x) sample(nrow(dat), size = 1))))
barplot(table(as.integer(zz)))
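The bar plot can be backed up with numbers. The sketch below is an illustration added here, not part of the original answer. It compares the observed proportion of each value with the expected 1 / nrow(dat). It uses 2000 replicates to keep it quick, and the names n and props are only illustrative.

```r
# Only the id column of the example data is needed for this check.
dat <- data.frame(
  id = c("1", "5", "1", "2", "9", "2", "3", "3", "5", "6", "8v", "9"),
  stringsAsFactors = FALSE
)
n <- nrow(dat)

set.seed(2022)
zz <- replicate(2000,
                with(dat, ave(id, id, FUN = function(x) sample(n, size = 1))))

# factor(..., levels = 1:n) keeps a zero count for any value that never occurs.
props <- prop.table(table(factor(as.integer(zz), levels = seq_len(n))))

round(props, 3)          # each entry should be close to 1 / n
max(abs(props - 1 / n))  # largest deviation from the uniform proportion
```

Rows that share an id repeat the same draw, so the counts are not independent across rows. That is why this compares proportions instead of running a chi-squared test on the raw counts.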

Data
dat <- read.table(header = TRUE, text = "id var1 var2
1 a 1
5 g 35
1 hf 658
2 f 576
9 d 54546
2 dg 76
3 g 5
3 g 5
5 gg 56
6 g 456
8v g 6
9 e 778795")