
I have a question on how to sample

I have a dataframe called 'inventory' that looks like this (1000 rows)

  inventory_date number_purchases
1       1/1/1986               20
2       2/4/1992               15
3     12/13/2001               10

I want to sample 5 of the rows

This is my code

samplesize <- c(5,10,15,20,25)

for (m in 1:length(samplesize))
{
   mysample <- sample(inventory, samplesize[m], replace=FALSE)
} 

When I run the code, I get all 1000 rows back instead of a sample of 5, 10, 15, etc. It seems to be ignoring samplesize[m]. Why? What is wrong with my code?
It seems straightforward.

Jota
James Rodriguez
  • Replace `sample(inventory, ...)` with `inventory[sample(1:nrow(inventory), ...), ]`. You have to be explicit that you're sampling from the rows. – Gregor Thomas Jul 16 '15 at 00:51

1 Answer

In your case, you don't actually want to generate random data because you already have it. Instead, you want to sample 5 rows from your data frame in a random way. Try this code:

# generate 5 random row indices
random.indices <- sample(1:nrow(inventory), 5, replace=FALSE)

# use these random indices to access rows from your data frame
for (m in 1:5) {
    sample.row <- inventory[random.indices[m], ]
    # use this random row in your calculation
}
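
As for why the original call seemed to ignore the sample size: a data frame is a list of columns, so `sample()` applied to it picks columns, not rows, and every row comes back. The sketch below shows this on a made-up data frame `toy` (a hypothetical stand-in, not your `inventory`), assuming it has at least as many columns as the requested size.

```r
# Hypothetical stand-in data: 1000 rows, 6 numeric columns (made-up values)
set.seed(1)
toy <- as.data.frame(matrix(rnorm(6000), nrow = 1000, ncol = 6))

# length() of a data frame is its number of columns, and that is
# the population sample() draws from
length(toy)

# Sampling the data frame itself selects 5 columns and keeps every row
cols <- sample(toy, 5, replace = FALSE)
dim(cols)

# Sampling row indices and subsetting selects 5 rows and keeps every column
rows <- toy[sample(nrow(toy), 5, replace = FALSE), ]
dim(rows)
```

Comparing `dim(cols)` with `dim(rows)` makes the difference visible: the first keeps all 1000 rows, the second keeps only 5.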
Tim Biegeleisen
  • Even `for (m in sample(nrow(inventory), 5))` would work - the `sample` is only calculated once. Then just do `inventory[m, ]`. – mathematical.coffee Jul 16 '15 at 00:54
  • I wasn't sure why he had the loop, hence I was afraid to remove it. Yes, I agree with your comment. – Tim Biegeleisen Jul 16 '15 at 00:56
  • Thanks for your suggestions. I definitely need the loop because I need to choose n rows each time: the first time through the loop I must select 5 random rows, the next time 10 random rows, and so on. I am not sure what was wrong with my code. – James Rodriguez Jul 16 '15 at 01:24
  • Thanks Tim, mathematical.coffee and Gregor for your quick responses. Life savers!! I used this: `samplesize <- c(5,10,15,20,25); for (m in 1:length(samplesize)) { mysample <- inventory[sample(1:nrow(inventory), samplesize[m], replace=FALSE), ] }` – James Rodriguez Jul 16 '15 at 01:42
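
One thing to watch in the loop from the last comment: `mysample` is overwritten on every pass, so only the 25-row sample is left once the loop finishes. If every sample is needed afterwards, a possible variation is to collect them in a list. This is only a sketch; the `inventory` built here is a made-up stand-in with the same two column names, and `samples` is a hypothetical name.

```r
# Hypothetical stand-in for `inventory` (made-up values, 1000 rows)
inventory <- data.frame(
  inventory_date   = as.character(seq(as.Date("1986-01-01"), by = "day",
                                      length.out = 1000)),
  number_purchases = rep(c(20, 15, 10, 5), length.out = 1000)
)

samplesize <- c(5, 10, 15, 20, 25)

set.seed(42)  # only so that the example is reproducible

# One data frame per sample size, each drawn without replacement
samples <- lapply(samplesize, function(n) {
  inventory[sample(nrow(inventory), n, replace = FALSE), ]
})
names(samples) <- samplesize

# Row count of each stored sample
sapply(samples, nrow)
```

Each element can then be retrieved by size, for example `samples[["10"]]` for the 10-row sample.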