R: Why won't my function create objects in my environment

Question

I want to write a function that will create n random samples of a data set without replacement.

In this example I am using the iris data set. The iris data set has 150 observations and say I want 10 samples.

My attempt:

#load libraries
library(dplyr)    

# load the data
data(iris)
head(iris)

# name df
df = iris

# set the number of samples
n = 10

# assumption: the number of observations in df is divisible by n
# set the number of observations in each sample
m = nrow(df)/n

# create a column called row to contain initial row index
df$row = rownames(df)

# define the for loop
# that creates n separate data sets
# with m number of rows in each data set

for(i in 1:n){
  # create the sample
  sample = sample_n(df, m, replace = FALSE) 

  # name the sample 'dsi'
  x = assign(paste("ds",i,sep=""),sample)

  # remove 'dsi' from df
  df = df[!(df$row %in% x$row),]

}

When I run this code I get what I want. I get the random samples named ds1,ds2,...,ds10.

Now when I try to turn it into a function:

samplez <- function(df,n){

  df$row = rownames(df)

  m = nrow(df)/n

  for(i in 1:n){

    sample = sample_n(df, m, replace = FALSE) 

    x = assign(paste("ds",i,sep=""),sample)

    df = df[!(df$row %in% x$row),]

  }

}

Nothing happens when I execute 'samplez(iris,10)'. What am I missing?

Thanks

Your function ends with a `for` loop, which evaluates to `NULL`, so it returns nothing useful. For example, between the last two braces, add `df` to return `df` to the caller. — eipi10, Oct 21 '16 at 03:52
@eipi10 true, do you know how to get my samples to appear in the environment? — Zyferion, Oct 21 '16 at 03:57
You mean you want to return not only `df`, but also the value of `sample` for each iteration of the loop? By the way, there's no need for `x = assign(paste("ds",i,sep=""),sample)`. This is equivalent to `x = sample`. But there's no need for that either, because you can do `df = df[!(df$row %in% sample$row),]`. — eipi10, Oct 21 '16 at 04:01
I'd like the samples, ds1, ds2, ... to be objects appearing in my environment. Kinda like what happens if I run the loop normally without the function. — Zyferion, Oct 21 '16 at 04:03
I think you should probably return a list to avoid cluttering your environment, where each element of the list is ds1, ds2... — zacdav, Oct 21 '16 at 04:06

score 4 · Accepted Answer · answered Oct 21 '16 at 04:09

Just save the results in a list and return that. Then you'll have a single object, the list of samples, in your global environment, rather than cluttering up your environment with a bunch of similar data frames.
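For context on why the original function seemed to do nothing: `assign()` writes into the environment it is called from by default, so inside a function the objects land in that call's temporary frame and disappear when the function returns. Here is a minimal base-R sketch of the difference; the helper names `make_local` and `make_list` and the object name `ds_demo` are made up for illustration.

```r
# Hypothetical helper: assign() with its default target writes into the
# function's own frame, which is discarded when the function returns.
make_local <- function() {
  assign("ds_demo", head(iris, 3))
  exists("ds_demo", inherits = FALSE)  # checks only this function's frame
}

# Hypothetical helper: build a named list and hand it back to the caller.
make_list <- function() {
  out <- list()
  out[["ds_demo"]] <- head(iris, 3)
  out
}

created_inside  <- make_local()
visible_outside <- exists("ds_demo", envir = globalenv(), inherits = FALSE)
result          <- make_list()

created_inside    # the object did exist while the function was running
visible_outside   # but nothing was left behind in the global environment
names(result)     # the returned list is what survives the call
```

Assuming no object called `ds_demo` already exists in your workspace, `created_inside` is `TRUE` while `visible_outside` is `FALSE`.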

I'm not sure what you're trying to do with df, but here is how to return all of the samples. Let me know what you want to do with df and I can add that as well:

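library(dplyr)  # provides sample_n()

samplez <- function(df,n){

  # collect the samples here instead of creating separate objects
  samples = list()

  # remember each row's original index
  df$row = rownames(df)

  # rows per sample; assumes nrow(df) is divisible by n
  m = nrow(df)/n

  for(i in 1:n){

    # draw m of the remaining rows and store them as element "dsi"
    samples[[paste0("ds",i)]] = sample_n(df, m, replace = FALSE) 

    # drop the sampled rows so they can't be drawn again
    df = df[!(df$row %in% samples[[i]]$row),]

  }
  return(samples)
}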
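To show how the returned list can be used, here is a self-contained sketch. It is not the function above: so that it runs without dplyr, it swaps in a base-R variant that shuffles the rows once and splits them into n groups. The name `split_random` is made up for this example, and it keeps the question's assumption that the number of rows is divisible by n.

```r
# Hypothetical base-R variant: shuffle the rows once, then cut them into
# n equal groups. Assumes nrow(df) is divisible by n.
split_random <- function(df, n) {
  stopifnot(nrow(df) %% n == 0)
  m <- nrow(df) / n
  shuffled <- df[sample(nrow(df)), , drop = FALSE]
  groups <- rep(paste0("ds", seq_len(n)), each = m)
  # fix the level order so the list reads ds1, ds2, ..., not ds1, ds10, ds2
  split(shuffled, factor(groups, levels = unique(groups)))
}

set.seed(1)  # only to make the illustration reproducible
samples <- split_random(iris, 10)

names(samples)          # "ds1" through "ds10"
nrow(samples$ds1)       # 15 rows in each sample
head(samples[["ds3"]])  # elements can be pulled out by name

# If separate objects really are wanted, list2env() copies each list
# element into an environment under its name. A scratch environment is
# used here; passing envir = globalenv() would create ds1, ..., ds10 in
# the workspace instead.
env <- new.env()
list2env(samples, envir = env)
ls(env)
```

Keeping the samples in one list also makes it easy to apply the same operation to all of them, for example `lapply(samples, summary)`.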

Yay thank you, just what I needed. I forgot about lists. — Zyferion, Oct 21 '16 at 04:15
