I have a function I am trying to make in R that looks like this:

diff_abund <- function(Data, Rank, Taxa) {
  subset_taxa(Data, Rank == Taxa)
}

da= diff_abund(frbc1_02, Phylum, "Acidobacteria") 

And I get the error:

Error in eval(e, x, parent.frame()) : object 'Rank' not found 

The problem appears to be with the Rank == Taxa part. If I remove that from the function like so:

diff_abund <- function(Data) {

subset_taxa(Data, Phylum == "Acidobacteria")

}
da= diff_abund(frbc1_02) 

The function works normally. The dataset is an S4 object that only works in the package phyloseq. "Rank" is basically a group of vectors ranging from Kingdom down to Species. Not sure what you would call that. Any reason this could be happening? Thank you, Sam

  • So what you want to do is to filter `Data` such that only rows where `Rank == "Some name" ` remain? do you even have a column called `Rank` in your `Data` ? – dvd280 May 31 '20 at 05:12
  • So there actually isn't a column named Rank, which may be the issue. However, the call subset_taxa(frbc1_02, Phylum == "Acidobacteria"), where Phylum is the Rank, actually works. So in theory my function should work, correct? Also, just to add: the function subset_taxa comes from a package, phyloseq, and already has a predefined set of commands the package developers created. I am simply trying to use their function within my own function to streamline things better. – Sam Degregori Jun 01 '20 at 17:53
  • Also, the dataset, frbc1_02, is what R calls an S4 object. Not sure if that is useful or not. – Sam Degregori Jun 01 '20 at 17:56
  • I've updated my question in the original post – Sam Degregori Jun 01 '20 at 18:14

2 Answers

0

You need to substitute your condition and evaluate it in Data. I don't know subset_taxa, but this should work in the same way as subset. In the following I use mtcars as example data, which comes with R.

diff_abund <- function(Data, Rank, Taxa) {
  cond <- substitute(Rank == Taxa)
  e <- eval(cond, Data)
  subset(Data, e)
}
diff_abund(mtcars, cyl, "6") 
#                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
# Merc 280       19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
# Merc 280C      17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
# Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
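
To see why the original call fails, here is a minimal sketch that reproduces the error without phyloseq. `subset_rows` is a hypothetical stand-in that forwards its condition to `subset()`; I am assuming `subset_taxa` behaves similarly, which I have not verified against the package source. The second function builds the call with the actual column name spliced in before evaluating it, which is one possible way to pass a condition through such a wrapper.

```r
# Hypothetical stand-in for subset_taxa(): forwards the condition to subset().
subset_rows <- function(x, ...) subset(x, ...)

# Naive wrapper: subset() receives the literal expression `Rank == Taxa`
# and looks for `Rank` in the data, then in subset_rows()'s environment.
naive <- function(Data, Rank, Taxa) {
  subset_rows(Data, Rank == Taxa)
}
msg <- tryCatch(naive(mtcars, cyl, 6),
                error = function(e) conditionMessage(e))
print(msg)  # the message complains that 'Rank' cannot be found

# Possible fix: capture the column name the caller typed, splice it and the
# value of Taxa into the call, and only then evaluate the call.
fixed <- function(Data, Rank, Taxa) {
  rank_sym <- substitute(Rank)                      # e.g. the symbol cyl
  call <- bquote(subset_rows(Data, .(rank_sym) == .(Taxa)))
  eval(call)                                        # Data is found in this frame
}
six <- fixed(mtcars, cyl, 6)
nrow(six)  # 7
```

With phyloseq the same pattern would presumably read `bquote(subset_taxa(Data, .(rank_sym) == .(Taxa)))`, but treat that as an untested assumption.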

In your subquestion you probably want the arguments to be fixed and just define Taxa in the function call dynamically. You could change the order of the arguments, and define the fixed ones using =.

FX <- function(Taxa, Data=mtcars, Rank=mtcars$cyl) {
  subset(Data, eval(substitute(Rank == Taxa), Data))
}
FX("6") 
#                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
# Merc 280       19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
# Merc 280C      17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
# Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6

Note: It is better not to use F as a function name, since it is shorthand for FALSE. You can check whether a name is already in use with ?, e.g. ?F.

jay.sf
-1

First thing: whenever you post, provide a reproducible example: either with sample data you create; or with a small part of your own data (using dput()). Otherwise, we're limited in how we can help.

Generally speaking, quotations would go outside of the function. If year was supposed to be character, and I had some function where I wanted to add year to a dataset I'd write


    library(dplyr)

    random.fun <- function(dat, yr){
      dat %>%
        mutate(year = yr)  # add a `year` column holding the value passed as yr
    }

and then I'd create a variable and call that function, assigning it to the variable.


    df <- random.fun(dataset, "2007")

There are other problems with your code, and again, it's hard to help without knowing what you really want to do. But below, it doesn't look as though subset_taxa is a meaningful function.


    diff_abund <- function(Data,Rank,Taxa) {

    FTaxa=subset_taxa(Data, Rank == "Taxa")

    }
    diff_abund(frbc1_02, Phylum, "Acidobacteria")

Generally, I prefer writing with dplyr

    library(tidyverse)

    diff_abund <- function(data, rank, taxa){
      data %>%
        filter({{ rank }} == taxa)  # {{ }} forwards the unquoted column name to filter()
    }

    newdf <- diff_abund(df, phylum, "taxa")
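
If you would rather avoid unquoted column names altogether, a base-R sketch along the same lines takes the column name as a string. `filter_by_column` and its arguments are names I made up for illustration, and mtcars stands in for your data; this works on plain data frames only, not on a phyloseq object.

```r
# Keep the rows of `data` where the column named by `column` equals `value`.
filter_by_column <- function(data, column, value) {
  stopifnot(is.character(column), length(column) == 1, column %in% names(data))
  keep <- which(data[[column]] == value)  # which() drops NA comparisons
  data[keep, , drop = FALSE]
}

six_cyl <- filter_by_column(mtcars, "cyl", 6)
nrow(six_cyl)  # 7
```

Because the column is an ordinary string, it can be stored in a variable or looped over without any substitute() tricks.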

James