
I'm new to R. I'm working with a script that splits the atlas1006 microbiome data into three disease-prevalence groups (low, medium, high) based on country. I want to subdivide each of these groups into male and female, giving low male, low female, medium male, medium female, high male and high female, while also keeping the three existing groups. This is the code that splits the data into low, medium and high:

# Create mnd variable
# grabs the nationality from phyloseq
test <- get_variable(pseq, "nationality" )

# check you've installed this library
library(forcats)

# fct_collapse() merges several factor levels into one new level. The general form for multiple changes is fct_collapse(x, AB = c("A","B"), DE = c("D","E"))
# test <- fct_collapse(test, LOW = c("Scandinavia","EasternEurope") , MEDIUM = c("SouthEurope","CentralEurope", "UKIE"), HIGH = c("US") )

test <- fct_collapse(test, LOW = c("EasternEurope","Scandinavia") , MEDIUM = c("SouthEurope","CentralEurope", "UKIE" ), HIGH = c("US"))

# reorder
test <- factor(test, levels = (c("LOW", "MEDIUM", "HIGH")))
levels(test)

# creates a new variable in the phyloseq called mnd
sample_data(pseq)$mnd = test

# checks that it has worked.
get_variable(pseq, "mnd")

Any ideas for how to do the gender split from here?

1 Answer

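If you would like to create separate phyloseq objects that contain, say, the high group and females, you can use subset_samples. Note that the levels you created are upper case, so match "HIGH" rather than "high", and replace gender_variable with the name of the column that holds sex in your sample data (in atlas1006 this should be sex, with values "female" and "male"):

subset_samples(pseq, mnd == "HIGH" & gender_variable == "female")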
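If you would rather keep a single phyloseq object and add a six-level grouping variable alongside mnd, one option is base R's interaction(), which crosses two factors. The sketch below uses a small made-up data frame in place of the real sample data so that it runs on its own; the column name sex and the new variable name mnd_sex are assumptions for illustration, not something taken from your script.

```r
# Toy stand-in for the sample metadata (made-up values, for illustration only).
# Assumed column names: mnd (as created in the question) and sex.
meta <- data.frame(
  mnd = factor(c("LOW", "LOW", "MEDIUM", "HIGH", "HIGH", "MEDIUM"),
               levels = c("LOW", "MEDIUM", "HIGH")),
  sex = factor(c("male", "female", "female", "male", "female", NA),
               levels = c("female", "male"))
)

# interaction() builds one level per combination of the two factors.
# lex.order = TRUE orders the new levels by mnd first, then by sex.
# Rows where either input is NA get NA in the combined variable.
meta$mnd_sex <- interaction(meta$mnd, meta$sex, sep = "_", lex.order = TRUE)

# The original three-level mnd column is untouched, so both groupings coexist.
levels(meta$mnd_sex)
table(meta$mnd_sex, useNA = "ifany")

# With the real object, following the pattern used in the question
# (hypothetical variable name mnd_sex, assumed column name sex):
# sample_data(pseq)$mnd_sex <- interaction(get_variable(pseq, "mnd"),
#                                          get_variable(pseq, "sex"),
#                                          sep = "_", lex.order = TRUE)
```

Because mnd stays in place, you can group or subset by either variable afterwards, for example subset_samples(pseq, mnd_sex == "HIGH_female") once the new variable has been added.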