0

I used PRIMER v6 to plot my species data before following the steps shown in this video (https://www.youtube.com/watch?v=3I0zpl4v5wo&list=PL-W1dvsmEZ3KISwdig6X18dkQppJfcrYW&index=8) and get the plot below: enter image description here

Now I'm trying to do the same thing using R with the codes that I copied from https://jkzorz.github.io/2019/06/06/NMDS.html:

dat <- Sp_Data[1:15,2:8]
dat <- dat %>% filter_all(any_vars(. != 0))
m_dat <- as.matrix(dat, labels = T)
mds <- metaMDS(m_dat, distance = "bray")

# ggplot
#extract NMDS scores (x and y coordinates)
data.scores = as.data.frame(scores(mds))
#add columns to data frame 
pt <- Sp_Data[which(rowSums(Sp_Data[1:15,2:8])>0),]
data.scores$Condition = pt$Condition
head(data.scores)

library(ggplot2)

xx = ggplot(data.scores, aes(x = NMDS1, y = NMDS2)) + 
  geom_point(size = 4, aes( shape = Condition))+ 
  theme(axis.text.y = element_text(colour = "black", size = 12, face = "bold"), 
        axis.text.x = element_text(colour = "black", face = "bold", size = 12), 
        legend.text = element_text(size = 12, face ="bold", colour ="black"), 
        legend.position = "right", axis.title.y = element_text(face = "bold", size = 14), 
        axis.title.x = element_text(face = "bold", size = 14, colour = "black"), 
        legend.title = element_text(size = 14, colour = "black", face = "bold"), 
        panel.background = element_blank(), panel.border = element_rect(colour = "black", fill = NA, size = 1.2),
        legend.key=element_blank()) + 
  labs(x = "NMDS1", y = "NMDS2", shape = "Condition")  + 
  scale_colour_manual(values = c("#009E73", "#E69F00")) 

xx

enter image description here

U1 is actually UL, and U2 is UR

My question is, how can the two plots be so different? The stress values are different too. Using R, it is 0.03435446, much lower than that using PRIMER

Here's the the compressed Sp_Data:

    structure(list(SiteConditionReplicate = c("JU1One", "JU1Two", 
"JU1Three", "JU1Four", "JU1Five", "JSOne", "JSTwo", "JSThree", 
"JSFour", "JSFive", "JU2One", "JU2Two", "JU2Three", "JU2Four", 
"JU2Five", "PU1One", "PU1Two", "PU1Three", "PU1Four", "PU1Five", 
"PSOne", "PSTwo", "PSThree", "PSFour", "PSFive", "PU2One", "PU2Two", 
"PU2Three", "PU2Four", "PU2Five"), `Camptandrium sexdentatum` = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0), Diogenidae = c(0, 4, 0, 0, 1, 53, 0, 
2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 
0, 0), `Nassarius globasus` = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
    `Tellina sp` = c(0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `Marcia opima` = c(0, 
    0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Nereidae = c(0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 17, 18, 17, 12, 4, 3, 12, 
    1, 6, 0, 0, 0, 0, 0), Maldanidae = c(4, 5, 3, 2, 2, 1, 0, 
    0, 0, 0, 1, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0), ...9 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA), Site = c("J", "J", "J", "J", "J", 
    "J", "J", "J", "J", "J", "J", "J", "J", "J", "J", "P", "P", 
    "P", "P", "P", "P", "P", "P", "P", "P", "P", "P", "P", "P", 
    "P"), Condition = c("U1", "U1", "U1", "U1", "U1", "S", "S", 
    "S", "S", "S", "U2", "U2", "U2", "U2", "U2", "U1", "U1", 
    "U1", "U1", "U1", "S", "S", "S", "S", "S", "U2", "U2", "U2", 
    "U2", "U2"), Replicate = c("One", "Two", "Three", "Four", 
    "Five", "One", "Two", "Three", "Four", "Five", "One", "Two", 
    "Three", "Four", "Five", "One", "Two", "Three", "Four", "Five", 
    "One", "Two", "Three", "Four", "Five", "One", "Two", "Three", 
    "Four", "Five")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -30L))
  • Hard to say without the data or the code used to produce the first figure. There are different numbers of points in the two plots. If we cannot verify the issue, it is difficult to provide useful suggestions. Provide `dput(dat)` and the code for the first figure. – dcarlson Jan 14 '23 at 04:30
  • Thank you for replying. I've added a screenshot of my data. I can't provide any code for the first plot as it was created using the PRIMER software, which I no longer have access to. But the procedures should be similar, I think? Both did square-root transformation (in R the autotransformation is square-root, I verified it when running `mds`) – LauraEverdeen Jan 14 '23 at 05:58
  • A picture of the data cannot be used to validate your results. `dput(dat)` should only be 15 rows by 7 columns according to your code. That is not too much to post and it can be used to replicate your results. The `metaMDS()` function runs multiple analyses changing the initial configuration and then selects the best one. If PRIMER just performs a single analysis, the results could be quit different. – dcarlson Jan 14 '23 at 21:22
  • I've included the compressed `Sp_Data`. If I remember correctly, PRIMER also runs multiple iterations to find the best fit and we can set it at `Number of restarts`. I just wanted to check if I'm doing the right thing using R as I will be applying it to another dataset. – LauraEverdeen Jan 15 '23 at 03:43

1 Answers1

0

This is not a complete solution, but it may get you on the right track. You can simplify your code a bit:

dat <- Sp_Data[1:15,2:8]
idx <- which(rowSums(dat)>0)   # Which rows are not all 0's
mds <- metaMDS(dat[idx, ], distance = "bray")  # No need to convert dat
code <- Sp_Data$Condition[idx]   # vector of Condition codes

This will run the analysis and save a vector of the Condition values. I can't follow your ggplot code since the first line throws an error:

data.scores = as.data.frame(scores(mds))
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 13, 7

You are trying to convert a list to a data frame, but the number of rows is not the same as the number of columns. You probably want

data.scores <- scores(mds, tidy=TRUE)

There are still problems because Condition only applies to rows, not columns. Perhaps you do not want to plot the columns? In that case use

scores(mds)$sites
dcarlson
  • 10,936
  • 2
  • 15
  • 18
  • Thank you for helping to simplify my codes. However, I tried and still be able to run `data.scores = as.data.frame(scores(mds))`. If I skip the step I can't run the ggplot as the data must be in a dataframe. The dataframe it returns has 13 rows and 3 columns (NMDS1, NMDS2, Condition), which correspond to the samples I have. It seems to me Condition does apply to rows only. In any case `scores(mds)$sites` returns an error `Error in scores(mds)$sites : $ operator is invalid for atomic vectors` – LauraEverdeen Jan 15 '23 at 07:11