RDA with small data set is not performing full analysis

Question

I am trying to perform an RDA analysis with a simple dataset but I keep on having the same errors even though the data is standardized. Can anyone help me understand what is the problem with this model?

`view(data_RDA)
summary(data_RDA)

abiotic.lake<-data_RDA[, c(1:11)] 
biotic.lake<-data_RDA[, c(12:20)]

biotic.lake<-decostand(biotic.lake,"hellinger")
abiotic.lake<-decostand(abiotic.lake,"standardize")

rda.lake<- rda(biotic.lake, abiotic.lake)
plot(rda.lake, type="text") #The error starts at this point
anova.cca(rda.lake) 
summary(rda.lake)
coef(rda.lake)
`

The error I´m getting when I try to run the plot line:

Error in cbind(x$CCA$v, x$CA$v)[, choices, drop = FALSE] : 
  subscript out of bounds

I think the analysis is not being performed because the summary of rda.lake returns only RDA1 results.
The data are being recognized as numeric.
The anova.cca function returns only zeros as residuals and does not show a p-value, what makes me believe there is a problem with the data or the model.
These are the actual tables used and both have 6 rows.

biotic.lake

Cian	Crip	Dino	Cris	Xant	Diat	Eugl	Zigne	Cloro
0.0590634	0.21536114	1.286085	0.01117714	0.00000000	0.17741471	0.1438246	0.04127306	0.23323527
0.0590634	0.21536114	1.286085	0.01117714	0.00000000	0.17741471	0.1438246	0.04127306	0.23323527
0.0590634	0.21536114	1.286085	0.01117714	0.00000000	0.17741471	0.1438246	0.04127306	0.23323527
2.8144055	0.09724492	1.128178	0.02302370	0.03858338	0.01373549	0.9708160	0.90119103	0.08646308
2.8144055	0.09724492	1.128178	0.02302370	0.03858338	0.01373549	0.9708160	0.90119103	0.08646308
2.8144055	0.09724492	1.128178	0.02302370	0.03858338	0.01373549	0.9708160	0.90119103	0.08646308

abiotic.lake

Prof	Temp_H2O	OD	Cond	N_Tot	NO2	NO3	SRP	SIO	Zmax	Zeu
0.0	20.7	8.15	98	230.72	9.28	294.32	15.91	3.72	4.8	4.5
2.0	20.4	7.16	105	228.61	8.56	352.34	8.92	4.49	4.8	4.5
4.8	20.0	5.20	107	190.82	6.82	293.81	11.15	7.82	4.8	4.5
0.0	30.4	9.24	100	610.28	3.46	42.82	36.15	13.17	5.0	2.0
2.0	28.3	6.62	110	612.11	3.63	48.19	32.19	11.94	5.0	2.0
5.0	25.8	2.13	115	560.31	4.69	60.98	35.30	11.03	5.0	2.0

Edit: As required, the output of "dput"

Blockquote

    `dput(data_RDA)
    structure(list(Prof = c(0, 2, 4.8, 0, 2, 5), Temp_H2O = c(20.7, 20.4, 20,  30.4, 28.3, 25.8), OD = c(8.15, 7.16, 5.2, 9.24, 6.62,  2.13), Cond = c(98L,  105L, 107L, 100L, 110L, 115L), N_Tot = c(230.72, 228.61, 190.82, 610.28, 612.11,  560.31), NO2 = c(9.28, 8.56, 6.82, 3.46, 3.63, 4.69), NO3 = c(294.32, 352.34, 293.81, 42.82, 48.19, 60.98), SRP = c(15.91, 8.92, 11.15, 36.15, 32.19, 35.3), SIO = c(3.72, 4.49, 7.82, 13.17, 11.94, 11.03), Zmax = c(4.8, 4.8, 4.8, 5, 5, 5), Zeu = c(4.5, 4.5, 4.5, 2, 2, 2), Cian = c(0.0590634, 0.0590634, 0.0590634, 2.814405487, 2.814405487, 2.814405487), Crip = c(0.215361139, 0.215361139, 0.215361139, 0.097244921, 0.097244921, 0.097244921), Dino = c(1.286084811, 1.286084811, 1.286084811, 1.128178481, 1.128178481, 1.128178481), Cris = c(0.011177144, 0.011177144, 0.011177144, 0.023023705, 0.023023705, 0.023023705), Xant = c(1e-07, 1e-07, 1e-07, 0.038583378, 0.038583378, 0.038583378), Diat = c(0.17741471, 0.17741471, 0.17741471, 0.01373549, 0.01373549, 0.01373549), Eugl = c(0.14382456, 0.14382456, 0.14382456, 0.970816029, 0.970816029, 0.970816029), Zigne = c(0.041273061, 0.041273061, 0.041273061, 0.901191033, 0.901191033, 0.901191033), Cloro = c(0.233235275, 0.233235275, 0.233235275, 0.086463085, 0.086463085, 0.086463085)), class = "data.frame", row.names = c(NA, -6L))

Edit 2: As required, the output of rda.lake:

Blockquote

'rda.lago
Call: rda(X = biotic.lake, Y = abiotic.lake)

              Inertia Proportion Rank
Total         0.01198    1.00000     
Constrained   0.01198    1.00000    1
Unconstrained 0.00000    0.00000    0
Inertia is variance 
Some constraints were aliased because they were collinear (redundant)

Eigenvalues for constrained axes:
    RDA1 
0.011977 '

Edit 3: NAs produced by rda analysis.

Blockquote

'coef(rda.lake)
                 RDA1
Prof      0.003503013
Temp_H2O -0.152503172
OD        0.100134578
Cond      0.020938441
N_Tot    -0.295691269
NO2                NA
NO3                NA
SRP                NA
SIO                NA
Zmax               NA
Zeu                NA '

Can you please post a minmal reproducible example? For exampls, you can do `dput` on your `data_RDA` and then copy and paste the output here. — MacOS, Feb 01 '21 at 20:49
The error das not necessarely mean that the RDA is performed partially. It simply means that the plot function goes past the result object, e.g. the number of rows in a data frame. — MacOS, Feb 01 '21 at 20:51
I posted the tables used along with the code as biotic.lake and abiotic.lake. I am editing the original post with the dput output. — Patricia Nunes, Feb 01 '21 at 21:03

MacOS · Answer 1 · 2021-02-02T12:06:48.493

0

First, I think you question is related to Plotting RDA (vegan) in ggplot.

This error

Error in cbind(x$CCA$v, x$CA$v)[, choices, drop = FALSE] : 
  subscript out of bounds

means that R tries to select an index (either a column or a row index) that does not exist, e.g. say you have 5 rows and you want to select row number 8. Then, you would get the same error.

This error, in particular, means the same but for rda.lake$CCA$v (just substitude the x for your rda.lake). I recreated your issue and was able to plot something, e.g. try plot(rda.lake$CA$Xbar). That said, I do not know what you want to plot. Please look at the object that is returned by rda. This function returns a plethora of variables stored in a list.

edited Feb 02 '21 at 12:06

answered Feb 02 '21 at 08:25

MacOS

1,149
1
7
14

My intention is to plot the biplot with the points and arrows showing the results of the RDA analysis. If I understood right, the plot function is trying to use a parameter of rda.lake that does not exist. When I substituted x for rda.lake the output was a table of RDA coefficients but when running the plot line again, the error is still happening. Is there another way to get this plot? – Patricia Nunes Feb 02 '21 at 11:53
I edited my answer as your question might be a dublicate. Please have a look. – MacOS Feb 02 '21 at 12:07
Thank you for the answer but that is not what I'm looking for. Although it uses the same function (rda) from vegan package, in that question the person is performing a PCA analysis, using only one data set to ordinate the variables. My analysis is of a constrained kind, which means that I use two data sets to verify the influence that the first has on the second. – Patricia Nunes Feb 02 '21 at 13:16
Maybe there is something wrong with the distribution or the variation in data that is not allowing the analysis to be run and then it would make sense that the error is happening because the plot is trying to select information that does not exist. I am editing the post again with the coefficients I got from "coef" function which shows some NAs. – Patricia Nunes Feb 02 '21 at 13:17

RDA with small data set is not performing full analysis

1 Answers1