
For a research project I'm conducting, I need to analyze chemical data in order to group the sampling points, or at least to see how they lay out relative to one another based on their chemistry.

This is the reproducible dataset:

Chem<- data.frame(
  stringsAsFactors = FALSE,
               Sample = c("42_L2","17_L2","17_L1",
                          "VS_1","VS_3","VS_D1","VS_3L","17_WL","42_WL"),
                Al = c(NA, NA, NA, NA, NA, NA, 51.982, 49.129, 25.848),
                Sb = c(0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.001, 0.285, 0.001),
                Ag = c(NA, NA, NA, NA, NA, NA, 0.005, 0.005, 0.005),
                As = c(21, 13, 5.3, 0.1, 0.1, 0.1, 0.005, 5.501, 8.325),
                Be = c(0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.001, 0.001, 0.001),
                 B = c(NA, NA, NA, NA, NA, NA, 0.641, 0.1244, 0.1),
                Cd = c(0.9, 1.1, 0.3, 0.1, 0.1, 0.1, 0.622, 0.503, 0.049),
                Co = c(0.1, 0.1, 0.1, 0.1, 0.1, 58, 0.02, 0.02, 0.02),
            Cr_tot = c(0.2, 0.7, 0.3, 0.1, 0.1, 71, 1.4, 0.483, 0.02),
             Cr_VI = c(0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 5e-05, 0.0032, 1e-04),
                Fe = c(NA, NA, NA, NA, NA, NA, 285.626, 227.53, 12.991),
                Mn = c(NA, NA, NA, NA, NA, NA, 4.639, 3.073, 0.568),
                Hg = c(1, 1, 0.4, 0.1, 0.1, 0.1, 0.001, 9e-04, 0.001),
                Ni = c(0.1, 0.1, 0.1, 0.1, 0.1, 1231, 0.842, 0.731, 0.01),
                Pb = c(149, 55, 29, 3610, 576, 0.1, 28.003, 8.212, 11.723),
                Cu = c(46, 34, 11, 123, 50.6, 82, 9.036, 1.808, 0.052),
                Se = c(2, 0.9, 0.6, 0.1, 0.1, 0.1, 0.001, 0.036, 0.952),
                Sn = c(6.2, 3.2, 2, 0.1, 0.1, 0.1, NA, NA, NA),
                Tl = c(0.8, 0.1, 0.4, 2.6, 0.1, 0.1, 0.001, 0.119, 0.361),
                 V = c(0.1, 0.1, 0.1, 0.1, 0.1, 0.1, NA, NA, NA),
                Zn = c(40, 168, 74, 284, 150, 0.1, 166.171, 199.641, 2.053),
                   HC = c(19687,27138,17664,
                          74400,34130,1310,88.3,2910,9480),
                pH = c(8.75, 6.3, 6.95, NA, NA, NA, 8, 1.72, 7.7),
          Salinity = c(0.265, 1.75, 1.59, NA, NA, NA, NA, 8, 0.204),
             Redox = c(-99, 35, -8, NA, NA, NA, NA, 303.3, -276),
               NH4 = c(0.081, 0.1, 0.13, NA, NA, NA, NA, 0.05, 0.1)
   )

As you can see, there are many NA values. This is because the chemical analyses come from different studies and/or were carried out on different matrices, for which different parameters were measured. Unfortunately, there are only 9 sampling points.

My first idea was an NMDS after removing every column that contains NA values, but that discards too much information.
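To make the "remove the NA columns" idea concrete, here is a minimal sketch in base R. It is only an illustration under several assumptions: `chem_sub` is a small excerpt of `Chem` (same values, fewer columns) repeated so the block runs on its own, `drop_na_columns` is a hypothetical helper name, the `log1p` plus `scale` preprocessing is my own guess at a reasonable choice, and classical MDS (`cmdscale`, i.e. PCoA) stands in for NMDS because it needs no extra package.

```r
# Excerpt of the Chem data frame above (same values, fewer columns),
# repeated here only so that this block runs on its own.
chem_sub <- data.frame(
  stringsAsFactors = FALSE,
  Sample = c("42_L2", "17_L2", "17_L1", "VS_1", "VS_3",
             "VS_D1", "VS_3L", "17_WL", "42_WL"),
  Al = c(NA, NA, NA, NA, NA, NA, 51.982, 49.129, 25.848),
  As = c(21, 13, 5.3, 0.1, 0.1, 0.1, 0.005, 5.501, 8.325),
  Pb = c(149, 55, 29, 3610, 576, 0.1, 28.003, 8.212, 11.723),
  Cu = c(46, 34, 11, 123, 50.6, 82, 9.036, 1.808, 0.052),
  Sn = c(6.2, 3.2, 2, 0.1, 0.1, 0.1, NA, NA, NA),
  Zn = c(40, 168, 74, 284, 150, 0.1, 166.171, 199.641, 2.053)
)

# Hypothetical helper: keep only the columns that contain no NA at all.
drop_na_columns <- function(df) {
  df[, colSums(is.na(df)) == 0, drop = FALSE]
}

complete <- drop_na_columns(chem_sub)

# Columns lost by this approach (here: "Al" "Sn")
dropped <- setdiff(names(chem_sub), names(complete))

# Numeric matrix with the sample names as row names
m <- as.matrix(complete[, names(complete) != "Sample"])
rownames(m) <- complete$Sample

# Assumption: concentrations span several orders of magnitude, so
# log-transform and standardize before computing Euclidean distances.
d <- dist(scale(log1p(m)))

# Classical MDS (PCoA) in two dimensions; one row of coordinates per sample.
ord <- cmdscale(d, k = 2)

plot(ord, type = "n", xlab = "Axis 1", ylab = "Axis 2")
text(ord, labels = rownames(ord))
```

Running the same helper on the full `Chem` data frame would show exactly which parameters are lost, which is the information loss I am worried about.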

As I don't have much programming experience or background in statistical analysis, I wanted to ask: which method would you recommend for the ordination of my sampling points with the data I have (NMDS, PCA, PCoA, etc.)?
