
I have a matrix x containing more than 200 data points. A second object, metadata, has a column y with 20 data points. I would like to plot the matrix x against the 20 values in metadata$y, but

plot(x, metadata$y)

does not work, because the lengths of x and y differ. Is it possible to plot this?

Matrix x:

    X1  X4  X7  X9
X4  0.7
X7  0.8 0.5
X9  0.6 0.6 0.7

metadata

X1 65.4
X4 9.7
X7 47.4
X9 14.5

metadata$y: 65.4 9.7 47.4 14.5

Tatatam
  • How would you associate the values in your matrix `x` with the values in `y`? Is each column or row meant to be sampled at a different value of `y`? If you can include sample data and expected output, you'll probably get a more useful response. – Luke C Jul 16 '18 at 21:28
  • The values in metadata$y are connected to a sampling site name (metadata row names), and the matrix in x contains data from comparing every site with every other site. – Tatatam Jul 16 '18 at 21:33
  • So is x a sort of presence/absence or count matrix? Can you provide a sample of x (eg. `dput(x[1:10, 1:10])`) - you can edit your question to include this, as well as an appropriate sample of `y` – Luke C Jul 16 '18 at 21:35
  • I tried to give an example – Tatatam Jul 16 '18 at 21:49
  • Great, thank you! Now, can you clarify what you'd expect your plot to look like? For example, what values would be plotted when `y = 65.4`? – Luke C Jul 16 '18 at 21:57
  • Yes, all X1 values, i.e. 0.7, 0.8 and 0.6. Ideally, metadata$y would be the x axis, starting with 9.7, then 14.5, then 47.4, then 65.4. – Tatatam Jul 17 '18 at 06:19

2 Answers


Here's a tidyverse solution, using these data frames (their structure is at the bottom of this post):

> df
    X1  X4  X7 X9
X4 0.7  NA  NA NA
X7 0.8 0.5  NA NA
X9 0.6 0.6 0.7 NA

> metadata
      y
X1 65.4
X4  9.7
X7 47.4
X9 14.5

First, copy the row names of metadata into a column, x, so they can be used as a join key:

metadata$x <- rownames(metadata)

> metadata
      y  x
X1 65.4 X1
X4  9.7 X4
X7 47.4 X7
X9 14.5 X9

Now use gather() to convert df (the x matrix) into a long-format data frame, naming the key column "x" to match the column just added to metadata. Then use left_join() to attach metadata to the long data frame, with x as the common column.

library(dplyr)  # left_join() and %>%
library(tidyr)  # gather()
long <- gather(df, key = "x") %>%
  left_join(metadata, by = "x")

> long
    x value    y
1  X1   0.7 65.4
2  X1   0.8 65.4
3  X1   0.6 65.4
4  X4    NA  9.7
5  X4   0.5  9.7
6  X4   0.6  9.7
7  X7    NA 47.4
8  X7    NA 47.4
9  X7   0.7 47.4
10 X9    NA 14.5
11 X9    NA 14.5
12 X9    NA 14.5

Plot:

plot(value ~ y, data = long, pch = 19)
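As a variation on the reshaping step, here is a minimal sketch that uses pivot_longer(), which tidyr now recommends over gather(). It assumes the same df and metadata as in this answer (rebuilt inline so the snippet runs on its own), uses tibble's rownames_to_column() in place of the manual row-name copy, and drops the NA cells so that only the six real comparisons are plotted. The name long_clean is only illustrative.

```r
library(dplyr)   # left_join() and %>%
library(tidyr)   # pivot_longer()
library(tibble)  # rownames_to_column()

# Same example data as in this answer, rebuilt so the snippet is self-contained
df <- data.frame(
  X1 = c(0.7, 0.8, 0.6),
  X4 = c(NA, 0.5, 0.6),
  X7 = c(NA, NA, 0.7),
  X9 = c(NA_real_, NA, NA),
  row.names = c("X4", "X7", "X9")
)
metadata <- data.frame(
  y = c(65.4, 9.7, 47.4, 14.5),
  row.names = c("X1", "X4", "X7", "X9")
)

# Long format: one row per non-missing cell, keyed by the column (site) name
long_clean <- df %>%
  pivot_longer(everything(), names_to = "x", values_to = "value",
               values_drop_na = TRUE) %>%
  left_join(rownames_to_column(metadata, "x"), by = "x")

# metadata$y on the horizontal axis, comparison values on the vertical axis
plot(value ~ y, data = long_clean, pch = 19)
```

Because the NA rows are dropped before plotting, X9 (y = 14.5) contributes no points here, just as in the plot above; whether that is what you want depends on how the triangle should be read.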


Data:

df <-
  structure(
    list(
      X1 = c(0.7, 0.8, 0.6),
      X4 = c(NA, 0.5, 0.6),
      X7 = c(NA,
             NA, 0.7),
      X9 = c(NA, NA, NA)
    ),
    .Names = c("X1", "X4", "X7", "X9"),
    row.names = c("X4", "X7", "X9"),
    class = "data.frame"
  )

metadata <-
  structure(
    list(y = c(65.4, 9.7, 47.4, 14.5)),
    .Names = "y",
    row.names = c("X1",
                  "X4", "X7", "X9"),
    class = "data.frame"
  )
Luke C

It looks very much as though your x is not a complete matrix but the lower triangle of a symmetric dissimilarity matrix. Does class(x) return "dist"? If so, you can convert the dissimilarities to a full symmetric matrix with as.matrix(x); its rows and columns then match the length of metadata$y.

This is essentially the same trick as in the other answer, except that it gives you a symmetric matrix instead of NA in the upper triangle. Which of the two is correct depends on your data, and only you can decide that.
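A minimal base-R sketch of that idea follows. It assumes x really is a "dist" object; since the question does not include one, the snippet rebuilds a hypothetical one from the four example sites, and the names sites, m, d, full and long are only illustrative.

```r
# Hypothetical reconstruction of the question's data as a "dist" object
sites <- c("X1", "X4", "X7", "X9")
m <- matrix(0, 4, 4, dimnames = list(sites, sites))
m[lower.tri(m)] <- c(0.7, 0.8, 0.6, 0.5, 0.6, 0.7)  # filled column by column
d <- as.dist(m)
class(d)  # "dist"

metadata <- data.frame(y = c(65.4, 9.7, 47.4, 14.5), row.names = sites)

# Expand the lower triangle to a full symmetric matrix (diagonal is 0)
full <- as.matrix(d)

# One row per site pair, excluding each site's comparison with itself
off_diag <- as.vector(row(full) != col(full))
long <- data.frame(
  site  = rep(colnames(full), each = nrow(full)),
  value = as.vector(full)
)[off_diag, ]

# Look up each site's y value by row name
long$y <- metadata[long$site, "y"]

plot(value ~ y, data = long, pch = 19)
```

With the symmetric matrix every site gets three points, including X9, because each pairwise value is counted once for each of its two sites.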

Jari Oksanen