I have a dataset ("data") that looks like this:

   PatientID Visit  Var1  Var2  Var3  Var4 Var5
1        ID1     0 44.28  4.57 23.56  4.36 8.87
2        ID1     1 58.60  5.34  4.74  3.76 6.96
3        ID1     2 72.44 11.18 21.22  2.15 8.34
4        ID2     0 65.98  6.91  8.57  1.19 7.39
5        ID2     1 10.33 38.27  0.48 14.41   NA
6        ID2     2 69.45 11.18 20.69  2.15 8.34
7        ID3     0 69.16  6.17 10.98  1.91 6.12
8        ID3     1 86.02  3.28 16.29  4.28 5.74
9        ID3     2 69.45    NA 20.69  2.15 8.34
10       ID4     0 98.55 26.75  2.89  3.92 2.19
11       ID4     1 32.66 14.38  4.96  1.13 4.78
12       ID4     2 70.45 11.42 21.78  2.15 8.34

I need to generate an MDS plot with all data points. I also need each individual's visit points to be linked by a line and coloured green for visit 1, red for visit 2 and black for visit 3 (consistent colours across all individuals).

My code looks like this (quite lengthy, but it doesn't work):

data.cor <- cor(t(data[,3:7]), use = "pairwise.complete.obs", method = "spearman")
dim(data.cor)
dim(data)
rownames(data.cor) <- paste0(data$PatientID, "V", data$Visit)
colnames(data.cor) <- paste0(data$PatientID, "V", data$Visit)
c <- dist(data.cor)
fit <- cmdscale(c, eig = TRUE, k = 2)
ff <- fit$points
ff <- as.data.frame(ff)
ff$pair <- paste0(substr(rownames(ff), 1, 6))
ff$pair <- factor(ff$pair)
pc.pair.distances <- matrix(nrow = nlevels(ff$pair), ncol = 1)
for(i in 1:nlevels(ff$pair)){
  pair2 <- ff[ff$pair %in% levels(ff$pair)[i] , ]
  pc.pair.distances[i,1] <- sqrt(
    ((pair2[1,1] - pair2[2,1]) * (pair2[1,1] - pair2[2,1])) +
    ((pair2[1,2] - pair2[2,2]) * (pair2[1,2] - pair2[2,2]))
  )
  rm(pair2)
}
plot(ff[,1], ff[,2], xlab="Principal 1", ylab="Principal 2", type = "n", las = 1)
for(i in 1:nlevels(ff$pair)){
  lines(ff[ff$pair == levels(ff$pair)[i],1], ff[ff$pair == levels(ff$pair)[i],2], col = "grey")
}
points(ff[,1], ff[,2], xlab="Coordinate 1", ylab="Coordinate 2", type = "p",
   pch = ifelse(grepl(x = substr(rownames(ff), 7,8), "V1"), 20, 18),
   cex = 1.3)

I would really appreciate your help.

scoa
VasoGene
  • Can you tell us where exactly your problem is? Do you get an error? – maeVeyable Sep 10 '15 at 15:09
  • @maeVeyable: I am not sure if my code makes sense. I don't get an error, but my figure is not what I expect. I get only 8 data points instead of 12, so I think something is wrong. It seems the third visit is not included. I would also like each visit point to be coloured differently (but consistently across all individuals), and the three visit points for each individual to be connected by a line. – VasoGene Sep 11 '15 at 08:25
  • Actually all the points are drawn. It's just that some points have the same coordinates. – maeVeyable Sep 11 '15 at 13:21

1 Answer


I suggest adding two columns to your data.frame, one for the visit number and one for the individual's ID, using the function sapply:

ff$visit <- sapply(ff$pair,function(x){substr(x,5,5)})
ff$indiv <- sapply(ff$pair,function(x){substr(x,3,3)})
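The fixed character positions above only work while every patient ID has a single digit. As a rough sketch of a more general split, assuming the labels always follow the "<PatientID>V<Visit>" pattern built in the question (the `labels` vector below is hypothetical and only for illustration):

```r
# Hypothetical row labels in the "<PatientID>V<Visit>" form used in the question
labels <- c("ID1V0", "ID1V1", "ID2V2", "ID10V1")

# Drop everything up to and including the last "V" to keep the visit number
visit <- sub(".*V", "", labels)

# Drop the last "V" and everything after it to keep the patient ID
indiv <- sub("V[^V]*$", "", labels)

print(data.frame(labels, indiv, visit))
```

Both calls are vectorised, so no sapply is needed for this variant.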

The ggplot2 library is then very useful for plotting the data. First, load it and draw the points:

library(ggplot2)
g <- ggplot(ff, aes(V1, V2)) + geom_point(aes(color = visit))

Then add a line for each individual:

for (i in unique(ff$indiv)){
  g <- g + geom_line(data = ff[ff$indiv == i, ], aes(V1, V2))
}
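For reference, here is one possible end-to-end sketch of the same idea, not necessarily how the asker's final plot has to look. It assumes the data frame from the question (retyped below), takes the patient and visit columns directly from `data` instead of parsing the row names, and assumes the three requested colours map to the visit codes 0, 1 and 2. It uses the `group` aesthetic with `geom_path` instead of a loop, so each individual's visits are joined in row order.

```r
library(ggplot2)

# Data retyped from the question
data <- data.frame(
  PatientID = rep(paste0("ID", 1:4), each = 3),
  Visit = rep(0:2, times = 4),
  Var1 = c(44.28, 58.60, 72.44, 65.98, 10.33, 69.45, 69.16, 86.02, 69.45, 98.55, 32.66, 70.45),
  Var2 = c(4.57, 5.34, 11.18, 6.91, 38.27, 11.18, 6.17, 3.28, NA, 26.75, 14.38, 11.42),
  Var3 = c(23.56, 4.74, 21.22, 8.57, 0.48, 20.69, 10.98, 16.29, 20.69, 2.89, 4.96, 21.78),
  Var4 = c(4.36, 3.76, 2.15, 1.19, 14.41, 2.15, 1.91, 4.28, 2.15, 3.92, 1.13, 2.15),
  Var5 = c(8.87, 6.96, 8.34, 7.39, NA, 8.34, 6.12, 5.74, 8.34, 2.19, 4.78, 8.34)
)

# Same MDS steps as in the question
data.cor <- cor(t(data[, 3:7]), use = "pairwise.complete.obs", method = "spearman")
fit <- cmdscale(dist(data.cor), eig = TRUE, k = 2)

# Coordinates plus the grouping columns, taken straight from the input data
ff <- data.frame(
  V1 = fit$points[, 1],
  V2 = fit$points[, 2],
  indiv = data$PatientID,
  visit = factor(data$Visit)
)

# One grey path per individual, points coloured by visit
# (colour mapping to visit codes 0/1/2 is an assumption)
g <- ggplot(ff, aes(V1, V2)) +
  geom_path(aes(group = indiv), colour = "grey") +
  geom_point(aes(colour = visit), size = 3) +
  scale_colour_manual(values = c("0" = "green", "1" = "red", "2" = "black"))
print(g)
```

Rows whose variables have the same rank order get the same Spearman correlation profile, so their points can land on identical coordinates and overlap in the plot.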
maeVeyable