I want to calculate the Euclidean distance between 12 populations. Each population has 20 samples, and each sample is measured for 100 genes (these are microarray data; the numbers here are just examples).
The equation I found is:
$$\text{distance} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\bar{x}_i - \bar{y}_i\right)^2}$$

where $\bar{x}_i$ and $\bar{y}_i$ are the average expression of gene $i$ over two populations with $p$ and $q$ samples, $(x_1, x_2, \ldots, x_p)$ and $(y_1, y_2, \ldots, y_q)$, and $n$ is the number of genes.
Part of the data is pasted below:
row.names pop1.1 pop1.2 pop1.3 pop1.4 pop2.1 pop2.2 pop2.3 pop2.4
7A5 5.38194 4.06191 4.88044 5.60383 6.23101 6.53738 4.80336 5.86136
A1BG 5.15155 4.29441 4.59131 4.90026 4.62908 4.48712 4.73039 4.46208
A1CF 4.22396 4.14451 4.41465 3.93179 4.89638 4.66109 4.20918 4.48107
A26C3 12.1969 12.4179 10.9786 11.7659 11.405 11.7594 11.1757 11.8128
How might one calculate these distances in R with this data structure?
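
To make the question concrete, here is a rough sketch of the calculation I have in mind, using only the four genes shown above. It assumes that the population label is the part of each column name before the last dot (e.g. `pop1.3` belongs to `pop1`); the object names `expr`, `pop`, `pop_means` and `pop_dist` are just placeholders I made up. It averages each gene within each population, then uses base R's `dist()` on the transposed means and divides by `sqrt(n)` to match the formula. I am not sure this is the best or most idiomatic way to do it.

```r
# Toy data copied from the excerpt above: genes in rows, samples in columns.
expr <- matrix(
  c(5.38194, 4.06191, 4.88044, 5.60383, 6.23101, 6.53738, 4.80336, 5.86136,
    5.15155, 4.29441, 4.59131, 4.90026, 4.62908, 4.48712, 4.73039, 4.46208,
    4.22396, 4.14451, 4.41465, 3.93179, 4.89638, 4.66109, 4.20918, 4.48107,
    12.1969, 12.4179, 10.9786, 11.7659, 11.405,  11.7594, 11.1757, 11.8128),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("7A5", "A1BG", "A1CF", "A26C3"),
    c("pop1.1", "pop1.2", "pop1.3", "pop1.4",
      "pop2.1", "pop2.2", "pop2.3", "pop2.4")
  )
)

# Assumption: the population label is everything before the last dot
# in the column name ("pop1.3" -> "pop1").
pop <- sub("\\.[^.]*$", "", colnames(expr))

# Mean expression of each gene within each population
# (result: genes in rows, populations in columns).
pop_means <- sapply(unique(pop), function(p) {
  rowMeans(expr[, pop == p, drop = FALSE])
})

# dist() computes distances between rows, so transpose to put
# populations in rows. Dividing by sqrt(n) turns
# sqrt(sum(d^2)) into sqrt(sum(d^2) / n), as in the formula.
n_genes <- nrow(pop_means)
pop_dist <- dist(t(pop_means), method = "euclidean") / sqrt(n_genes)

print(pop_dist)
```

With the full data set this should give a 12 x 12 distance structure (viewable with `as.matrix(pop_dist)`), provided the column names follow the same `population.sample` pattern.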