
I have already asked a similar question about how to create the following figure: [figure]. It was suggested that I use the splom() function, but I do not know how to apply it to my data. I have seen examples of splom(), which can be found here and here, but because of my limited programming skills I have not been able to adapt them.

I have 24 time series belonging to 4 independent groups (one pairwise correlation plot per group). The 4 groups are:

1) Frequency = 1 Min., with time series: AAPL_1m, MSFT_1m, INTC_1m, FB_1m, MU_1m, IBM_1m.
2) Frequency = 2 Min., with time series: AAPL_2m, MSFT_2m, INTC_2m, FB_2m, MU_2m, IBM_2m.
3) Frequency = 5 Min., with time series: AAPL_5m, MSFT_5m, INTC_5m, FB_5m, MU_5m, IBM_5m.
4) Frequency = 10 Min., with time series: AAPL_10m, MSFT_10m, INTC_10m, FB_10m, MU_10m, IBM_10m.

Each pairwise plot should show the correlations between the time series within one group. To create each individual pairwise plot I used the following calls:

pairs(cbind(AAPL_1m, MSFT_1m, INTC_1m, FB_1m, MU_1m, IBM_1m), main = "Frequency = 1 Min.", font.labels = 2, col = "blue", pch = 16, cex = 0.8, cex.axis = 1.5, las = 1)
pairs(cbind(AAPL_2m, MSFT_2m, INTC_2m, FB_2m, MU_2m, IBM_2m), main = "Frequency = 2 Min.", font.labels = 2, col = "blue", pch = 16, cex = 0.8, cex.axis = 1.5, las = 1)
pairs(cbind(AAPL_5m, MSFT_5m, INTC_5m, FB_5m, MU_5m, IBM_5m), main = "Frequency = 5 Min.", font.labels = 2, col = "blue", pch = 16, cex = 0.8, cex.axis = 1.5, las = 1)
pairs(cbind(AAPL_10m, MSFT_10m, INTC_10m, FB_10m, MU_10m, IBM_10m), main = "Frequency = 10 Min.", font.labels = 2, col = "blue", pch = 16, cex = 0.8, cex.axis = 1.5, las = 1)

If anyone could suggest how to apply splom() to create the figure shown above, it would be greatly appreciated.

Also, if there is another, more suitable function that can combine four individual pairwise plots (pairs()) into one single figure, I would be glad to use it.

Robin Hood

2 Answers


Some demo data would have been nice to have, so let's generate some first. Only three variables and two groups are used here:

AAPL_1m <- rnorm(1000)
MSFT_1m <- rnorm(1000)
INTC_1m <- rnorm(1000)
AAPL_2m <- rnorm(1000)
MSFT_2m <- rnorm(1000)
INTC_2m <- rnorm(1000)

For splom() to work you need to generate a grouping variable. Here there are 1000 observations from the 1m group and another 1000 observations from the 2m group, so the grouping variable is simply a vector of 1000 "1m" values followed by 1000 "2m" values:

group<-c(rep("1m", 1000), rep("2m", 1000))

In your case the grouping variable might be generated as follows:

group<-c(rep("1m", length(AAPL_1m)), rep("2m", length(AAPL_2m)))

Once you have the grouping variable, bind everything into a single data frame as follows:

dat<-data.frame(AAPL=c(AAPL_1m, AAPL_2m), MSFT=c(MSFT_1m, MSFT_2m), INTC=c(INTC_1m, INTC_2m), group=group)

Once you have a single data frame with the grouping variable giving the groups of observations, you can plot the scatterplot matrices:

library(lattice)
# Three first columns of the data plotted conditional on the grouping
splom(~dat[,1:3]|group)

The resulting plot should appear roughly as follows:

[figure: scatterplot matrices for the 1m and 2m groups]

This needs to be generalized to your four batches of data, but that should be straightforward: generate the grouping variable for four batches and bind all four batches together. The splom() function also has many more arguments that you can use, e.g., to make the plot prettier.
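As an illustration of those extra arguments, here is a minimal sketch on simulated data. The data, the title text and the particular styling choices are assumptions made for the example; `pscales`, `varnames`, `pch`, `col`, `cex`, `xlab` and `main` are ordinary splom() arguments.

```r
library(lattice)

# Simulated stand-in data (assumption: the real series are numeric vectors)
set.seed(1)
n <- 200
dat <- data.frame(
  AAPL  = rnorm(2 * n),
  MSFT  = rnorm(2 * n),
  INTC  = rnorm(2 * n),
  group = rep(c("1m", "2m"), each = n)
)

p <- splom(
  ~ dat[, 1:3] | group,
  data     = dat,
  pch      = 16,       # filled points, as in the original pairs() calls
  col      = "blue",
  cex      = 0.5,
  pscales  = 0,        # suppress the axis ticks drawn inside the diagonal panels
  varnames = c("AAPL", "MSFT", "INTC"),
  xlab     = "",       # drop the default "Scatter Plot Matrix" label
  main     = "Pairwise scatterplots by sampling frequency"  # hypothetical title
)
print(p)
```

Setting `pscales = 0` only hides the tick marks and tick labels; it does not move them outside the matrix the way pairs() draws them.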

  • JTT, thank you for the help. I followed your instructions and it worked. The plot's axes and labels are not pretty, so I will have to improve them in order to put them outside of the plot. – Robin Hood Aug 01 '14 at 12:49

JTT gave an accurate explanation of how splom() should be applied to this problem. The following code extends JTT's code to all four groups and six series.

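library(lattice)

# One label per observation, in the same order as the stacked series below
group <- c(rep("Frequency = 1 Min.", length(AAPL_1m)),
           rep("Frequency = 2 Min.", length(AAPL_2m)),
           rep("Frequency = 5 Min.", length(AAPL_5m)),
           rep("Frequency = 10 Min.", length(AAPL_10m)))

# Stack the four frequencies of each ticker into one column
dat <- data.frame(AAPL = c(AAPL_1m, AAPL_2m, AAPL_5m, AAPL_10m),
                  MSFT = c(MSFT_1m, MSFT_2m, MSFT_5m, MSFT_10m),
                  INTC = c(INTC_1m, INTC_2m, INTC_5m, INTC_10m),
                  FB = c(FB_1m, FB_2m, FB_5m, FB_10m),
                  MU = c(MU_1m, MU_2m, MU_5m, MU_10m),
                  IBM = c(IBM_1m, IBM_2m, IBM_5m, IBM_10m),
                  group = group)

# Scatterplot matrix of the six series, one panel per frequency group
splom(~dat[, 1:6] | group)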

The code produces the following figure: [figure: scatterplot matrices, one per frequency group]
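The manual stacking can also be written as a loop over the frequencies. The following is only a sketch on simulated data: it assumes the series are kept in a named list (here called `series`, a hypothetical name) rather than as separate variables, and that every series within a frequency has the same length.

```r
tickers <- c("AAPL", "MSFT", "INTC", "FB", "MU", "IBM")
freqs   <- c("1m", "2m", "5m", "10m")

# Simulated stand-ins for the real series, stored in a named list
# with names such as "AAPL_1m", "MSFT_1m", ..., "IBM_10m"
set.seed(1)
series_names <- as.vector(outer(tickers, freqs, paste, sep = "_"))
series <- setNames(lapply(series_names, function(nm) rnorm(50)), series_names)

# One data frame per frequency, with columns renamed to the bare ticker
blocks <- lapply(freqs, function(f) {
  cols <- series[paste(tickers, f, sep = "_")]
  names(cols) <- tickers
  data.frame(cols, group = f)
})

# Stack the per-frequency blocks into one long data frame
dat <- do.call(rbind, blocks)
```

The resulting `dat` has the same layout as the hand-built data frame: six ticker columns plus a `group` column, so it can be passed to splom() in the same way.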

Still, there should be some improvements regarding:

  1. The x and y axes and labels should be placed outside the plots (as shown in the question).
  2. The order of the pairwise plots should be changed (the top left corner should be "Frequency = 1 Min.", the top right corner should be "Frequency = 2 Min.", and so on).
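For the panel order, one possible approach is sketched below on simulated data (three series only, for brevity). Lattice orders panels by the levels of the conditioning factor and, by default, fills the layout starting from the bottom left. A character grouping variable is sorted as text, so "10 Min." ends up before "2 Min.". Setting the factor levels explicitly and passing `as.table = TRUE` addresses both points. The variable `labels` is a name introduced for this example.

```r
library(lattice)

# Simulated stand-in data (assumption: the real series are numeric vectors)
set.seed(1)
n <- 100
labels <- c("Frequency = 1 Min.", "Frequency = 2 Min.",
            "Frequency = 5 Min.", "Frequency = 10 Min.")
dat <- data.frame(
  AAPL  = rnorm(4 * n),
  MSFT  = rnorm(4 * n),
  INTC  = rnorm(4 * n),
  group = rep(labels, each = n)
)

# Fix the panel order explicitly instead of relying on text sorting
dat$group <- factor(dat$group, levels = labels)

p <- splom(
  ~ dat[, 1:3] | group,
  data     = dat,
  as.table = TRUE,     # fill panels left to right, top to bottom
  layout   = c(2, 2)   # 2 columns by 2 rows
)
print(p)
```

With these settings the 1 Min. panel is drawn top left, 2 Min. top right, 5 Min. bottom left and 10 Min. bottom right. This sketch does not address moving the axes outside the matrices.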
Robin Hood