I have a few files with zoo objects that look like this (each file starts and ends at different dates):

           code pp  
1942-06-01 4016  0  
1942-06-02 4016  NA  
1942-06-03 4016  0  
1942-06-04 4016  0  
1942-06-05 NA    0  
1942-06-06 NA    0

I want to compute a correlation matrix between the pp columns of all files for the months of September, October and November, labelled with the codes so I can tell which series is which. I cannot use the list.files-based code kindly supplied by Joran in "Correlation matrix between different files" because of the NAs in the code column. So I came up with the following code:

files <- list.files(pattern=".csv")
xx<-read.zoo(files[1],sep=",", header=TRUE,index.column=1)  
name<- as.name(xx$code[[1]])  
colnames(xx) <- c("code", name)  
x<-xx[months(time(xx), TRUE) %in% c("Sep", "Oct", "Nov")]  
yy<-read.zoo(files[2],sep=",", header=TRUE,index.column=1)  
name<- as.name(yy$code[1])  
colnames(yy) <- c("code", name)  
y<-yy[months(time(yy), TRUE) %in% c("Sep", "Oct", "Nov")]  
CET<-merge(x, y, all = TRUE, fill = NA, check.names=FALSE)  
for (i in 3:length(files))  
{
  z<-read.zoo(files[i],sep=",", header=TRUE,index.column=1)  
  name<- as.name(z$code[1])  
  colnames(z) <- c("code", name)  
  CET<-merge(CET, z, all = TRUE, fill = NA, check.names=FALSE)  
}  
a<-1:(dim(CET)[2])  
even <- a[ a%%2 == 0 ]    
# saves the precipitation column (even numbers) and discards the code ones
dat<-CET[,even]
c.mat<-cor(dat,use="pairwise.complete.obs" )

But something is wrong: in the correlation matrix some of the column/row names have an extra ".z" or ".CET" suffix and, most importantly, the correlation coefficients are not correct! I cannot find the problem, so any help finding it, or a simpler way to do this, would be much appreciated!
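The extra suffixes can be reproduced on a small scale. The sketch below uses two made-up series (the values and the code 4020 are invented for illustration) that share the column names "code" and "pp", as the files in the question do. As far as I understand merge.zoo, it makes clashing column names unique by appending a suffix taken from the object names, which would explain the ".z" and ".CET" endings.

```r
library(zoo)

# Two tiny made-up series that share the column names "code" and "pp".
dates <- as.Date("1942-09-01") + 0:2
x <- zoo(cbind(code = 4016, pp = c(0, 1, 3)), order.by = dates)
z <- zoo(cbind(code = 4020, pp = c(2, 0, 5)), order.by = dates)

# Same merge call as in the question: both objects carry "code" and "pp".
m <- merge(x, z, all = TRUE, fill = NA, check.names = FALSE)

merged_names <- colnames(m)
print(merged_names)  # the clashing names come back with suffixes attached
```

Dropping the code column and giving each pp column a unique name before merging should avoid the clash altogether.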

sbg
  • -1 not reproducible and "not correct" is not a sufficient explanation of what is wrong. – G. Grothendieck Jun 18 '11 at 21:26
  • @Grothendieck - "not correct" because the values are not the same as when calculated manually/using minitab. Sorry but don't really know how to make this question reproducible... – sbg Jun 18 '11 at 22:51

1 Answer

I don't know why, but the correlations come out correct if, instead of extracting the months I want from each file and then merging them into one object, I merge the files first and only then extract the months. What I mean is:

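# match file names ending in .csv (the dot is escaped so it is literal)
files <- list.files(pattern="\\.csv$")
x<-read.zoo(files[1],sep=",", header=TRUE,index.column=1)
y<-read.zoo(files[2],sep=",", header=TRUE,index.column=1)
# merge everything first, without any month filtering
CET<-merge(x, y, all = TRUE, fill = NA, check.names=FALSE)
for (i in 3:length(files))  # assumes there are at least three files
{
  z<-read.zoo(files[i],sep=",", header=TRUE,index.column=1)
  CET<-merge(CET, z, all = TRUE, fill = NA, check.names=FALSE)
}
a<-1:(dim(CET)[2])
even <- a[ a%%2 == 0 ]
# keep the precipitation columns (even positions), drop the code columns
dat<-CET[,even]
# only now restrict to September, October and November
dat.aut<-dat[months(time(dat), TRUE) %in% c("Sep", "Oct", "Nov")]
c.mat<-cor(dat.aut,use="pairwise.complete.obs" )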
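A likely reason the first version gave different numbers: in the question's loop, only the first two files were restricted to Sep-Nov, while every later z was merged unfiltered. Below is a more compact sketch of the merge-then-filter idea, not a tested replacement for the code above. The helper make_station and its data are hypothetical stand-ins for the CSV files, and the second code 4020 is invented. Real objects would come from read.zoo as before.

```r
library(zoo)

# Hypothetical stand-in for one station file. With real data each object
# would come from read.zoo(file, sep = ",", header = TRUE, index.column = 1).
make_station <- function(start, code, pp) {
  dates <- seq(as.Date(start), by = "day", length.out = length(pp))
  code_col <- rep(code, length(pp))
  code_col[length(pp)] <- NA  # mimic the NAs in the code column
  zoo(cbind(code = code_col, pp = pp), order.by = dates)
}

stations <- list(
  make_station("1942-08-30", 4016, c(0, 1, 3, NA, 2, 5, 0, 4)),
  make_station("1942-08-31", 4020, c(1, 2, 6, 1, NA, 1, 3, 9))
)

# Label each series with its first non-missing station code.
codes <- sapply(stations, function(z) {
  as.character(na.omit(coredata(z[, "code"]))[1])
})

# Keep only the pp column so the merge has no clashing column names.
pp_only <- lapply(stations, function(z) z[, "pp"])
names(pp_only) <- codes

# Merge all series on the union of their dates.
merged <- do.call(merge, c(pp_only, list(all = TRUE, fill = NA)))
colnames(merged) <- codes  # set the labels explicitly

# Filter once, after merging; "%m" avoids locale-dependent month names.
autumn <- merged[format(time(merged), "%m") %in% c("09", "10", "11"), ]

c.mat <- cor(coredata(autumn), use = "pairwise.complete.obs")
print(c.mat)
```

Because the code column is dropped before merging, there is no need to pick out the even-numbered columns afterwards, and the matrix is labelled by station code.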

sbg