I work in healthcare and need help with R. I have a data set like this:
S1 S2 S3 S4 S5
0.498 1.48 1.43 0.536 0.548
2.03 1.7 3.74 2.13 2.02
0.272 0.242 0.989 0.534 0.787
0.986 2.03 2.53 1.65 2.31
0.307 0.934 0.633 0.36 0.281
0.78 0.76 0.706 0.81 1.11
0.829 2.03 0.667 1.48 1.42
0.497 1.27 0.952 1.23 1.73
0.553 0.286 0.513 0.422 0.573
Here are my objectives:
Compute the correlation between every pair of columns
Calculate the p-values
Calculate R-squared
Only show the pairs where R2 > 0.5 and p-value < 0.05
Here is my code so far (it's not the most efficient, but it works):
> e<-read.table('Workbook8nm.csv', header=TRUE, sep=",", dec=".", na.strings="NA")
> f<-data.frame(e)
> M<-cor(f, use="complete.obs") # Correlation matrix, as I want it
> library('psych')
> N<-corr.test(f) # Gives me the p-values
So far I have my correlations in M and my p-values in N. How can I get R2?
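For reference, a minimal sketch of the R2 step, assuming R2 here means the squared Pearson coefficient of each pair of columns (which is what it equals for a simple linear fit of one column on another). The toy data frame is just the first four rows of S1 to S3 from the table above, and the name `R2` is illustrative:

```r
# Toy data: first four rows of S1..S3 from the table above
f <- data.frame(
  S1 = c(0.498, 2.03, 0.272, 0.986),
  S2 = c(1.48, 1.7, 0.242, 2.03),
  S3 = c(1.43, 3.74, 0.989, 2.53)
)

M  <- cor(f, use = "complete.obs")  # Pearson r for every pair of columns
R2 <- M^2                           # element-wise square: R-squared per pair
round(R2, 3)
```

Because `^` works element-wise on a matrix, `R2` keeps the same row and column names as `M`.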
Second, how can I make R show only the pairs where R2 > 0.5 and p-value < 0.05? As a test, I used this line:
P<-M[which(M>0.9)]
That line was meant to show only the cases where the Pearson coefficient is above 0.9. But it just gives me a vector of every value above 0.9, so I don't know which pair of columns each coefficient comes from. Ideally, the significant values would be shown in a table with the column names, so that I can easily identify them. I want this because my table is 570 by 570, so I can't look through every p-value by hand to keep only the significant ones.
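To make the goal concrete, here is a rough sketch of the kind of table I mean, built with `which(..., arr.ind = TRUE)` so that the row and column positions (and therefore the column names) are kept. It assumes there are no missing values, so every pair uses the same n, and it computes the two-sided p-value directly from r in base R rather than with psych. The names `pairs_df`, `sig`, `col1` and `col2` are made up for illustration:

```r
# Toy data: the S1..S5 table from above (no missing values)
f <- data.frame(
  S1 = c(0.498, 2.03, 0.272, 0.986, 0.307, 0.78, 0.829, 0.497, 0.553),
  S2 = c(1.48, 1.7, 0.242, 2.03, 0.934, 0.76, 2.03, 1.27, 0.286),
  S3 = c(1.43, 3.74, 0.989, 2.53, 0.633, 0.706, 0.667, 0.952, 0.513),
  S4 = c(0.536, 2.13, 0.534, 1.65, 0.36, 0.81, 1.48, 1.23, 0.422),
  S5 = c(0.548, 2.02, 0.787, 2.31, 0.281, 1.11, 1.42, 1.73, 0.573)
)

M  <- cor(f)          # Pearson r for every pair of columns
n  <- nrow(f)         # same n for every pair (assumes no NAs)

lt  <- lower.tri(M)   # each pair once, diagonal excluded
idx <- which(lt, arr.ind = TRUE)   # row/col position of each pair
r   <- M[lt]          # same column-major order as idx

# Two-sided p-value from the t statistic of a Pearson correlation
tstat <- r * sqrt((n - 2) / (1 - r^2))
p     <- 2 * pt(-abs(tstat), df = n - 2)

pairs_df <- data.frame(
  col1 = rownames(M)[idx[, 1]],
  col2 = colnames(M)[idx[, 2]],
  r    = r,
  R2   = r^2,
  p    = p
)

# Keep only the pairs that pass both thresholds, smallest p first
sig <- pairs_df[pairs_df$R2 > 0.5 & pairs_df$p < 0.05, ]
sig <- sig[order(sig$p), ]
sig
```

With the psych result, the same indexing should work on `N$r` and `N$p` instead of computing p by hand. As far as I understand, `corr.test` puts the unadjusted p-values below the diagonal and the adjusted ones above it, so using the lower triangle would pick the raw values. Note that these p-values are not corrected for multiple testing, which probably matters with 570 columns.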
I hope that was clear! This is my first post here, so tell me if I made any mistakes.
Thanks for your help!