I am analyzing this dataset, which has numeric and factor variables. I would like to know the correlations so I can choose the best variables.

str(data)
$ Ag                    : num [1:1470] 41 49 37 33 27 32 59 30 38 36 ...
 $ Ay              : Factor w/ 2 levels "No","Yes": 2 1 2 1 1 1 1 1 1 1 ...
 $ Bu        : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 3 2 3 2 3 2 3 3 2 3 ...
 $ Di       : num [1:1470] 1 8 2 3 2 2 3 24 23 27 ...
 $ Ed               : num [1:1470] 2 1 2 4 1 2 3 1 3 3 ...
 $ Ep          : num [1:1470] 1 1 1 1 1 1 1 1 1 1 ...
 $ Em          : num [1:1470] 1 2 4 5 7 8 10 11 12 13 ...
 $ Ge                : Factor w/ 2 levels "Female","Male": 1 2 2 1 2 2 1 2 2 2 ...
 $ Ho             : num [1:1470] 94 61 92 56 40 79 81 67 44 94 ...
 $ J1         : num [1:1470] 3 2 2 3 3 3 4 3 2 3 ...
 $ J2               : num [1:1470] 2 2 1 1 1 1 1 1 3 2 ...

When I execute this (although I want correlations for all of the data, not only the numeric columns):

cor(data[sapply(data, is.numeric)])

I get this message:

Warning message:
In cor(data[sapply(data, is.numeric)]) :
  the standard deviation is zero
Raq

1 Answer

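It just politely lets you know that you set out to calculate a correlation where one of the variables is constant. This is often pointless: the correlation divides by each variable's standard deviation, so for a constant column it is undefined and cor() returns NA for it. In the str() output above, Ep shows only 1s in the values displayed, so it is a likely candidate.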

Just filter those out as well:


# Keep only the numeric columns
x1 <- data[sapply(data, is.numeric)]
# Drop constant columns (standard deviation of zero)
x2 <- x1[sapply(x1, sd) != 0]

cor(x2)
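
For anyone who wants to try the filtering step without the original data, here is a minimal sketch on a made-up data frame. The name `toy` and all of its values are invented for illustration and only mimic a few columns from the question. The `na.rm = TRUE` and `use = "pairwise.complete.obs"` arguments are my own additions, on the assumption that the real data might contain missing values; the original answer does not use them.

```r
# Toy data standing in for the asker's data frame (all values are made up).
toy <- data.frame(
  Ag = c(41, 49, 37, 33, 27),
  Ep = c(1, 1, 1, 1, 1),   # constant column: its standard deviation is 0
  Ho = c(94, 61, 92, 56, 40),
  Ge = factor(c("Female", "Male", "Male", "Female", "Male"))
)

# Keep only the numeric columns (the factor Ge is dropped here).
num <- toy[sapply(toy, is.numeric)]

# Standard deviation of each column, ignoring missing values.
sds <- vapply(num, sd, numeric(1), na.rm = TRUE)

# Keep columns whose standard deviation is defined and non-zero.
keep <- !is.na(sds) & sds > 0
dropped <- names(num)[!keep]

# Correlation matrix of the remaining columns.
res <- cor(num[keep], use = "pairwise.complete.obs")
print(dropped)
print(res)
```

Printing `dropped` is a quick way to see which columns were constant before deciding whether leaving them out is acceptable for the analysis.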

Sirius
  • Thanks for your answer, @Sirius. When I run this code I get the following error: Error in match.fun(FUN) : argument "FUN" is missing, with no default – Raq Mar 16 '21 at 11:26
  • I would need to see some minimal real data that gives you this error – Sirius Mar 16 '21 at 12:05