I want to calculate the correlation matrix of a Spark table from R.
I tried using cor(), as I would on a regular R data frame,
but it does not work. Here is the code:
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
flights_tbl <- copy_to(sc, nycflights13::flights, "flights")
data = flights_tbl
numeric_data = select_if(data, function(col) is.numeric(col))
Then I ran cor(numeric_data), and this is what I get:
> cor(numeric_data)
Error in cor(numeric_data) : supply both 'x' and 'y' or a matrix-like 'x'
I am using:
Spark 2.0.2
dplyr 0.7.2
sparklyr 0.7.0-9000
So how can I get the correlation matrix of a Spark table?