
I am trying to rename the columns of a Spark table created with sparklyr. Here is the code:

    library(sparklyr)
    sc <- spark_connect(master = "local", config = list())
    iris_tbl <- copy_to(sc, iris, overwrite = T)
    newColList <- c("a", "b" , "c" , "d" , " e")
    colnames(iris_tbl) <- newColList 

Error:

    Error in `colnames<-`(`*tmp*`, value = c("a", "b", "c", "d", " e")) :
      'dimnames' applied to non-array


2 Answers


`names(iris_tbl) <- newColList` works, but I think a better answer would use `%>%` and `dplyr::rename`.
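
A minimal sketch of that approach, assuming `copy_to()` replaced the dots in the iris column names with underscores (`Sepal_Length`, `Sepal_Width`, and so on):

    library(dplyr)

    # rename(new = old) on a tbl_spark is translated by sparklyr into a
    # Spark SQL "SELECT ... AS ...", so nothing is mutated in place
    iris_tbl <- iris_tbl %>%
      rename(a = Sepal_Length,
             b = Sepal_Width,
             c = Petal_Length,
             d = Petal_Width,
             e = Species)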

  • `names(iris_tbl) <- newColList` is throwing error: `Error: org.apache.spark.sql.AnalysisException: cannot resolve 'e' given input columns: [b, d, e, a, c];` at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:60) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57) – Priyanka May 02 '17 at 10:29

I've been searching around for this all day. Right now my best solution is a custom function that goes directly to the Spark API:

    sdf_write_colnames <- function(in_tbl, new_names) {
      # name under which the table is currently registered in Spark
      sdf_name <- as.character(in_tbl$ops$x)

      # call the underlying Spark DataFrame's toDF() to rename every column,
      # then re-register the result under the original name
      in_tbl %>%
        spark_dataframe() %>%
        invoke("toDF", as.list(new_names)) %>%
        sdf_register(name = sdf_name)
    }

    iris_tbl <- sdf_write_colnames(iris_tbl, c("a", "b", "c", "d", "e"))

    head(iris_tbl)

With a bit of effort it could be made to work more like `colnames() <-`.
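
For example, a rough sketch of a replacement-function wrapper (the name `sdf_colnames<-` is made up here, not part of sparklyr) that allows assignment syntax:

    # hypothetical wrapper: lets the helper above be called with
    # assignment syntax, i.e. sdf_colnames(tbl) <- new_names
    `sdf_colnames<-` <- function(x, value) {
      sdf_write_colnames(x, value)
    }

    sdf_colnames(iris_tbl) <- c("a", "b", "c", "d", "e")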

  • I'll leave this up in case it's of use, but I've had a few problems with this. Not sure all the registering should be necessary. – dougmet May 19 '17 at 21:56