Hello, I am just getting started with sparklyr, and I get an error when I try to use dplyr to wrangle some data.

library(sparklyr)

sc <- spark_connect(master = "local")

spark_read_csv(sc, "df2_tbl",
               "C:/Users/...csv")

spark_read_csv(sc, "df_n2_tbl",
               "C:/Users/...csv")

I see the tables "df2_tbl" and "df_n2_tbl" in the "Connections" tab (next to "Environment" and "History") as well as in the Spark UI, but when I run the following

match_cat <- df_n2_tbl %>% 
         filter(var1 %in% df2_tbl) %>% 
         collect()

I get this error:

"Error in eval(lhs, parent, parent) : object 'df_n2_tbl' not found"
Kreitz Gigs

1 Answer


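I needed to assign the result of each spark_read_csv() call to an R object. The name argument (for example "df2_tbl") only registers the table inside Spark; it does not create an R variable with that name, which is why R reported that the object 'df_n2_tbl' was not found.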

library(sparklyr)

sc <- spark_connect(master = "local")

df1 <- spark_read_csv(sc, "df2_tbl", 
"C:/Users/...csv")

df2 <- spark_read_csv(sc, "df_n2_tbl", 
"C:/Users/...csv")
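
With the tables assigned to R objects, the original pipeline may still need a small change, because `%in%` expects a vector of values on its right-hand side rather than a whole Spark table. The following is only a minimal sketch under assumptions: it supposes that both tables contain a column called var1 (the question only shows var1 in one of them), and it swaps the `filter(... %in% ...)` step for dplyr's `semi_join()`, which keeps the rows of the first table that have a match in the second. The file paths are left elided exactly as in the question.

```r
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

# Assigning the result gives R a handle to each Spark table.
df2_tbl <- spark_read_csv(sc, "df2_tbl",
                          "C:/Users/...csv")

df_n2_tbl <- spark_read_csv(sc, "df_n2_tbl",
                            "C:/Users/...csv")

# A table that is already registered in Spark can also be picked up
# by name instead of being read again:
# df2_tbl <- tbl(sc, "df2_tbl")

# Assumption: both tables have a column named var1.
# semi_join() keeps the rows of df_n2_tbl whose var1 also occurs in df2_tbl.
match_cat <- df_n2_tbl %>%
  semi_join(df2_tbl, by = "var1") %>%
  collect()

spark_disconnect(sc)
```

If the matching column has a different name in df2_tbl, the `by` argument would need to map the two names, for example `by = c("var1" = "other_name")`, where other_name is a hypothetical placeholder.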
Kreitz Gigs