I'm attempting to run some code from my Databricks notebook in an IDE using Databricks Connect. I can't seem to figure out how to create a simple DataFrame.
Using:
import spark.implicits._
var Table_Count = Seq((cdpos_df.count(),I_count,D_count,U_count)).toDF("Table_Count","I_Count","D_Count","U_Count")
gives the error message: value toDF is not a member of Seq[(Long, Long, Long, Long)].
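For context, my understanding is that toDF only becomes available when the implicits are imported from a concrete SparkSession value that is in scope. A minimal sketch of the setup I think Databricks Connect expects (the app name and the placeholder counts are mine, standing in for cdpos_df.count(), I_count, D_count, and U_count):

```scala
import org.apache.spark.sql.SparkSession

// Build (or attach to) a SparkSession explicitly; in a notebook this
// value exists already, but in an IDE it has to be created.
val spark = SparkSession.builder()
  .appName("TableCountExample") // hypothetical app name
  .getOrCreate()

// The implicits must come from this specific SparkSession instance.
import spark.implicits._

// Placeholder literals stand in for the real counts.
val tableCount = Seq((100L, 10L, 5L, 2L))
  .toDF("Table_Count", "I_Count", "D_Count", "U_Count")
```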
Trying to create the dataframe from scratch:
var dataRow = Seq((cdpos_df.count(), I_count, D_count, U_count))
var schemaRow = List(
  StructField("Table_Count", LongType, true),
  StructField("I_Count", LongType, true),
  StructField("D_Count", LongType, true),
  StructField("U_Count", LongType, true)
)
var TableCount = spark.createDataFrame(
  sc.parallelize(dataRow),
  StructType(schemaRow)
)
gives the error message:
overloaded method value createDataFrame with alternatives:
(data: java.util.List[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.api.java.JavaRDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.rdd.RDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rows: java.util.List[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame
cannot be applied to (org.apache.spark.rdd.RDD[(Long, Long, Long, Long)], org.apache.spark.sql.types.StructType)
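If I've read the overload list correctly, the StructType variant wants an RDD[Row] rather than an RDD of tuples, so the values would need to be wrapped in a Row first. A sketch of what I believe that would look like (placeholder literals stand in for my real counts):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Wrap the tuple's values in a Row so the RDD matches the
// createDataFrame(rowRDD: RDD[Row], schema: StructType) overload.
val dataRow = Seq(Row(100L, 10L, 5L, 2L)) // placeholders for the real counts

val schemaRow = StructType(List(
  StructField("Table_Count", LongType, true),
  StructField("I_Count", LongType, true),
  StructField("D_Count", LongType, true),
  StructField("U_Count", LongType, true)
))

val tableCount = spark.createDataFrame(
  sc.parallelize(dataRow),
  schemaRow
)
```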