I want to convert my results1 numpy array to a dataframe. For the record, results1 looks like
array([(1.0, 0.1738578587770462), (1.0, 0.33307021689414978),
       (1.0, 0.21377330869436264), (1.0, 0.443511435389518738),
       (1.0, 0.3278091162443161), (1.0, 0.041347454154491425)]).
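(For a reproducible example, results1 can be approximated with a structured array like the sketch below; the field names "f0" and "f1" are placeholders I made up, and the dtype may differ in my real array, which comes from an earlier computation.)

import numpy as np

# Approximate stand-in for results1; field names are placeholders
results1 = np.array(
    [(1.0, 0.1738578587770462), (1.0, 0.33307021689414978),
     (1.0, 0.21377330869436264), (1.0, 0.443511435389518738),
     (1.0, 0.3278091162443161), (1.0, 0.041347454154491425)],
    dtype=[("f0", "f8"), ("f1", "f8")],
)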
I want to convert the above to a PySpark DataFrame with columns labeled "limit" (the first value in each tuple) and "probability" (the second value in each tuple). Here is what I tried:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('YKP').getOrCreate()
sc = spark.sparkContext
# Convert the numpy array to an RDD
rdd = sc.parallelize(results1)
# Create a DataFrame from the RDD
df = sc.createDataFrame(rdd)
I keep getting the error
AttributeError: 'RemoteContext' object has no attribute 'createDataFrame'
when I run this. I don't see why this is failing. How do I fix it?
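From the docs it looks like createDataFrame belongs to the SparkSession rather than the SparkContext, so I suspect the fix is something like the sketch below (converting the numpy rows to plain Python values with .tolist() is my guess, in case Spark doesn't accept numpy scalar types directly), but I'd appreciate confirmation:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('YKP').getOrCreate()
sc = spark.sparkContext

# .tolist() turns the numpy rows into plain Python tuples/floats
rows = results1.tolist()
rdd = sc.parallelize(rows)

# createDataFrame is a method of the SparkSession, not the SparkContext
df = spark.createDataFrame(rdd, ["limit", "probability"])
df.show()

I think spark.createDataFrame(rows, ["limit", "probability"]) would also work directly, without the explicit parallelize step, if that is cleaner.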