I am trying to convert the PipelinedRDD below into a DataFrame.
PipelinedRDD -> user_rdd
['new_user1',
'new_user2',
'Onlyknows',
'Icetea',
'_coldcoffee_']
I tried to convert it using the code below:
schema = StructType([StructField('Username', StringType(), True)])
user_df = sqlContext.createDataFrame(user_rdd, schema)
user_df.show(20)
I am getting the below error:
ValueError: Unexpected tuple 'new_user1' with StructType
I also tried using toDF():
user_df = user_rdd.toDF()
This time the error encountered is:
TypeError: Can not infer schema for type: <type 'str'>
Is there a way to convert this RDD to a DataFrame using PySpark?
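Both errors seem to come from the same cause: each element of user_rdd is a bare string, while createDataFrame and toDF expect row-like elements (tuples, lists, or Row objects), even for a single column. A minimal sketch of one possible fix follows, assuming user_rdd holds plain strings as shown and sqlContext is the existing SQLContext. The helper names as_row and usernames_to_df are hypothetical, introduced only for illustration.

```python
def as_row(username):
    """Wrap a bare string in a one-element tuple so it matches a one-field schema."""
    return (username,)


def usernames_to_df(sql_context, user_rdd):
    """Convert an RDD of plain strings into a single-column DataFrame.

    sql_context is assumed to be an object with a createDataFrame method,
    such as the sqlContext used in the question.
    """
    # Imported here so as_row can be tried without a Spark installation.
    from pyspark.sql.types import StructType, StructField, StringType

    schema = StructType([StructField('Username', StringType(), True)])
    # Turn each string into a 1-tuple before applying the schema.
    row_rdd = user_rdd.map(as_row)
    return sql_context.createDataFrame(row_rdd, schema)
```

With this in place, user_df = usernames_to_df(sqlContext, user_rdd) followed by user_df.show(20) should display a single Username column. The same wrapping step should also let toDF work, for example user_rdd.map(as_row).toDF(['Username']).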