I have the following RDD of Rows. As can be seen, every field is a string:
[Row(A='6', B='1', C='hi'),
Row(A='4', B='5', C='bye'),
Row(A='8', B='9', C='night')]
I want to convert this RDD into a DataFrame with IntegerType for columns A and B:
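from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Target schema: A and B as integers, C as a string
dtypes = [
    StructField('A', IntegerType(), True),
    StructField('B', IntegerType(), True),
    StructField('C', StringType(), True)
]
df = spark.createDataFrame(rdd, StructType(dtypes))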
I get the following error:
TypeError: field A: IntegerType can not accept object '6' in type <class 'str'>
How can I successfully convert '6' into an IntegerType?
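For reference, here is a minimal sketch of one direction that might work. It assumes the error comes from createDataFrame checking each value against the schema without casting it, so the strings would have to become Python ints before the schema is applied. The helper names to_typed and to_dataframe are hypothetical, made up for this illustration; spark and rdd are the session and RDD from above.

```python
def to_typed(row):
    """Cast the string fields A and B to int; leave C as a string."""
    # Works for a pyspark Row (which supports lookup by field name) or a plain dict.
    return (int(row["A"]), int(row["B"]), row["C"])


def to_dataframe(spark, rdd):
    """Build the typed DataFrame from the RDD of all-string Rows."""
    # Imported here so the helper above can be tried without a Spark install.
    from pyspark.sql.types import IntegerType, StringType, StructField, StructType

    schema = StructType([
        StructField("A", IntegerType(), True),
        StructField("B", IntegerType(), True),
        StructField("C", StringType(), True),
    ])
    # Convert each Row before createDataFrame verifies it against the schema.
    return spark.createDataFrame(rdd.map(to_typed), schema)
```

It would be called as df = to_dataframe(spark, rdd). Note that int() raises on values that are None or not numeric, so rows like that would need separate handling.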