I'm trying to create a DataFrame from two custom sentences, just as a test, but the code I wrote fails to create it.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('first').getOrCreate()
df = spark.createDataFrame(
    [
        (0, "Hi this is a Spark tutorial"),
        (1, "This tutorial is made in Python language")
    ],
    ['id', 'sentence']
)
df.show()
This gives me this error:
Py4JJavaError: An error occurred while calling o73.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2) (executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
I also tried defining an explicit schema
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

schema = StructType(
    [
        StructField("id", IntegerType(), True),
        StructField("sentence", StringType(), True)
    ]
)
and passing it as the argument schema=schema, but I hit the same dead end.
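For reference, a suggestion that often comes up for "Python worker failed to connect back" is that Spark may not be launching its workers with the same Python interpreter as the driver. I have not confirmed that this is the cause here. A minimal sketch of that check, assuming an interpreter mismatch is the problem, is below. The helper name use_current_interpreter is hypothetical; PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are the environment variables PySpark reads.

```python
import os
import sys


def use_current_interpreter():
    """Hypothetical helper: point PySpark at the interpreter running this script.

    Assumption: the error comes from workers starting under a different
    (or missing) Python executable than the driver.
    """
    os.environ["PYSPARK_PYTHON"] = sys.executable
    os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable
    return sys.executable


# Call this before SparkSession.builder...getOrCreate() so the
# settings are in place when the session starts.
interpreter = use_current_interpreter()
print("PySpark will use:", interpreter)
```

If a session is already running, it would need to be stopped and recreated for the new settings to take effect.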