I am facing an error while running the following PySpark program. My setup:
OS Windows 10
Java version 8
Spark version 2.4.0
Python version 3.6
CODE:
from pyspark.context import SparkContext

sc = SparkContext.getOrCreate()
# Read the text file into an RDD and count its lines
textFile = sc.textFile(r"file.txt")
textFile.count()
ERROR:
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<ipython-input-7-99998e5c7b17> in <module>()
----> 1 textFile.count()
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 4, localhost, executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:170)...
Many people with the same problem solved it by switching to Java 8, but I am already on Java 8 and still get this error.
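In case it helps, this is roughly how I check which Python interpreter and environment variables my Spark session actually picks up (PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are, as far as I understand, what the workers and driver use, so a mismatch there is my best guess; I am not certain these are set correctly on my machine):

import os
import sys
from pyspark.context import SparkContext

# Print the interpreter the driver runs on and the env vars Spark reads;
# a wrong or missing PYSPARK_PYTHON might explain why the worker cannot connect back.
print(sys.executable)
for var in ("JAVA_HOME", "SPARK_HOME", "PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON"):
    print(var, "=", os.environ.get(var))

sc = SparkContext.getOrCreate()
print("Spark version:", sc.version)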
Any help appreciated.
Thanks.