I'm new to PySpark/Spark and am practicing with a text file that contains just 5 lines of plain text. Below is the code:
text_rdd = sc.textFile(file_path)
text_rdd.collect()  # This collect() works fine and shows the data
text_rdd.flatMap(lambda x: x.split(" ")).collect()  # This collect() throws the error below
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 1.0 failed 1 times, most recent failure: Lost task 1.0 in stage 1.0 (TID 3) (Satish executor driver): java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1143)
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1073)
at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:167)
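
One possible reading of the trace above: the first collect() never needs a Python worker, while flatMap with a lambda does, and Spark then tries to start an executable named "python3", which Windows cannot find. Below is a minimal sketch of a check and workaround, assuming it runs in the same Python session before the SparkContext (sc) is created. PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are the environment variables Spark reads to pick the interpreter; using sys.executable as the value is my assumption, not something confirmed for this setup.

```python
import os
import shutil
import sys

# Assumption: this runs before the SparkContext (sc) is created,
# because the variables are read when Spark is started.

# Check whether an executable named "python3" is reachable on PATH.
if shutil.which("python3") is None:
    print('No "python3" on PATH; Spark cannot launch workers under that name')

# Point the workers and the driver at the interpreter running this script.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable
```

If this is the cause, re-creating the SparkContext after running the snippet and repeating the flatMap call should show whether the error goes away.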