Is it possible to have Spark take a local file as input, but process it in a distributed way?
I have

    sc.textFile("file:///path-to-file-locally")

in my code, and I know the exact path to the file is correct. Yet I am still getting:
    Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
    : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 14, spark-slave11.ydcloud.net): java.io.FileNotFoundException: File file:/<path to file> does not exist
I am running Spark in distributed mode, not locally. Why does this error occur?
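For context, here is a minimal sketch of what I am running (the app name and the file path are placeholders, not my real values; assume the script is launched with spark-submit against the cluster master):

    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("local-file-test")  # placeholder app name
    sc = SparkContext(conf=conf)

    # The file exists at this path on the machine where the driver runs.
    rdd = sc.textFile("file:///path-to-file-locally")  # placeholder path

    # Triggering an action is what raises the FileNotFoundException above.
    print(rdd.count())

The count() action is just an example; collect() fails the same way, as shown in the traceback.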