
I am new to Spark, and I have been trying to read a file, but I keep getting the same error no matter how much I tweak the path to the text file on my computer:

lines = sc.textFile(r"Documents/python-spark-tutorial/in/word_count.txt").collect()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\pyspark\rdd.py", line 816, in collect
    sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1257, in __call__
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\pyspark\sql\utils.py", line 63, in deco
    return f(*a, **kw)
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/Users/Home/Documents/python-spark-tutorial/in/word_count.txt
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
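The traceback shows Spark resolving the relative path against the driver's working directory (file:/C:/Users/Home/...), so one way to debug this is to check on the Python side where the path actually points before handing it to Spark. This is a small diagnostic sketch I would add, not part of the original question; the function name is my own:

```python
import os

def check_input_path(path):
    """Return (exists, absolute_path, sibling_names) for a Spark input path.

    Spark resolves relative paths against the driver's working directory,
    so os.path.abspath(path) shows exactly where Spark will look. Listing
    the parent directory helps spot typos or a different extension
    (e.g. word_count.text vs word_count.txt).
    """
    abs_path = os.path.abspath(path)
    parent = os.path.dirname(abs_path)
    siblings = os.listdir(parent) if os.path.isdir(parent) else []
    return os.path.exists(abs_path), abs_path, siblings
```

Calling this with the same string you pass to sc.textFile() would have shown the file missing and a word_count.text sibling sitting next to it.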

Nafeez Quraishi

2 Answers


Try the snippet below, passing an explicit file:// URI so Spark does not resolve the path against its working directory:

sc.textFile("file:///path")
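To build that file:/// URI from a local path without hand-writing the scheme, pathlib can do it portably (including on Windows, where backslashes and the drive letter need converting). This is my own sketch, not from the answer; the helper name is hypothetical:

```python
from pathlib import Path

def to_file_uri(path):
    # as_uri() requires an absolute path, so resolve() first.
    # On Windows, r"C:\Users\Home\word_count.txt" becomes
    # "file:///C:/Users/Home/word_count.txt", which Spark accepts.
    return Path(path).resolve().as_uri()

# usage with Spark (assuming an existing SparkContext `sc`):
# lines = sc.textFile(to_file_uri(r"Documents/python-spark-tutorial/in/word_count.txt")).collect()
```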
  • Thanks shiva, I found the solution after realizing the extension on the file is .text rather than .txt; that was my mistake, and it's now solved – Shem Mukasa Sep 10 '19 at 18:08

My problem is solved: it was the file extension I had mixed up, which was .text rather than .txt.