I'm super new to Spark, so these issues might have a "no duh" answer that I'm just not seeing.

Firstly, I downloaded Spark 1.5.2 and extracted it. In the python folder I tried to run pyspark, but it said something along the lines of needing a __main__.py, so I copied __init__.py to __main__.py and started getting weird syntax errors. I realized I was using Python 2.9, so I switched to 2.7 and got a different error:

Traceback (most recent call last):
  File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\spark-1.5.2\python\pyspark\__main__.py", line 40, in <module>
    from pyspark.conf import SparkConf
ImportError: No module named pyspark.conf
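
For reference, this is roughly what I was running when I got that traceback (from memory, so the exact commands may be a little off):

rem after copying __init__.py to __main__.py, run the pyspark package with Python 2.7
cd C:\spark-1.5.2\python
copy pyspark\__init__.py pyspark\__main__.py
C:\Python27\python.exe pyspark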

I found a question that looked like the same error: What to set `SPARK_HOME` to?
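
If I'm reading that answer right, the idea is to point SPARK_HOME at the Spark folder and put its python subfolder on PYTHONPATH, something like this (I'm paraphrasing from memory and using their C:\spark path, so it may not be exactly what the answer says):

rem point Spark's environment variables at the extracted folder
set SPARK_HOME=C:\spark
set PYTHONPATH=%SPARK_HOME%\python;%PYTHONPATH%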

So I set up my environment variables as they did (except with C:\spark-1.5.2 instead of C:\spark), but that didn't fix the error for me. Then I realized they were using Spark 1.4 from GitHub, so I made a new folder and tried building it the way they did. I got stuck on the command:

build/mvn -DskipTests clean package

showing the error:

Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Error occurred during initialization of VM
Could not reserve enough space for 2097152KB object heap  

I tried adding "-XX:MaxHeapSize=3g", but nothing changed. Since the warning says "support was removed in 8.0", I also downloaded Java 7, but that didn't change anything either.
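
In case it matters, this is roughly how I tried to pass that flag (through MAVEN_OPTS; I may well be putting it in the wrong place):

rem add the heap option to Maven's JVM options, then re-run the build
set MAVEN_OPTS=-XX:MaxHeapSize=3g
build/mvn -DskipTests clean package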

Thanks in advance
