I'm at a loss at this point. I'm trying to run PySpark/SparkR on Apache HUE 4.3, using Spark 2.4 + Livy Server 0.5.0. I've followed every guide I can find, but I keep running into this issue. Basically, I can run PySpark/SparkR from the command line, but HUE, for some reason, does the following:
- Ignores all Spark configuration (executor memory, cores, etc.) that I have set in multiple places (`spark-defaults.conf`, `livy.conf`, and `livy-client.conf`); the kind of settings involved are sketched below.
- Successfully creates sessions for both PySpark and SparkR, yet when I try to do anything (even just `print(1+1)`), I get `InterpreterError: Fail to start interpreter`.
- Actually works with Scala on HUE. Scala works, but PySpark and SparkR do not (presumably because Scala runs directly on the JVM, while PySpark and SparkR have to start an external interpreter process).
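
For illustration, the settings being ignored look something like this in `spark-defaults.conf` (the values here are placeholders, not my exact numbers):

```
# Example entries in spark-defaults.conf that HUE/Livy sessions do not pick up
spark.executor.memory   4g
spark.executor.cores    2
spark.driver.memory     2g
```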
I can provide any configuration needed. This is driving me absolutely insane.
I also cannot interact with PySpark through Livy's REST API; I get the same `InterpreterError` there (minimal reproduction below). This leads me to believe the problem is on the Livy Server side rather than in HUE.
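
Here is the script I use to reproduce the failure against the bare Livy REST API (it assumes Livy is on `localhost:8998`; adjust the host/port for your setup):

```python
import json
import time

import requests

LIVY = "http://localhost:8998"  # assumed Livy host/port -- adjust as needed
HEADERS = {"Content-Type": "application/json"}

# 1. Create a PySpark session.
r = requests.post(f"{LIVY}/sessions",
                  data=json.dumps({"kind": "pyspark"}),
                  headers=HEADERS)
session_url = LIVY + r.headers["Location"]

# 2. Poll until the session is idle (ready) or has died.
while True:
    state = requests.get(session_url, headers=HEADERS).json()["state"]
    if state in ("idle", "error", "dead"):
        break
    time.sleep(2)
print("session state:", state)

# 3. Run a trivial statement; this is where InterpreterError shows up.
r = requests.post(f"{session_url}/statements",
                  data=json.dumps({"code": "print(1+1)"}),
                  headers=HEADERS)
statement_url = LIVY + r.headers["Location"]

# 4. Poll the statement until it finishes, then dump its output.
while True:
    result = requests.get(statement_url, headers=HEADERS).json()
    if result["state"] in ("available", "error"):
        break
    time.sleep(1)
print(json.dumps(result["output"], indent=2))
```

The session reaches `idle`, but the statement's output comes back with the same `InterpreterError: Fail to start interpreter` that HUE shows.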