I am running some PySpark code locally on JupyterHub. My system RAM is 32 GB. Whenever I use the show() or count() method after certain operations (say a join or union), my kernel hangs, dies, or sometimes throws an exception. Without show() or count() the code runs fine.
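One thing worth noting: Spark transformations are lazy, so a join or union only records an execution plan, and nothing actually runs until an action such as show() or count(). That the code "works fine" without those calls most likely just means the heavy work was never executed. A rough pure-Python analogy using a generator (no pyspark needed; the names are illustrative):

```python
def build_plan():
    # Like chaining .join()/.union(): this only *describes* the work.
    # The division by zero is not evaluated yet.
    return (x / 0 for x in range(3))

plan = build_plan()   # returns instantly, no error -- nothing has run yet

try:
    next(plan)        # the "action": only now does the work (and the failure) happen
except ZeroDivisionError:
    print("error surfaced only at consumption time")
```

In the same way, the OutOfMemoryError is produced by the join/union itself; show() and count() are merely the point where that plan finally executes.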

I don't know what is causing this issue. The data size I am handling is around 1 GB.

Any clue would be appreciated.

Most of the time the error is: Caused by: java.lang.OutOfMemoryError: Java heap space, but only when I use show() or count().
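A common cause in local mode: the whole Spark "cluster" runs inside a single JVM whose default driver heap is only about 1 GB (spark.driver.memory defaults to 1g), regardless of the 32 GB of system RAM, so a ~1 GB dataset can overflow the heap as soon as an action forces a join or union to execute. One way to raise it from a notebook, sketched under the assumption that pyspark has not been imported yet in the kernel (pyspark reads PYSPARK_SUBMIT_ARGS when it launches the JVM gateway, so the flag must be set first; 8g is an arbitrary choice):

```python
import os

# Must run BEFORE the first pyspark import in this kernel session:
# the JVM heap size cannot be changed after the gateway has started.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--driver-memory 8g pyspark-shell"

# Afterwards, create the session as usual (illustrative names):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.master("local[*]").appName("debug").getOrCreate()
```

Restart the kernel before trying this, so a previously started JVM doesn't silently keep the old 1 GB heap.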

Also, if I use the method below, it throws an exception:

m_f_1.limit(15).toPandas().head()

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/home/tzade/spark-2.3.2-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tzade/spark-2.3.2-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/home/tzade/spark-2.3.2-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
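Note that limit(15).toPandas() does not make the job cheap: the upstream join/union still has to execute in full (including its shuffle) before any rows can be collected, so it hits the same heap limit as show() or count(), and the Py4JNetworkError above is just the Python side losing contact with the crashed JVM. If raising driver memory alone doesn't help, one mitigation is to materialize the intermediate result to disk once and inspect the written copy. A sketch (the function name and path are hypothetical, and it needs a live SparkSession to actually run):

```python
def inspect_head(spark, df, path="/tmp/m_f_1_debug", n=15):
    """Run the expensive plan once by writing df to Parquet, then read the
    materialized result back and show a few rows cheaply.

    The join still executes during the write, but inspection afterwards no
    longer re-runs it, and repeated show()/count() calls become inexpensive.
    """
    df.write.mode("overwrite").parquet(path)
    spark.read.parquet(path).show(n)

# Illustrative usage (assumes `spark` and `m_f_1` exist):
# inspect_head(spark, m_f_1)
```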

Tilo
