I can run pyspark in yarn-client mode on one laptop, and I am trying to set it up on another laptop. However, this time I can't get it running.
I am using dynamic resource allocation and have set SPARK_EXECUTOR_MEMORY to be less than the YARN container memory. I am using Hadoop 2.6.4, Spark 1.6.1, and Ubuntu 15.10.
Is it possible that the error is due to network issues? When I try to start pyspark in yarn-client mode, it gives me the following error:
16/06/12 01:49:34 INFO scheduler.DAGScheduler: Executor lost: 1 (epoch 0)
In [1]: 16/06/12 01:49:34 INFO cluster.YarnClientSchedulerBackend: Disabling executor 1.
16/06/12 01:49:34 INFO storage.BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
16/06/12 01:49:34 INFO storage.BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, 192.168.2.16, 37900)
16/06/12 01:49:34 ERROR client.TransportClient: Failed to send RPC 9123554941984942265 to 192.168.2.16/192.168.2.16:47630: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/06/12 01:49:34 INFO storage.BlockManagerMaster: Removed 1 successfully in removeExecutor
16/06/12 01:49:34 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to get executor loss reason for executor id 1 at RPC address 192.168.2.16:47640, but got no response. Marking as slave lost.
java.io.IOException: Failed to send RPC 9123554941984942265 to 192.168.2.16/192.168.2.16:47630: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801)
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699)
at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
16/06/12 01:49:34 ERROR cluster.YarnScheduler: Lost executor 1 on 192.168.2.16: Slave lost
16/06/12 01:49:34 INFO cluster.YarnClientSchedulerBackend: Disabling executor 2.
16/06/12 01:49:34 INFO scheduler.DAGScheduler: Executor lost: 2 (epoch 1)
16/06/12 01:49:34 INFO storage.BlockManagerMasterEndpoint: Trying to remove executor 2 from BlockManagerMaster.
16/06/12 01:49:34 ERROR client.TransportClient: Failed to send RPC 8690255566269835148 to 192.168.2.16/192.168.2.16:47630: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/06/12 01:49:34 INFO storage.BlockManagerMasterEndpoint: Removing block manager BlockManagerId(2, 192.168.2.16, 41124)
16/06/12 01:49:34 INFO storage.BlockManagerMaster: Removed 2 successfully in removeExecutor
16/06/12 01:49:34 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to get executor loss reason for executor id 2 at RPC address 192.168.2.16:47644, but got no response. Marking as slave lost.
java.io.IOException: Failed to send RPC 8690255566269835148 to 192.168.2.16/192.168.2.16:47630: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801)
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699)
at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
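
For reference, here is a rough sketch of the arithmetic behind the memory setting mentioned above. It assumes the Spark 1.6 default for the executor memory overhead on YARN (10% of the executor memory, with a 384 MB minimum) and compares the resulting request against a container limit such as yarn.scheduler.maximum-allocation-mb. The function names are hypothetical, and the sketch does not model YARN rounding requests up to its minimum allocation size, so treat it as a sanity check rather than what Spark or YARN actually execute.

```python
# Hypothetical helpers: estimate the container size Spark asks YARN for.
# Assumes Spark 1.6 defaults: overhead = max(10% of executor memory, 384 MB).

def yarn_container_request_mb(executor_memory_mb,
                              overhead_fraction=0.10,
                              min_overhead_mb=384):
    """Executor heap plus the off-heap overhead requested alongside it."""
    overhead_mb = max(int(executor_memory_mb * overhead_fraction), min_overhead_mb)
    return executor_memory_mb + overhead_mb


def fits_in_container(executor_memory_mb, yarn_max_allocation_mb):
    """True if heap plus overhead stays within the YARN container limit."""
    return yarn_container_request_mb(executor_memory_mb) <= yarn_max_allocation_mb


if __name__ == "__main__":
    # Example values only; substitute the settings from your own cluster.
    for executor_mb, limit_mb in [(1024, 1024), (1024, 2048), (4096, 4096)]:
        request_mb = yarn_container_request_mb(executor_mb)
        print(executor_mb, limit_mb, request_mb, fits_in_container(executor_mb, limit_mb))
```

The point of the sketch is that an executor memory value below the container limit can still be too large once the overhead is added, so the setting has to leave room for it.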