I am running an EMR cluster with Spark and Livy, and would like to test Spark Structured Streaming. I am using the managed Jupyter Notebook service (which connects via Livy). However, when I try this code in Jupyter:
query = (wordCounts
    .writeStream
    .queryName("streamingDF")
    .outputMode('complete')
    .format('memory')
    .start())
I receive the following error:
An error occurred while calling o98.start. : org.apache.hadoop.security.AccessControlException: Permission denied: user=livy, access=WRITE, inode="/mnt/tmp":hadoop:hadoop:drwxr-xr-x
How do I change the permissions, and to what? Livy seems to be writing temporary data to HDFS. I thought that with the 'memory' option the results are kept on the driver and not written to disk.
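One thing that might be worth trying, as a sketch only: as far as I understand, a streaming query still keeps a checkpoint directory even with the 'memory' sink, and when none is given Spark picks a temporary one, which could be the /mnt/tmp path in the error. Setting `checkpointLocation` explicitly to an HDFS directory the `livy` user can write to may avoid the default. The base path `/user/livy/checkpoints` and the helper names `checkpoint_path` and `start_memory_query` are assumptions for illustration, not something taken from the EMR or Livy setup.

```python
# Hypothetical base directory: assumed to exist on HDFS and be writable by user "livy".
DEFAULT_CHECKPOINT_BASE = "/user/livy/checkpoints"


def checkpoint_path(query_name, base=DEFAULT_CHECKPOINT_BASE):
    """Build a per-query checkpoint directory under the given base path."""
    return f"{base.rstrip('/')}/{query_name}"


def start_memory_query(df, query_name, base=DEFAULT_CHECKPOINT_BASE):
    """Start a 'memory' sink query with an explicit checkpoint location.

    `df` is expected to be a streaming DataFrame such as `wordCounts`.
    """
    return (df
        .writeStream
        .queryName(query_name)
        .outputMode("complete")
        .format("memory")
        # Explicit location instead of the temporary default directory.
        .option("checkpointLocation", checkpoint_path(query_name, base))
        .start())
```

In the notebook this would be called as `query = start_memory_query(wordCounts, "streamingDF")`. The other route would be to loosen the permissions on /mnt/tmp in HDFS itself (for example with `hdfs dfs -chmod`, run as the `hadoop` user that owns it), but I am not sure which of the two is the better practice.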