I'm using the Cloudera QuickStart VM 5.12.
I have a Flume agent that moves CSV files from a spooldir source into an HDFS sink. The transfer works, but the imported files end up with:
User=flume
Group=cloudera
Permissions=-rw-r--r--
The problem starts when I use PySpark; I get:
PriviledgedActionException as:cloudera (auth:SIMPLE)
cause:org.apache.hadoop.security.AccessControlException: Permission denied:
user=cloudera, access=EXECUTE,
inode=/user/cloudera/flume/events/small.csv:cloudera:cloudera:-rw-r--r--
(Ancestor /user/cloudera/flume/events/small.csv is not a directory).
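To make sure I am reading the error correctly, here is a minimal Python sketch of how I understand the owner/group/other check. It is a simplified model, not Hadoop's actual code: the function names parse_inode and is_allowed are my own, the group list for user cloudera is an assumption, and superuser bypass and ACLs are ignored. As far as I understand, HDFS asks for EXECUTE when it traverses a path component as a directory, which would fit the "is not a directory" hint.

```python
# Simplified model of an HDFS-style permission check (NOT Hadoop's real code).
# parse_inode and is_allowed are hypothetical helper names for illustration.

def parse_inode(field):
    """Split the 'path:owner:group:mode' field printed in the exception."""
    path, owner, group, mode = field.rsplit(":", 3)
    return path, owner, group, mode


def is_allowed(user, user_groups, owner, group, mode, access):
    """Return True if `user` is granted `access` (READ, WRITE or EXECUTE)."""
    bits = mode[1:]  # drop the leading file-type character ('-' or 'd')
    if user == owner:
        triplet = bits[0:3]   # owner class
    elif group in user_groups:
        triplet = bits[3:6]   # group class
    else:
        triplet = bits[6:9]   # everyone else
    index = {"READ": 0, "WRITE": 1, "EXECUTE": 2}[access]
    return triplet[index] != "-"


# The inode field copied from the error message above.
inode = "/user/cloudera/flume/events/small.csv:cloudera:cloudera:-rw-r--r--"
path, owner, group, mode = parse_inode(inode)

# Assumption: user cloudera belongs only to group cloudera.
print(is_allowed("cloudera", ["cloudera"], owner, group, mode, "READ"))     # True
print(is_allowed("cloudera", ["cloudera"], owner, group, mode, "EXECUTE"))  # False
```

Under this model the file is readable by cloudera, and only the EXECUTE request fails because none of the x bits are set on small.csv.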
If I use "hdfs dfs -put ..." instead of Flume, the user and group are both "cloudera", the permissions are 777, and Spark raises no error.
What is the solution? I cannot find a way to change the file's permissions from Flume. Maybe my approach is fundamentally wrong.
Any ideas?
Thank you