I am using ConfigParser to read key/value pairs that are passed to my PySpark program. The code works fine when I execute it from the edge node of a Hadoop cluster, with the config file in a local directory on the edge node. It doesn't work if the config file is uploaded to an HDFS path and I try to access it through the parser.
The config file para.conf has the following contents:
[tracker]
port=9801
In local client mode, with para.conf in the local directory, I access the values like this:
from ConfigParser import SafeConfigParser
parser = SafeConfigParser()
parser.read("para.conf")
myport = parser.get('tracker', 'port')
The above works fine...
On the Hadoop cluster: I uploaded the para.conf file to the HDFS path bdc/para.conf and tried
parser.read("hdfs://clusternamenode:8020/bdc/para.conf")
This doesn't return anything, and neither does the variant below with the slashes doubled:
parser.read("hdfs:///clusternamenode:8020//bdc//para.conf")
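As far as I can tell, parser.read() silently skips paths it cannot open, and since it only opens local files it does not understand the hdfs:// scheme at all. Printing its return value (the list of files it actually parsed) seems to confirm this:

result = parser.read("hdfs://clusternamenode:8020/bdc/para.conf")
print(result)   # prints [] -- the hdfs:// URL is not a local file, so ConfigParser skips it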
Although using the SparkContext (sc.textFile) I can read this file, and it returns a valid RDD:
sc.textFile("hdfs://clusternamenode:8020/bdc/para.conf")
though I am not sure whether ConfigParser can extract the key/value pairs from this.
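What I had in mind is something like the sketch below (untested). It assumes the config file is small enough to collect to the driver, and it relies on readfp with a StringIO buffer, since ConfigParser can parse any file-like object rather than only a local path:

from StringIO import StringIO
from ConfigParser import SafeConfigParser

# pull the whole file back to the driver as a single string (fine for a small config file)
conf_text = "\n".join(sc.textFile("hdfs://clusternamenode:8020/bdc/para.conf").collect())

parser = SafeConfigParser()
parser.readfp(StringIO(conf_text))    # parse from an in-memory buffer instead of a local path
myport = parser.get('tracker', 'port')

I don't know if this is the right approach, or whether collecting the file to the driver is considered bad practice here.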
Can anyone advise whether ConfigParser can be used to read files from HDFS, or is there an alternative?