Is it possible to read data from IBM GPFS (General Parallel File System) in Apache Spark?
My intention is to use something like this
sc.textFile("gpfs://...")
instead of
sc.textFile("hdfs://...")
The environment I intend to use is the Hortonworks Data Platform (HDP). I've read some articles on deploying the IBM Spectrum Scale file system that say you can configure a connector to GPFS on HDP, which gives you the ability to read from and write to GPFS (perhaps similar to what MapR-FS offers for its file system). Has anyone done this?
Thanks