I apologize if this is a noob question, but I couldn't find any relevant reference:
what is the difference between these two?
If I'd like to read Parquet files from HDFS using pyarrow, which one should I use?
The HdfsClient API was deprecated; you now want to use pyarrow.hdfs.connect to connect: http://arrow.apache.org/docs/python/filesystems.html#hadoop-file-system-hdfs
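
For the Parquet part of the question, here is a rough sketch of how that could look. It assumes a pyarrow version that still ships pyarrow.hdfs.connect (later releases deprecated it as well, in favor of pyarrow.fs.HadoopFileSystem) and a working Hadoop client setup on the machine. The helper name read_parquet_from_hdfs, its fs/reader parameters, and the example path are made up for illustration and are not part of pyarrow.

```python
def read_parquet_from_hdfs(path, fs=None, reader=None):
    """Read one Parquet file from HDFS.

    `fs` and `reader` are hypothetical hooks added for this sketch so the
    function can be exercised without a cluster; by default it uses
    pyarrow.hdfs.connect and pyarrow.parquet.read_table.
    """
    if fs is None:
        # Imported lazily: connecting needs pyarrow plus libhdfs / Hadoop config.
        import pyarrow as pa
        # With no arguments, host and port come from the Hadoop configuration.
        fs = pa.hdfs.connect()
    if reader is None:
        import pyarrow.parquet as pq
        reader = pq.read_table
    # Open the remote file in binary mode and hand the file object to the reader.
    with fs.open(path, "rb") as f:
        return reader(f)


# Example call (assumed path, needs a reachable HDFS):
# table = read_parquet_from_hdfs("/data/example.parquet")
# df = table.to_pandas()
```

With the defaults this returns a pyarrow Table, which you can convert with to_pandas() if you want a DataFrame.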