I am trying to create an external table in a Greenplum database running on an Amazon EC2 cluster. My source file is in Parquet format and is stored in S3. My question is:
What protocol should I use to read the data from the parquet file?
If I use the "s3://" protocol with the Parquet file format, as below:
CREATE EXTERNAL TABLE rp2 (id text, fname text, lname text, mname text) LOCATION ('s3://location.parquet config=./s3/s3.config')
I get the following error:
ERROR: unexpected end of file (seg0 slice1 IP:port pid=xxx)
If I use the gphdfs:// protocol instead:
CREATE EXTERNAL TABLE rp2 (id text, fname text, lname text, mname text) LOCATION ('gphdfs:location.parquet config=./s3/s3.config') FORMAT 'PARQUET';
I get the following error:
ERROR: external table gphdfs protocol command ended with error. Exception in thread "main" java.lang.IllegalArgumentException: Illegal input uri: gphdfs://locs.parquet config=./s3/s3.config (seg0 slice1 IP:Port pid=pid)
Any help with this would be highly appreciated.