
I am testing the polars implementation in R, and I want to lazily read and process a parquet file from a public AWS S3 bucket.

After loading

library(aws.s3)
library(arrow)
library(polars)

I tried to open a connection using arrow::s3_bucket(url), where url is the location of the parquet file (s3://rimrep-data-public/091-aims-sst/test-50-64-spatialpart):

tempBucket <- s3_bucket(bucket = 's3://rimrep-data-public/091-aims-sst/test-50-64-spatialpart')
df <- scan_parquet(tempBucket)

This resulted in:

Error in new_from_parquet(path = file, n_rows = n_rows, cache = cache,  : not a string object

Any suggestions will be really appreciated.

Cheers, Eduardo

UPDATE: I tried to read using the S3 URI directly:

df <- scan_parquet('s3://rimrep-data-public/091-aims-sst/test-50-64-spatialpart')

and got:

thread '<unnamed>' panicked at 'One or more of the cloud storage features ('aws', 'gcp', ...) must be enabled.'
ekleins
