I'm hoping to read a multi-file dataset from a public S3 bucket using the arrow R package. Is it possible to skip request signing for objects like these? In the AWS CLI, that's done with the --no-sign-request argument. Is there an equivalent in arrow?

library(arrow)

open_dataset(
  sources = "s3://ookla-open-data/parquet/performance/",
  partitioning = c("type", "year", "quarter"),
  hive_style = TRUE,
  format = "parquet"
)
#> Error: IOError: When getting information for key 'parquet/performance' in bucket 'ookla-open-data': AWS Error [code 100]: No response body.

Created on 2022-06-22 by the reprex package (v2.0.1)

1 Answer

How about specifying sources = s3_bucket("ookla-open-data/parquet/performance", anonymous = TRUE)?

s3_bucket() accepts that and other configuration options for the S3 connection; see https://arrow.apache.org/docs/r/reference/FileSystem.html for a full list. Some of those options can be encoded in the S3 URI, but not all.
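
For illustration, here is a minimal sketch of the question's call with that change applied. It assumes an arrow build with S3 support and network access to the bucket; the names bucket and ds are just illustrative, and the remaining arguments are copied unchanged from the question.

```r
library(arrow)

# Anonymous (unsigned) connection, scoped to the dataset prefix inside the bucket.
# This plays the role of --no-sign-request in the AWS CLI.
bucket <- s3_bucket("ookla-open-data/parquet/performance", anonymous = TRUE)

# Pass the filesystem object as `sources` instead of the s3:// URI string.
ds <- open_dataset(
  sources = bucket,
  partitioning = c("type", "year", "quarter"),
  hive_style = TRUE,
  format = "parquet"
)

# Printing the dataset lists its files and schema without reading the data.
ds
```

If the call still fails, arrow_with_s3() reports whether the installed arrow build includes S3 support.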

Neal Richardson