I'm trying to get at the Common Crawl news S3 bucket, but I keep getting a "fatal error: Unable to locate credentials" message. Any suggestions for how to get around this? As far as I was aware Common Crawl doesn't even require credentials?
Asked
Active
Viewed 410 times
1 Answers
2
From News Dataset Available – Common Crawl:
You can access the data even without a AWS account by adding the command-line option
--no-sign-request
.
I tested this by launching a new Amazon EC2 instance (without an IAM role) and issuing the command:
aws s3 ls s3://commoncrawl/crawl-data/CC-NEWS/
It gave me the error: Unable to locate credentials
I then ran it with the additional parameter:
aws s3 ls s3://commoncrawl/crawl-data/CC-NEWS/ --no-sign-request
It successfully listed the directories.

John Rotenstein
- 241,921
- 22
- 380
- 470