15

I am trying to use an AWS Glue crawler on an S3 bucket to populate a Glue database. I run the Create Crawler wizard, select my data source (the S3 bucket with the Avro files), have it create the IAM role, and run it. I then get the following error:

Database does not exist or principal is not authorized to create tables. (Database name: zzz-db, Table name: avroavro_all) (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: 78fc18e4-c383-11e9-a86f-736a16f57a42). For more information, see Setting up IAM Permissions in the Developer Guide (http://docs.aws.amazon.com/glue/latest/dg/getting-started-access.html).

I tried to create this table in a new blank database (as opposed to an existing one with tables), I tried prefixing the names, I tried sourcing different schemas, and I tried using an existing role with Admin access. I thought the latter would work, but I keep getting the same error and have no idea why.

To be explicit, the service role I created has several policies that I assume are permissive enough to create tables:

[Screenshot: policies attached to the service role]

The logs are vanilla:


19:52:52 [10cb3191-9785-49dc-8935-fb02dcbd69a3] BENCHMARK : Running Start Crawl for Crawler avro
19:53:22 [10cb3191-9785-49dc-8935-fb02dcbd69a3] BENCHMARK : Classification complete, writing results to database zzz-db
19:53:22 [10cb3191-9785-49dc-8935-fb02dcbd69a3] INFO : Crawler configured with SchemaChangePolicy {"UpdateBehavior":"UPDATE_IN_DATABASE","DeleteBehavior":"DEPRECATE_IN_DATABASE"}.
19:53:34 [10cb3191-9785-49dc-8935-fb02dcbd69a3] ERROR : Insufficient Lake Formation permission(s) on s3://zzz-data/avro-all/ (Database name: zzz-db, Table name: avroavro_all) (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: 31481e7e-c384-11e9-a6e1-e78dc8223fae). For more information, see Setting up IAM Permissions in the Developer Guide (http://docs.aws.amazon.com/glu
19:54:44 [10cb3191-9785-49dc-8935-fb02dcbd69a3] BENCHMARK : Crawler has finished running and is in state READY
mhamrah

4 Answers

18

I had the same problem when I set up and ran a new AWS Glue crawler after enabling Lake Formation (in the same AWS account). I had been running Glue crawlers for a long time and was stumped when I saw this new error.

After some trial and error, I found the root cause: when you enable Lake Formation, it adds an additional layer of permissions on new Glue databases created via a Glue crawler, and on any resource (Glue catalog, S3, etc.) that you add to the Lake Formation service.

To fix this problem, you have to grant the crawler's IAM role a proper set of Lake Formation permissions (CRUD) on the database.

You can manage these permissions in the AWS Lake Formation console under Permissions > Data permissions, or via the AWS CLI's lakeformation commands.
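For anyone who prefers to script the grant, here is a minimal Python sketch using boto3's Lake Formation client. The role ARN is hypothetical, and mapping "CRUD" to CREATE_TABLE, ALTER, DROP and DESCRIBE is my own reading of the answer, not something the original states. It also assumes the caller's credentials belong to a principal allowed to grant Lake Formation permissions.

```python
def build_database_grant(role_arn, database_name):
    """Build the keyword arguments for Lake Formation's GrantPermissions call."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"Database": {"Name": database_name}},
        # Assumption: these database-level permissions stand in for "CRUD".
        "Permissions": ["CREATE_TABLE", "ALTER", "DROP", "DESCRIBE"],
    }


def grant_database_permissions(role_arn, database_name):
    """Send the grant to AWS. Needs boto3 and suitable credentials."""
    import boto3  # imported lazily so the request builder has no dependencies

    client = boto3.client("lakeformation")
    client.grant_permissions(**build_database_grant(role_arn, database_name))
```

With a hypothetical role, the call would look like grant_database_permissions("arn:aws:iam::123456789012:role/example-crawler-role", "zzz-db"). Keeping the request builder separate makes it easy to inspect what will be sent before touching the account.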

Jay
12

I solved this problem by adding a grant in AWS Lake Formation -> Permissions -> Data locations. (Do not forget to add a forward slash (/) after the bucket name.)

[Screenshot: Grant for data location]
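The same data location grant can be sketched in Python with boto3. This is only an illustration under assumptions: the role ARN is hypothetical, the S3 path is the one from the question's log, and the location is assumed to be already registered with Lake Formation. The helper passes the path through unchanged, so whether a trailing slash is needed is left to the caller.

```python
def s3_uri_to_arn(s3_uri):
    """Convert an s3:// URI to the S3 ARN form used by Lake Formation."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError("expected an s3:// URI, got: " + s3_uri)
    # The path (including any trailing slash) is kept exactly as given.
    return "arn:aws:s3:::" + s3_uri[len(prefix):]


def build_location_grant(role_arn, s3_uri):
    """Build the keyword arguments for a DATA_LOCATION_ACCESS grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"DataLocation": {"ResourceArn": s3_uri_to_arn(s3_uri)}},
        "Permissions": ["DATA_LOCATION_ACCESS"],
    }


def grant_location_access(role_arn, s3_uri):
    """Send the grant to AWS. Needs boto3 and suitable credentials."""
    import boto3  # imported lazily so the helpers above have no dependencies

    client = boto3.client("lakeformation")
    client.grant_permissions(**build_location_grant(role_arn, s3_uri))
```

For the bucket in the question, that would be grant_location_access("arn:aws:iam::123456789012:role/example-crawler-role", "s3://zzz-data/avro-all/"), again with a made-up role ARN.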

David Webster
  • Thanks for this addition. Without the location permissions, it didn't work for me. – Guy Nov 03 '19 at 11:35
  • That does not seem to work for me. I added all the IAM permissions I could find and granted data location access. Still no success. – n3rd Aug 06 '21 at 01:45
3

I had to add the custom role I created for Glue to the "Data lake Administrators" grantees:

[Screenshot: "Data lake Administrators" grantees]

(Note: I am only saying that this resolves the crawler's access-denied error. There may be a way to do it with lesser privileges.)
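If you want to do this outside the console, a rough Python sketch with boto3 follows. It reads the current data lake settings, appends the role to the administrators list, and writes the settings back. The role ARN is hypothetical, and the read-modify-write approach is my assumption about a sensible way to avoid dropping existing administrators, not something taken from this answer.

```python
def with_added_admin(settings, role_arn):
    """Return a copy of the DataLakeSettings dict with role_arn as an admin."""
    admins = list(settings.get("DataLakeAdmins", []))
    already_admin = any(
        admin.get("DataLakePrincipalIdentifier") == role_arn for admin in admins
    )
    if not already_admin:
        admins.append({"DataLakePrincipalIdentifier": role_arn})
    updated = dict(settings)  # shallow copy so the input is not modified
    updated["DataLakeAdmins"] = admins
    return updated


def add_data_lake_admin(role_arn):
    """Apply the change in AWS. Needs boto3 and suitable credentials."""
    import boto3  # imported lazily so the helper above has no dependencies

    client = boto3.client("lakeformation")
    current = client.get_data_lake_settings()["DataLakeSettings"]
    client.put_data_lake_settings(
        DataLakeSettings=with_added_admin(current, role_arn)
    )
```

Since administrator status is a broad privilege, it is probably worth trying the narrower database and data location grants first and keeping this as a fallback.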

Ariel
  • Thank you for this screenshot and explanation! I was literally running into an AWS Lake Formation access error for 1 week before adding the relevant role to the "Data lake administrator" list and resolving the issue! (On an AWS account without Lake Formation enabled, there wasn't an issue. ) – xke May 01 '23 at 23:09
0

Make sure you gave the necessary permissions to your crawler's IAM role in this path:

Lake Formation -> Permissions -> Data lake permissions

(Grant related Glue Database permissions to your crawler's IAM role)

Farbod Ahmadian