In a nutshell: What is the best way to grant and control end-user access to files stored in an S3 bucket, when the access rules for each file depend on which “group” the end user belongs to and on their role in that “group”? There are a lot of dynamically defined “groups” (more than 100,000), and each user can be part of several of them (more than 1,000).
I am on a team developing a product based on AWS Lambda, accessible through a web app. The product uses a microservice architecture. To explain our use case, let's imagine we have 3 microservices:
- User service, which is in fact AWS Cognito (it handles users and authorization for the whole platform).
- Company service, developed by us on AWS Lambda and DynamoDB. It manages company information (name, people, and other metadata that I will not explain here).
- Document service. This service, which we still need to develop, has to handle the documents that belong to a company.
In terms of architecture, we are having difficulty handling the following use case:
We would like people who belong to one or more companies to have access to those companies' documents (files). These people may have a role inside the company (Executive, HR, Sales). Depending on their role, they may have access to only a subset of the company's documents. Of course, people who do not belong to a company must not have access to that company's documents.
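To make the rule concrete, here is a minimal sketch in Python of the check we have in mind. The role-to-category table, the `memberships` structure, and the `<company_id>/<category>/<file name>` key layout are all assumptions made up for illustration; nothing here is an existing part of our platform.

```python
from typing import Dict, Set

# Hypothetical mapping: which document categories each role may read.
ROLE_CATEGORIES: Dict[str, Set[str]] = {
    "Executive": {"finance", "hr", "sales"},
    "HR": {"hr"},
    "Sales": {"sales"},
}


def can_read(memberships: Dict[str, Set[str]], key: str) -> bool:
    """Return True if the user may read the object stored under `key`.

    `memberships` maps a company id to the roles the user holds there.
    `key` is assumed to look like "<company_id>/<category>/<file name>".
    """
    parts = key.split("/", 2)
    if len(parts) != 3 or not all(parts):
        return False  # malformed key: deny by default
    company_id, category, _ = parts
    roles = memberships.get(company_id, set())
    return any(category in ROLE_CATEGORIES.get(role, set()) for role in roles)


# A user who is HR at one company and Executive at another.
alice = {"acme": {"HR"}, "globex": {"Executive"}}
print(can_read(alice, "acme/hr/contract.pdf"))       # allowed: HR role covers "hr"
print(can_read(alice, "acme/finance/budget.xlsx"))   # denied: HR does not cover "finance"
print(can_read(alice, "initech/hr/contract.pdf"))    # denied: not a member of initech
```

The difficulty is where to enforce a check like this when the files live in S3 and the number of companies is large.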
To handle such use cases, we would like to use AWS S3 and, if possible, avoid developing our own microservice to proxy AWS S3.
The problem is: how can we manage access rights with AWS S3 for our use case?
We have investigated multiple solutions.
Using IAM policies that restrict S3 file access (the web app accesses S3 directly, no proxy). If our S3 bucket is organized by company name/UUID (folders at the root of the bucket), we could create an IAM policy every time we create a company and configure it so that every user in a company has access to that company's folder, and only that folder.
Creating a bucket for each company is not possible because AWS limits the number of S3 buckets to 100 (or 1000) per AWS account, and our product may have more than 1000 companies.
Putting users in groups (one group per company) is not possible because the number of groups per user pool is limited to 500.
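For the IAM option, the per-company policy could be generated along these lines. This is only a sketch: the bucket name, the company id, the `company_policy` helper, and the choice of allowed actions are assumptions for illustration.

```python
import json


def company_policy(bucket: str, company_id: str) -> dict:
    """Build an IAM policy document limited to one company's folder.

    Assumes objects are stored under "<company_id>/..." in a shared bucket.
    """
    prefix = f"{company_id}/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it is restricted by prefix.
                "Sid": "ListCompanyFolder",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # Object-level actions are restricted through the resource ARN.
                "Sid": "ReadWriteCompanyObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


# Hypothetical bucket and company id.
print(json.dumps(company_policy("my-documents-bucket", "acme-uuid"), indent=2))
```

Note that a policy like this only captures the company dimension. The role dimension (Executive, HR, Sales) would need further sub-prefixes or additional policies, which multiplies the number of policies to manage.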
Using Lambda@Edge to proxy AWS S3 calls and verify that the requested file URI in S3 is authorized for the requesting user (the user belongs to the company and has the right roles to read its documents). This Lambda@Edge function would call an internal service to find out whether the user is authorized to get files from this company (based on the called URL).
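A rough sketch of what such a viewer-request function might look like in Python follows. The `is_authorized` function stands in for the call to our internal service and is hard-coded here so the example runs on its own; the `/<company_id>/...` URI layout and the use of a bearer token in the `Authorization` header are also assumptions.

```python
from typing import Optional


def is_authorized(token: str, company_id: str) -> bool:
    """Placeholder for the call to the internal service (hypothetical).

    A real check would validate the token and look up the user's
    companies and roles; this stub only knows one token.
    """
    return (token, company_id) == ("token-of-alice", "acme")


def bearer_token(request: dict) -> Optional[str]:
    # CloudFront lower-cases header names and stores each header
    # as a list of {"key": ..., "value": ...} dicts.
    values = request.get("headers", {}).get("authorization", [])
    if not values:
        return None
    scheme, _, token = values[0]["value"].partition(" ")
    return token if scheme.lower() == "bearer" and token else None


def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    # Assumed URI layout: /<company_id>/<rest of the object key>
    company_id = request["uri"].lstrip("/").split("/", 1)[0]
    token = bearer_token(request)
    if token is None or not company_id or not is_authorized(token, company_id):
        # Returning a response object short-circuits the request.
        return {"status": "403", "statusDescription": "Forbidden", "body": "Forbidden"}
    # Returning the request lets CloudFront forward it to the S3 origin.
    return request
```

One concern with this option is that every file request pays for the extra call to the internal service.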
Using AWS S3 pre-signed URLs. We can create our own document service that exposes CREATE, GET, and DELETE APIs, performs the authorization check (the user belongs to the company), and then asks AWS S3 to generate a pre-signed URL to upload or get a file. The user (web app) then calls S3 directly with that URL.
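The GET path of such a document service could be sketched as below. The `presign_download` helper, the `user_companies` set, and the `<company_id>/...` key layout are assumptions for illustration; `generate_presigned_url` is the boto3 S3 client method, and the client is passed in as a parameter so the authorization logic can be exercised without AWS credentials.

```python
from typing import Set


def presign_download(
    s3_client,
    user_companies: Set[str],
    bucket: str,
    key: str,
    expires_in: int = 300,
) -> str:
    """Return a short-lived download URL after checking company membership.

    `user_companies` is the set of company ids the caller belongs to, as
    resolved by the company service (hypothetical). `key` is assumed to
    start with "<company_id>/".
    """
    company_id = key.split("/", 1)[0]
    if company_id not in user_companies:
        raise PermissionError(f"user has no access to company {company_id!r}")
    # The URL is signed locally with the service's own credentials.
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Inside a Lambda function this would be called with `boto3.client("s3")` as `s3_client`. An upload variant would pass `"put_object"` instead of `"get_object"`. The role check (Executive, HR, Sales) is left out here and would sit next to the membership check.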
To summarize our problem: we are having difficulty handling a mix of RBAC and authorization control in a product built on AWS Lambda that exposes AWS S3 to end users.
If you have experience with or recommendations for this kind of use case, your advice will be very welcome.