How to guarantee an AWS Lambda caller that provided data remains secure

Question

This question has some similarities to Disable AWS Lambda Environment Variables in its overall purpose, but is directed primarily at network access.

I'd like to provide the ability for a third party to invoke my Lambda function. The third party will submit their own data to the Lambda function (either through the payload, or by specifying the data location e.g. an S3 bucket).

I'd like for the Lambda service to be able to guarantee the third party that the data they provided has not been leaked from the Lambda process to anywhere else. In order to do this, at the very least, the third party must have assurance that the Lambda function has not connected to some other resource on the Internet and leaked the data to it.

Assuming that:

- I am providing the code that will operate on the sensitive data,
- the third party has no way to inspect that code, and
- the third party trusts Amazon, but does not trust me.

Is there any way to achieve this with Lambda (perhaps in conjunction with other AWS products)? I have looked into solutions using Gateways, EC2, encryption at rest, S3, and custom permissioning with all of those, but have found no solution.

As noted in an answer below, a Lambda function inside VPC can only access the Internet if you configure it so that it can, but since *you* control the environment, and *you* control the code, this still serves no purpose toward "proving" that the information wasn't leaked. In fact, there is never a way for you to prove to me that you didn't export/expose/duplicate my data unless you've never had access to it. Otherwise, how can there be a technical solution? The only "proofs" possible would be trusted 3rd party audits. — Michael - sqlbot, May 03 '17 at 15:45
I agree with your assessment that the below answer does not solve the problem due to my control over the environment. However, I am not sure why a technical solution cannot exist if the trusted 3rd party provides the platform on which the code runs (and if it is assumed that the trusted party always keeps its promise), to ensure that, at the request of the caller, the code is run in a restricted environment. — mwag, May 03 '17 at 15:56
btw in my above comment when I say "the trusted 3rd party" I mean AWS (as opposed to the "third party" that is providing the sensitive data and running the code and who needs the assurance of data security) — mwag, May 03 '17 at 18:18

score 1 · Answer 1 · answered May 03 '17 at 15:18

You could create the Lambda function inside a VPC and secure it. Read this thread: https://forums.aws.amazon.com/thread.jspa?messageID=733719 The posters there were having problems because their function couldn't access the Internet.
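As a rough illustration of what "inside a VPC without Internet" means in practice, here is a minimal sketch that inspects route tables for a default route to a gateway. It assumes the input is shaped like the `RouteTables` array returned by EC2's DescribeRouteTables; the function name `hasInternetRoute` is made up for this example. It ignores NAT instances, peering, transit gateways, and VPC endpoints, and it only tells the account owner something: it proves nothing to a third party who cannot see the configuration.

```javascript
// Hypothetical helper: returns true if any route sends all IPv4/IPv6
// traffic to an Internet gateway, NAT gateway, or egress-only gateway.
// Input is assumed to look like DescribeRouteTables' `RouteTables` array.
function hasInternetRoute(routeTables) {
  return routeTables.some((table) =>
    (table.Routes || []).some((route) => {
      const isDefault =
        route.DestinationCidrBlock === "0.0.0.0/0" ||
        route.DestinationIpv6CidrBlock === "::/0";
      const viaGateway =
        (route.GatewayId || "").startsWith("igw-") ||
        Boolean(route.NatGatewayId) ||
        Boolean(route.EgressOnlyInternetGatewayId);
      return isDefault && viaGateway;
    })
  );
}
```

A table that contains only the `local` route yields `false`; adding a `0.0.0.0/0` route through a NAT gateway flips it to `true`.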

UXDart

As @Michael points out in his comment on the OP, this approach does not solve the problem, because it would still require that the third party trusts me not to leak the data (inadvertently or not), since I could control whether / how the VPC is set up. – mwag May 03 '17 at 18:20
Can't you give them the code and a script that sets up the Lambda configuration, so that they can run it in their own VPC, which they can trust? – johni May 03 '17 at 20:43
Not ideal as I'd be giving them code, which makes the solution hard to maintain, but that might nonetheless be the best solution currently possible. +1 – mwag May 04 '17 at 15:03

score 1 · Answer 2 · answered May 03 '17 at 23:45

The only "proof" you can offer would be in the form of an external trusted third party (not AWS) that has audited your environment, practices, and policies, and is assured to their satisfaction that you are capable of properly handling the sensitive data.

The client is – for whatever reason – unwilling to trust you to process their data without simultaneously also skimming some of it without authorization... meanwhile you presumably don't want to simply license your code to them so they can run it in an environment that is secure to their satisfaction.

Neither of those things is a technical issue. Those are trust issues.

I'm not entirely sure that you've considered what a truly isolated environment would really mean. Obviously you wouldn't be able to consult any databases... but it would also necessarily be stripped of any logging functionality. A single console.log(data.super_secret); is enough, because logs from Lambda functions leave the Lambda environment and fly to CloudWatch.
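To make the logging point concrete, here is a minimal sketch of a Node.js handler. The event shape and the `super_secret` field are illustrative assumptions, not anything from the asker's code. Lambda captures what the function writes to stdout and stderr and delivers it to CloudWatch Logs, provided the execution role allows writing logs, so one line is all it takes for the data to leave the function.

```javascript
// Hypothetical handler; `event.super_secret` is an illustrative field name.
async function handler(event) {
  // This line is the whole "leak": stdout is captured by Lambda and
  // delivered to CloudWatch Logs, outside the function's environment.
  console.log("processing", event.super_secret);
  return { ok: true };
}
```

Nothing about a VPC, security group, or network ACL stands between that line and the log stream.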

Assuming for whatever reason you remain unconvinced, there's always DNS Tunneling. The beauty of this evil scheme is that you are never isolated from DNS resolution in VPC, even when you don't have Internet. The DNS resolver at 169.254.169.253 is always there, listening helpfully, immune to security groups, immune to Network ACLs, immune to the default route. You want to stealthily smuggle data out of an "isolated" environment? Done.
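For anyone unfamiliar with the idea, here is a rough sketch of the encoding half of DNS tunneling, with no network calls at all. The zone `t.example.com` is a placeholder, and the function names and the sequence-number scheme are assumptions made up for this example. The point is only that arbitrary bytes fit into hostnames, and whoever runs the authoritative name server for the zone gets to see every hostname that is looked up.

```javascript
// Placeholder zone; in the scenario above it would be controlled by
// whoever wants to receive the data.
const ZONE = "t.example.com";

// Split a secret into hex labels (DNS labels are limited to 63 bytes)
// and build the hostnames a tunneling client would ask the resolver for.
function toQueryNames(secret, zone = ZONE, labelLen = 60) {
  const hex = Buffer.from(secret, "utf8").toString("hex");
  const names = [];
  for (let i = 0, seq = 0; i < hex.length; i += labelLen, seq++) {
    // The leading sequence number lets the receiver restore the order.
    names.push(`${seq}.${hex.slice(i, i + labelLen)}.${zone}`);
  }
  return names;
}

// What the receiving side would do with the query names it observes.
function fromQueryNames(names) {
  const hex = names
    .map((name) => name.split("."))
    .sort((a, b) => Number(a[0]) - Number(b[0]))
    .map((labels) => labels[1])
    .join("");
  return Buffer.from(hex, "hex").toString("utf8");
}
```

Resolving each of those names through the VPC resolver would carry the payload out as ordinary-looking lookups, which is why "no Internet route" is not the same as "isolated".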

In any event, AWS doesn't make assurances as to the security of your configuration -- they only assume responsibility for securing their infrastructure itself. They assure you that it is as secure as you have configured it to be... but how you configure it is up to you. They call it the Shared Responsibility Model:

While AWS manages security of the cloud, security in the cloud is the responsibility of the customer. Customers retain control of what security they choose to implement to protect their own content, platform, applications, systems and networks, no differently than they would for applications in an on-site datacenter.

https://aws.amazon.com/compliance/shared-responsibility-model/

I've upvoted because you raise some valid points (e.g. console.log). Still, am not sure why it is hard to see why a client would not trust-- many banks and hospitals don't even trust AWS or Google yet, so when banks and hospitals are the client providing the sensitive data, it's still a leap of faith to assume they will eventually be comfortable trusting AWS / Google / Azure. But that's a leap that is much easier to take (hence the willingness to make that assumption here) than if it were for any company (eg mine) other than those behemoths. — mwag, May 04 '17 at 02:39
Yes, I understand it is a trust issue-- the purpose of this question is to determine whether that trust issue, between A and B, can be mitigated with a technical solution involving a mutually trusted party C-- similar to how SSL certs use a third party (the certificate authority) to broker a level of trust that isn't possible without the trusted third party. — mwag, May 04 '17 at 02:42
I didn't intend to say or imply it was hard to see why a client wouldn't trust a vendor. *For whatever reason* was intended as a placeholder for everything from "vendor seems sketchy" to "the client has an unrealistically high estimate of their data's actual value" to "change is hard" and points in between. — Michael - sqlbot, May 04 '17 at 09:48
