
I am trying to set up real-time data ingestion into ClickHouse hosted on EC2. My pipeline is:

EventBridge -> Kinesis Firehose (destination: HTTP endpoint) -> Lambda (function URL) -> ClickHouse HTTP endpoint

Everything works as expected in UAT. But the function URL provided by Lambda is public by default, which is of course a security concern in prod. Is there any way to secure the connection from Firehose to the Lambda's HTTP endpoint, so that the endpoint can only be invoked by Firehose and the data cannot leave the AWS account?

P.S.: If there is any way to improve this pipeline, please post it in the comments too. It would be helpful.

Denny Crane
Mayank Pant
  • Does this answer your question? [Invoke AWS Lambda Function URL from eventbridge api destination](https://stackoverflow.com/questions/71989528/invoke-aws-lambda-function-url-from-eventbridge-api-destination) – Jason Capriotti Jun 07 '22 at 21:48
  • Hey, actually I know about this. My question was how to make this connection secure so that the communication stays inside the account and goes service to service. I know it can be done with IAM rules, but how? Sorry, I am new to IAM authentication. – Mayank Pant Jun 09 '22 at 09:03
  • When you say "remains inside the account", do you mean the VPC? EventBridge runs outside the VPC, so the traffic starts from outside. Apart from that, I have a solution idea, but I'm curious about that detail in your question. – Jason Capriotti Jun 09 '22 at 20:49
  • Yes, it will start from outside because EventBridge cannot be put inside a VPC, but I think that communication is already secured, right? We define the target in EventBridge so that EventBridge sends data only to that target. My concern is the communication between Kinesis Firehose and the HTTP endpoint. As the endpoint defined by Lambda is public, how can I secure it? – Mayank Pant Jun 10 '22 at 05:46
  • Also, just curious: is there any way I can communicate directly with the ClickHouse HTTP endpoint from Firehose, i.e. without the Lambda function URL in between? Just like EventBridge, I don't think you can put Firehose inside a VPC, so I am not sure how this communication would work in a secure environment. – Mayank Pant Jun 10 '22 at 05:57

2 Answers


Based on the question's comments, this answer is mostly about adding authentication to that Lambda URL...

I don't think the Lambda URL will work when called from Firehose. If you use IAM authorization (implied by the security requirement), the client has to sign each request with AWS Signature Version 4, and Firehose doesn't support that.

I'm not sure why Firehose is in the pipeline, but I think you can remove it and then either call the Lambda directly from EventBridge, or put API Gateway between EventBridge and the Lambda.

Calling the Lambda directly might be simpler, but then you lose the flexibility of having a web API. Security is easy, though: it's handled by IAM (a resource-based permission on the function that allows EventBridge to invoke it).

API Gateway shouldn't be much more difficult, and I assume your Lambda already handles the payload (since that's what the Lambda function URL sends). That looks like this:

EventBridge -> API Gateway -> Lambda

The API Gateway would need either IAM or Cognito authorization:

  1. IAM would be easiest; the EventBridge rule can target the API Gateway directly, and the rule then just needs to use an IAM role that is allowed to invoke the API.
  2. Cognito would be more complicated, but the idea here would be to set up a user pool client that uses the client_credentials flow. In EventBridge, you'd set up your target as an "EventBridge API destination" and use an authorization type of "OAuth Client Credentials".
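
For the IAM option, the role used by the EventBridge rule needs a trust policy for `events.amazonaws.com` and permission for `execute-api:Invoke` on the API method, and the API Gateway method itself has to use IAM authorization. A minimal sketch of those two policy documents as Python dicts follows; the function names, region, account ID, API ID, stage, and path are all placeholders for illustration, not values from the question.

```python
import json


def eventbridge_trust_policy() -> dict:
    """Trust policy that lets EventBridge assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "events.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def invoke_api_policy(region, account_id, api_id, stage, method, path) -> dict:
    """Permission policy allowing the role to call one API Gateway method."""
    # All arguments are placeholders; substitute the values of your own API.
    resource = (
        f"arn:aws:execute-api:{region}:{account_id}:"
        f"{api_id}/{stage}/{method}/{path.lstrip('/')}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "execute-api:Invoke",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print both documents so they can be pasted into the IAM console or a template.
    print(json.dumps(eventbridge_trust_policy(), indent=2))
    print(json.dumps(
        invoke_api_policy("us-east-1", "123456789012", "abc123", "prod", "POST", "/events"),
        indent=2,
    ))
```

Scoping the `Resource` to a single stage, method, and path keeps the role from invoking anything else on the same API.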

You also mention the ClickHouse API. I imagine calling it directly would be even simpler, depending on how much logic you have in the Lambda. ClickHouse has an HTTP interface, so you'd just need to use the "EventBridge API destination" and send to that. Your EC2 hosts would either need to be publicly accessible, or you might be able to proxy the request through API Gateway or something else.
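
To give a rough idea of what a request to the ClickHouse HTTP interface looks like, here is a small sketch that builds (but does not send) an insert request using the `JSONEachRow` format. The host name, port, table name, and row fields are assumptions for illustration, and `build_insert_request` is a hypothetical helper; authentication is left out.

```python
import json
import urllib.parse
import urllib.request


def build_insert_request(base_url, table, rows):
    """Build a POST request that inserts rows via the ClickHouse HTTP interface."""
    # The INSERT statement goes in the `query` URL parameter,
    # the data goes in the request body as one JSON object per line.
    query = f"INSERT INTO {table} FORMAT JSONEachRow"
    params = urllib.parse.urlencode({"query": query}, quote_via=urllib.parse.quote)
    body = "\n".join(json.dumps(row) for row in rows).encode("utf-8")
    return urllib.request.Request(f"{base_url}/?{params}", data=body, method="POST")


if __name__ == "__main__":
    # Placeholder endpoint and table; nothing is sent over the network here.
    request = build_insert_request(
        "http://clickhouse.internal:8123",
        "events",
        [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`, which only works from somewhere that can reach the EC2 host.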

Jason Capriotti

There is a trick that uses the "Transform source records with AWS Lambda" setting of the Firehose stream.

Eventbridge -> KinesisFirehose(any destination e.g. S3) -> drop
                         |
                         +-> lambda("Transform source") -> anything you want

This way Firehose invokes the Lambda function using its own IAM role, and the Lambda receives all the data from Firehose. Your Lambda should return Dropped for all records; see https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
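
A minimal sketch of such a transformation Lambda, assuming each Firehose record carries one JSON document (the EventBridge event). `forward_to_clickhouse` is a hypothetical placeholder for whatever the function does with the data; only the event and response shapes follow the Firehose data transformation contract.

```python
import base64
import json


def forward_to_clickhouse(rows):
    """Hypothetical placeholder: send the decoded rows to ClickHouse here."""
    return len(rows)


def lambda_handler(event, context):
    rows = []
    results = []
    for record in event["records"]:
        # Firehose delivers each record's payload base64-encoded.
        payload = base64.b64decode(record["data"])
        rows.append(json.loads(payload))
        # Mark every record as Dropped so Firehose delivers nothing
        # to its configured destination.
        results.append(
            {
                "recordId": record["recordId"],
                "result": "Dropped",
                "data": record["data"],
            }
        )
    # An exception raised here fails the whole invocation.
    forward_to_clickhouse(rows)
    return {"records": results}
```

Each `recordId` in the response has to match one from the incoming batch, which is why the handler echoes it back unchanged.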

j123b567