
Please help: how can this be achieved?

Requirement:

When new files become available in an AWS S3 bucket, a Lambda function is triggered, and it should in turn trigger the Pentaho job(s) that validate and process the files.

The Pentaho job should run on the server and not in the Lambda JVM, so that it uses the resources of the Linux server where the Pentaho 7.1 Client community version is installed.


Note: I followed the approach in https://dankeeley.wordpress.com/2017/04/25/serverless-aws-pdi/, but it executes the code in the Lambda JVM; per our requirement, the job needs to run on the Linux server.

Infra Details:

The Pentaho code will be in a file repository on the server; example mount location: /mnt/data

Pentaho version: Pentaho 7.1 Client community version.

Server: Linux

Thanks in advance.

1 Answer


If you want your Pentaho job to be executed on the server and not in the Lambda JVM, you don't need AWS Lambda at all.

Instead you can use:

  1. AWS SNS, and
  2. An HTTP endpoint provisioned on your Linux server, which subscribes to the SNS topic

Basically, you will need to install an HTTP server on the Linux server and provision an HTTP endpoint that can be invoked when new files are available in S3. Configure the S3 bucket to send its event notifications to an AWS SNS topic instead of AWS Lambda, and then subscribe the HTTP endpoint that you provisioned in step 2 above to that SNS topic.
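As a rough illustration of what that endpoint has to handle, here is a minimal Python sketch of the body-parsing step. SNS first POSTs a `SubscriptionConfirmation` message whose `SubscribeURL` must be fetched once, and afterwards POSTs `Notification` messages whose `Message` field contains the S3 event as a JSON string. The function name `handle_sns_post` and the `confirm` parameter are made up for this example, and the sketch assumes the topic only carries S3 events; it does not verify the SNS message signature, which a real endpoint should do.

```python
import json
from urllib.parse import unquote_plus
from urllib.request import urlopen


def handle_sns_post(body, confirm=urlopen):
    """Return the (bucket, key) pairs found in one SNS HTTP POST body.

    `confirm` is a hypothetical hook (default: urlopen) used to visit the
    SubscribeURL; it is injectable so the function can be tried offline.
    """
    envelope = json.loads(body)
    msg_type = envelope.get("Type")

    if msg_type == "SubscriptionConfirmation":
        # Sent once after subscribing; fetching SubscribeURL confirms it.
        confirm(envelope["SubscribeURL"])
        return []

    if msg_type != "Notification":
        # e.g. UnsubscribeConfirmation: nothing to process.
        return []

    # The S3 event is a JSON string nested inside the SNS "Message" field.
    event = json.loads(envelope["Message"])

    objects = []
    # S3's initial test event has no "Records", hence the default.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space is sent as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects
```

Whatever HTTP server you install would call this with the raw request body and hand the resulting bucket/key pairs to the code that starts the job.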

So whenever a new file is uploaded, a notification goes to SNS, which in turn pushes it to the HTTP endpoint; the endpoint can then read the file and execute your Pentaho job.
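Once the endpoint knows the bucket and key of the new file, starting the job on the server can be as simple as calling PDI's `kitchen.sh`. The following Python sketch shows one way to do that. The install path, the job file under /mnt/data, and the parameter names `S3_BUCKET` and `S3_KEY` are all assumptions for illustration; the job itself would have to declare those parameters.

```python
import subprocess

# Assumed locations; adjust to your installation and repository layout.
KITCHEN = "/opt/pentaho/data-integration/kitchen.sh"  # hypothetical install path
JOB_FILE = "/mnt/data/jobs/process_file.kjb"          # hypothetical job file


def kitchen_command(bucket, key):
    """Build the kitchen.sh argument list for one uploaded S3 object."""
    return [
        KITCHEN,
        f"-file={JOB_FILE}",
        f"-param:S3_BUCKET={bucket}",  # hypothetical job parameter
        f"-param:S3_KEY={key}",        # hypothetical job parameter
        "-level=Basic",
    ]


def start_job(bucket, key):
    """Start the Pentaho job in the background and return the process handle.

    Popen returns immediately, so the HTTP endpoint can answer SNS
    without waiting for the job to finish.
    """
    return subprocess.Popen(kitchen_command(bucket, key))
```

Passing the arguments as a list rather than a shell string means an unusual object key cannot be interpreted as shell syntax.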

Arafat Nalkhande