I want to run a SQL query using AWS Data Pipeline. I have read the SqlActivity information on their support page.

I am getting the error message:

Object:DefaultSqlActivity1 WARNING: Invalid role: 'DataPipelineDefaultRole'. Please confirm AWS IAM Role provided has suggested permissions.

I still get it even after giving my DataPipelineDefaultRole and DataPipelineDefaultResourceRole full access to S3, EC2, Redshift, and Data Pipeline, and after doing everything specified in this article: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-iam-roles.html

I am simply trying to run a SQL script on my Amazon Redshift cluster on an hourly basis, and I have been trying to resolve this for some time now. I also tried creating a role that had all privileges for everything, but even that didn't work: the pipeline got stuck on "WAITING FOR RUNNER".

The .sql file is stored on S3, and I am using the Script Uri field under SqlActivity to run it. My pipeline currently looks like this: https://i.stack.imgur.com/SDyQy.jpg

Any help greatly appreciated!

Berra2k
  • I also get this warning, although my pipeline completes OK. That said, I had a lot of trouble getting it working, and it took a day's effort. The issues included the eventual consistency of the dashboard, the fact that it takes almost 10 minutes to start an EC2 node to run a query against Redshift that takes only 10 ms, and many others. I can only suggest patience and checking the messages you eventually get on the dependency panel; they will tell you if it can't connect to Redshift. Also make sure you select an explicit subnet, AMI, security group, and instance type. – PEELY Mar 02 '17 at 16:24

1 Answer

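You can either run the SQL query on an EC2 resource launched and managed by Data Pipeline ('runsOn' points to an Ec2Resource) or run it on a resource managed by you ('workerGroup' points to a moniker, which you also use when you launch Task Runner on your resource). If an activity has neither field set, or no Task Runner is polling for that worker group, it stays in "WAITING FOR RUNNER".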

Here is a sample pipeline that runs a SQL query: https://github.com/awslabs/data-pipeline-samples/blob/master/samples/SQLActivityWithTimeout/pipeline.json
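
To illustrate the two options, here is a minimal sketch in Python that builds a pipeline definition as a dict and prints it as JSON. It is not copied from the linked sample. The object ids (HourlySchedule, RedshiftDb, Ec2Instance), the S3 path, the cluster id, the database name, and the credentials are all made-up placeholders, and the field names follow the pipeline definition file syntax as I understand it, so check them against the documentation before relying on this.

```python
import json


def build_sql_pipeline(script_uri, worker_group=None):
    """Build a pipeline definition that runs a SQL script from S3 every hour.

    If worker_group is given, the activity waits for a Task Runner that you
    started yourself with the same worker group name. Otherwise Data Pipeline
    launches an Ec2Resource and runs the activity on it.
    """
    activity = {
        "id": "DefaultSqlActivity1",
        "name": "DefaultSqlActivity1",
        "type": "SqlActivity",
        "database": {"ref": "RedshiftDb"},
        "scriptUri": script_uri,
        "schedule": {"ref": "HourlySchedule"},
    }
    objects = [
        {
            "id": "Default",
            "name": "Default",
            "scheduleType": "cron",
            "failureAndRerunMode": "CASCADE",
            "role": "DataPipelineDefaultRole",
            "resourceRole": "DataPipelineDefaultResourceRole",
        },
        {
            "id": "HourlySchedule",
            "name": "HourlySchedule",
            "type": "Schedule",
            "period": "1 hours",
            "startAt": "FIRST_ACTIVATION_DATE_TIME",
        },
        {
            # Placeholder connection details: replace with your own.
            "id": "RedshiftDb",
            "name": "RedshiftDb",
            "type": "RedshiftDatabase",
            "clusterId": "example-cluster",
            "databaseName": "exampledb",
            "username": "example_user",
            "*password": "example_password",
        },
        activity,
    ]
    if worker_group is not None:
        # Option 2: a resource you manage, polled by your own Task Runner.
        activity["workerGroup"] = worker_group
    else:
        # Option 1: an EC2 instance launched and managed by Data Pipeline.
        activity["runsOn"] = {"ref": "Ec2Instance"}
        objects.append({
            "id": "Ec2Instance",
            "name": "Ec2Instance",
            "type": "Ec2Resource",
            "terminateAfter": "30 Minutes",
            # Add subnetId, securityGroupIds and instanceType for your account.
        })
    return {"objects": objects}


if __name__ == "__main__":
    definition = build_sql_pipeline("s3://example-bucket/scripts/query.sql")
    print(json.dumps(definition, indent=2))
```

The printed JSON is meant to be saved to a file and uploaded as the pipeline definition, for example with the AWS CLI's put-pipeline-definition command. Passing worker_group="my-worker-group" (a hypothetical name) switches the activity to the second option and drops the Ec2Resource object.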

AravindR
  • So let's say I save my SQL script in S3 like they did in that example. How would I actually run it? I'm using the pipeline builder but don't understand where that code would go... – Berra2k Apr 19 '16 at 18:01
  • Do I have to create a resource on EC2 before I try my pipeline? Also, it seems like my default pipeline doesn't have access to my AWS account, and I have no idea why. – Berra2k Apr 20 '16 at 17:01