I have written a Python script that connects to my AWS RDS database, extracts some data, performs some web scraping tasks, and then imports the results back into AWS RDS. This currently works fine locally, but I would now like to upload the code to AWS Lambda and have it run daily.
I acknowledge that there is already a lot of information online about doing this, perhaps so much that it is difficult to find the exact solution.
What I have done so far:
- I have created the AWS RDS database (Fully Working)
- I have created an AWS Lambda function
- I have uploaded the script and associated libraries as a zip file via AWS S3
- I have maxed out the timeout (5 minutes)
I have tested the above with a basic script and that works.
When I run the main script (the relevant code is shown further down), I get the following error:
START RequestId: XXXXX Version: $LATEST
module initialization error: 2003: Can't connect to MySQL server on 'XXXXX.XXXXX.us-east-2.rds.amazonaws.com:3306' (110 Connection timed out)
END RequestId: XXXXX
REPORT RequestId: XXXXX Duration: 140196.56 ms Billed Duration: 140200 ms Memory Size: 256 MB Max Memory Used: 55 MB
module initialization error
2003: Can't connect to MySQL server on 'XXXXX.XXXXX.us-east-2.rds.amazonaws.com:3306' (110 Connection timed out)
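The `110 Connection timed out` part of the message suggests that the TCP connection to port 3306 never opened at all, rather than the login being rejected. A minimal sketch of a reachability check that could be run from inside the Lambda function to confirm this, using only the standard library. The function name `can_reach` and the 5-second default timeout are assumptions made for illustration:

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS resolution failures.
        return False
```

Calling `can_reach('XXXXX.XXXXX.us-east-2.rds.amazonaws.com', 3306)` locally and then inside Lambda would show whether the two environments differ at the network level before any MySQL code is involved.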
In my code I connect to the AWS Database in two ways:
The first uses mysql-connector to retrieve a whole dataset for processing:
import mysql.connector
import pandas as pd
# Open a connection to the RDS instance (credentials redacted)
cnx = mysql.connector.connect(user='XXXXX', password='XXXXX',
                              host='XXXXX.XXXXX.us-east-2.rds.amazonaws.com',
                              port=3306,
                              database='XXXXX',
                              use_unicode=True)
cursor = cnx.cursor(buffered=True)
# Load the whole table into a DataFrame
df = pd.read_sql('SELECT * FROM table', con=cnx)
Again, this code does work locally.
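Because the log says `module initialization error`, the connection above appears to be opened at import time, before the handler is ever called. A minimal sketch of opening it lazily inside the handler instead, with a short `connection_timeout` so that a failure is reported quickly. The environment variable names (`DB_USER` and so on), the 10-second timeout, the `SELECT 1` query and the optional `connect` parameter (there only so the function can be exercised without a database) are all assumptions, not part of the original script:

```python
import os

_connection = None  # reused while the same Lambda container stays warm


def get_connection(connect=None):
    """Return a cached connection, opening it on first use."""
    global _connection
    if _connection is None:
        if connect is None:
            # Imported lazily so the module itself loads without a database.
            import mysql.connector
            connect = mysql.connector.connect
        _connection = connect(
            user=os.environ.get("DB_USER", "XXXXX"),          # hypothetical variable names
            password=os.environ.get("DB_PASSWORD", "XXXXX"),
            host=os.environ.get("DB_HOST", "XXXXX"),
            port=3306,
            database=os.environ.get("DB_NAME", "XXXXX"),
            connection_timeout=10,  # seconds; fail fast instead of waiting minutes
        )
    return _connection


def lambda_handler(event, context, connect=None):
    """Entry point: connect on demand and run a trivial query."""
    cnx = get_connection(connect)
    cursor = cnx.cursor(buffered=True)
    cursor.execute("SELECT 1")
    row = cursor.fetchone()
    cursor.close()
    return {"ok": row is not None}
```

This does not fix an unreachable database, but any connection error would then surface as a normal exception from the handler after about 10 seconds rather than as a module initialization error after more than two minutes.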
The second way I connect to the AWS RDS database is when inserting the results into the table:
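from sqlalchemy import create_engine
# Build the engine URL as user:password@host:port/database (values redacted)
engine = create_engine('mysql+mysqlconnector://{0}:{1}@{2}/{3}'.format(
    'XXXXX', 'XXXXX',
    'XXXXX.XXXXX.us-east-2.rds.amazonaws.com:3306', 'XXXXX'))
# Append the scraped results to the existing table
df.to_sql(con=engine, name='table', if_exists='append', index=False)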
Again, this also works locally but does not seem to work on AWS Lambda.
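As a side note on the engine URL above: formatting the credentials straight into the string can break if the password contains characters such as `@` or `/`. A minimal sketch of a helper that percent-encodes them first, assuming the same `mysql+mysqlconnector` URL scheme. The helper name `build_mysql_url` is hypothetical:

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database):
    """Build a SQLAlchemy URL, percent-encoding the user name and password."""
    return "mysql+mysqlconnector://{0}:{1}@{2}:{3}/{4}".format(
        quote_plus(user), quote_plus(password), host, port, database)
```

The result can be passed directly to `create_engine`. This is unrelated to the timeout itself, but it removes one possible source of confusion when the same code is moved between environments.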
I am very new to working with AWS. There seem to be a lot of different features and options, and I apologise if I have missed something obvious. If there are some steps or options I need to enable, please let me know. Any help would be appreciated.
Extra info:
- I am using Python 3.6
- I am using the free tier of AWS