The simplest way to automate a Jupyter notebook on your SageMaker notebook instance is to schedule a cron job on the instance itself. You can set that up either manually or via a lifecycle configuration (LCC) script.
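As a rough sketch of the LCC route, the helper below builds an on-start script that appends a cron entry for ec2-user and registers it with boto3's create_notebook_instance_lifecycle_config. The function names, the 06:00 daily schedule, and the jupyter path (/home/ec2-user/anaconda3/bin/jupyter) are assumptions for illustration; check the actual path on your instance with `which jupyter`.

```python
import base64


def build_on_start_script(notebook_path, schedule="0 6 * * *",
                          jupyter_bin="/home/ec2-user/anaconda3/bin/jupyter"):
    """Return a shell script that appends a cron entry for ec2-user.

    `schedule` and `jupyter_bin` are illustrative defaults (assumptions),
    not values taken from the SageMaker documentation.
    """
    job = (
        f"{schedule} {jupyter_bin} nbconvert --execute --to notebook "
        f"--inplace {notebook_path}"
    )
    return (
        "#!/bin/bash\n"
        # Keep any existing crontab entries and append the new job.
        "{ crontab -u ec2-user -l 2>/dev/null || true; "
        f'echo "{job}"; }} | crontab -u ec2-user -\n'
    )


def encode_for_lifecycle_config(script):
    """The lifecycle configuration API expects base64-encoded script content."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def register_lifecycle_config(config_name, script):
    """Create the lifecycle configuration (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    sm_client = boto3.client("sagemaker")
    sm_client.create_notebook_instance_lifecycle_config(
        NotebookInstanceLifecycleConfigName=config_name,
        OnStart=[{"Content": encode_for_lifecycle_config(script)}],
    )
```

Calling register_lifecycle_config("<My_LCC_name>", build_on_start_script("/home/ec2-user/SageMaker/automated-nb.ipynb")) and then attaching that configuration to the notebook instance would install the cron entry each time the instance starts.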
Another way to schedule notebook jobs is via the new notebook jobs feature of SageMaker Studio, which creates a training job to execute the Jupyter notebook at the scheduled times.
Lastly, if you have a notebook instance up and running that contains the Jupyter notebook or other scripts you'd like to execute through Lambda, here is a solution I was able to test on my end, taking inspiration from this article.
I created a Lambda function with Python 3.9 in the same private subnet (with a NAT gateway) and the same self-referencing security group as the notebook instance, so that Lambda can connect to it.
The Lambda's execution role needs IAM permission to create a presigned URL for your notebook instance (the sagemaker:CreatePresignedNotebookInstanceUrl action).
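For reference, a minimal sketch of a policy statement granting that permission might look like the following. The helper name, region, account ID, and instance name are placeholders I made up, and the resource ARN format should be confirmed against the ARN shown for your notebook instance in the console.

```python
import json


def presigned_url_policy(region, account_id, notebook_instance_name):
    """Build an IAM policy document allowing presigned URL creation.

    All three arguments are placeholders supplied by the caller.
    """
    arn = (
        f"arn:aws:sagemaker:{region}:{account_id}:"
        f"notebook-instance/{notebook_instance_name}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:CreatePresignedNotebookInstanceUrl",
                "Resource": arn,
            }
        ],
    }


if __name__ == "__main__":
    # Example values only; substitute your own region, account, and instance.
    policy = presigned_url_policy("us-east-1", "123456789012", "my-notebook-instance")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the execution role as an inline policy.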
Lambda code to execute the Jupyter notebook named automated-nb.ipynb inside the notebook instance:
import asyncio

import boto3
import requests
from websockets import connect


async def remote_script(notebook, cmd):
    # Ask SageMaker for a short-lived presigned URL to the notebook instance.
    sm_client = boto3.client('sagemaker')
    url = sm_client.create_presigned_notebook_instance_url(
        NotebookInstanceName=notebook
    )['AuthorizedUrl']
    print(f"AuthorizedURL: '{url}'")

    # Split "https://<host>?..." into the scheme ("https:") and the bare hostname.
    url_tokens = url.split('/')
    http_proto = url_tokens[0]
    http_hn = url_tokens[2].split('?')[0].split('#')[0]

    # Opening the presigned URL sets the session cookies that Jupyter expects.
    s = requests.Session()
    s.get(url)
    cookies = "; ".join(f"{key}={value}" for key, value in s.cookies.items())
    print(f"Cookies: '{cookies}'")

    # WebSocket endpoint of Jupyter terminal "1" on the notebook instance.
    uri = f"wss://{http_hn}/terminals/websocket/1"
    print(f"URI: '{uri}'")

    try:
        # Note: depending on the installed websockets version, the
        # `extra_headers` keyword may be named `additional_headers` instead.
        async with connect(
            uri,
            origin=http_proto + "//" + http_hn,
            extra_headers={'Cookie': cookies, 'Host': http_hn}
        ) as websocket:
            print(f"Connected to WebSocket {uri}.")
            await websocket.send(cmd)
            # Non-blocking pause so the terminal has time to receive the command.
            await asyncio.sleep(1)
        # Leaving the `async with` block closes the connection.
        print("WebSocket closed.")
    except Exception as e:
        print("Exception:")
        print(e)


def lambda_handler(event, context):
    print("About to execute `asyncio.run`...")
    asyncio.run(remote_script(
        notebook="<My_Notebook_instance>",
        cmd="""[ "stdin", "jupyter nbconvert --execute --to notebook --inplace /home/ec2-user/SageMaker/automated-nb.ipynb --ExecutePreprocessor.kernel_name=python3 --ExecutePreprocessor.timeout=1500\\r" ]"""
    ))
    return None
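The message sent over the terminal WebSocket is a JSON array of the form ["stdin", "<command>\r"], and the endpoint is derived from the presigned URL's hostname. As an optional variation on the code above, the sketch below builds both with the standard library instead of hand-written strings; the helper names terminal_endpoint and stdin_message are my own, not part of any library.

```python
import json
from urllib.parse import urlsplit


def terminal_endpoint(authorized_url, terminal_name="1"):
    """Return (origin, websocket_uri) for a Jupyter terminal.

    `terminal_name` defaults to "1", matching the endpoint used above.
    """
    parts = urlsplit(authorized_url)
    origin = f"{parts.scheme}://{parts.netloc}"
    uri = f"wss://{parts.netloc}/terminals/websocket/{terminal_name}"
    return origin, uri


def stdin_message(command):
    """Encode a shell command as a terminal "stdin" message.

    The trailing carriage return plays the role of pressing Enter.
    """
    return json.dumps(["stdin", command + "\r"])


if __name__ == "__main__":
    # Hypothetical hostname, for illustration only.
    print(terminal_endpoint("https://example.notebook.us-east-1.sagemaker.aws?authToken=abc"))
    print(stdin_message("jupyter nbconvert --execute --to notebook --inplace "
                        "/home/ec2-user/SageMaker/automated-nb.ipynb"))
```

Letting json.dumps handle the escaping avoids having to write the \\r sequence by hand when the command changes.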
You'll also need to install the requests and websockets packages for the Lambda to work. You can follow the documentation here.
Here is how I performed it on my end:
pip install --target ./packages requests websockets
cd packages
zip -r ../my-deployment-packages.zip .
cd ..
zip -g my-deployment-packages.zip lambda_function.py
aws lambda update-function-code --function-name <My_Lambda_Function> --zip-file fileb://my-deployment-packages.zip