
I have a notebook on a SageMaker notebook instance that runs some logic and updates files on S3 accordingly. I would like to automate that notebook to run at scheduled times, so I copied the script to S3, created a Lambda that runs the notebook saved on S3, and then created an EventBridge rule that invokes this Lambda on a given schedule.

In my EventBridge rule, I set the target to the Lambda, and under additional settings I configure the target input as a constant (JSON text), where I specify the S3 URI of my script.
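For context, EventBridge delivers a constant (JSON text) target input verbatim as the handler's event argument; here is a minimal sketch of how such an input is read (the key name script_s3_uri is just an illustrative placeholder, not my actual setup):

def lambda_handler(event, context):
    # EventBridge passes the constant (JSON text) input verbatim as `event`,
    # e.g. {"script_s3_uri": "s3://my-bucket/automated-nb.ipynb"} (hypothetical key).
    script_s3_uri = event["script_s3_uri"]
    print(f"Notebook to run: {script_s3_uri}")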

Checking the logs, the EventBridge rule fires at the right time and the Lambda is triggered, yet the script does not seem to run correctly, as it does not update the files on S3 as expected. I have tested the script separately on the SageMaker notebook and confirmed that it runs correctly and updates the files on S3. So the script runs successfully without errors, the EventBridge rule fires on time, and the Lambda is triggered, yet the changes are not made as expected. What could possibly be going wrong here?

MSS
  • You are saying the lambda is working? If so: no need to post its code. If the lambda potentially has *any* problems please don't post minified / uglified / obfuscated code. – luk2302 May 01 '23 at 18:57
  • We'll need debugging details. The script, the lambda invocation and logs, anything would be helpful. You will probably find the information you need in the log output for the lambda. – theherk May 01 '23 at 19:22
  • Please post a [mre] so we can help you out. – Anon Coward May 01 '23 at 20:40
  • I didn't post the code so as not to distract you, per @luk2302's suggestion, since I know the problem is not with the code: the notebook runs successfully, the rule is invoked and the Lambda is triggered, and this whole scenario has worked for multiple other scripts before, so we know for sure the Lambda code is working properly as well. I assume there is probably a technical issue within the loop that I don't get, so I need some AWS knowledge as to what might possibly cause such an issue; I have searched a lot but never got to the cause. – MSS May 01 '23 at 22:21

2 Answers


The simplest way to automate the Jupyter notebook on your SageMaker notebook instance would be to schedule a cron job on the NB instance itself; you can do that either manually or via an LCC (Lifecycle Configuration) script.
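As a rough sketch of the LCC route with boto3 (the config name, cron schedule, and file paths below are assumptions for a standard notebook instance; the OnStart content must be base64-encoded, and it runs as root, hence crontab -u ec2-user):

import base64
import boto3

# OnStart script: register a cron entry that executes the notebook daily at 06:00 UTC.
on_start = """#!/bin/bash
(crontab -u ec2-user -l 2>/dev/null; echo "0 6 * * * /home/ec2-user/anaconda3/bin/jupyter nbconvert --execute --to notebook --inplace /home/ec2-user/SageMaker/automated-nb.ipynb") | crontab -u ec2-user -
"""

sm_client = boto3.client('sagemaker')
sm_client.create_notebook_instance_lifecycle_config(
    NotebookInstanceLifecycleConfigName='schedule-automated-nb',  # hypothetical name
    OnStart=[{'Content': base64.b64encode(on_start.encode('utf-8')).decode('utf-8')}],
)

Attach the config to the notebook instance (for example via update-notebook-instance), and the cron entry will be re-registered on every start.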

Another way to schedule notebook jobs is via the new notebook jobs feature of SageMaker Studio, which creates a SageMaker Training job to execute the Jupyter notebook at the scheduled times.

Lastly, if you have a notebook instance up and running that contains the Jupyter notebook or other scripts you'd like to execute through Lambda, here is a solution I was able to test on my end, taking inspiration from this article.

I created a Lambda function with Python 3.9 in the same private subnet (with a NAT gateway) and the same self-referencing security group as the NB instance (so that the Lambda can reach the NB instance). The Lambda's execution role needs IAM permission to create a presigned URL for your NB instance.
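For reference, a minimal policy statement for that permission might look like the following (region, account ID, and instance name are placeholders):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:CreatePresignedNotebookInstanceUrl",
      "Resource": "arn:aws:sagemaker:<region>:<account-id>:notebook-instance/<My_Notebook_instance>"
    }
  ]
}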

Lambda code to execute the Jupyter notebook named automated-nb.ipynb inside a NB instance:

import boto3
import requests
import asyncio
from websockets import connect

async def remote_script(notebook, cmd):
    # Create a presigned URL for the notebook instance; it authenticates the
    # HTTP and WebSocket requests against the instance's Jupyter server.
    sm_client = boto3.client('sagemaker')
    notebook_instance_name = notebook
    url = sm_client.create_presigned_notebook_instance_url(NotebookInstanceName=notebook_instance_name)['AuthorizedUrl']
    url_tokens = url.split('/')
    print(f"AuthorizedURL: '{url}'")
    http_proto = url_tokens[0]
    http_hn = url_tokens[2].split('?')[0].split('#')[0]
    # Visit the presigned URL once to pick up the session cookies required
    # for the WebSocket handshake.
    s = requests.Session()
    s.get(url)
    cookies = "; ".join(f"{key}={value}" for key, value in s.cookies.items())
    print(f"Cookies: '{cookies}'")
    
    # Connect to terminal 1 of the Jupyter server's terminal WebSocket endpoint.
    uri = f"wss://{http_hn}/terminals/websocket/1"
    print(f"URI: '{uri}'")
    
    try:
        async with connect(
            uri,
            origin=http_proto + "//" + http_hn,
            extra_headers={'Cookie': cookies, 'Host': http_hn}
        ) as websocket:
            print(f"Connected to WebSocket {uri}.")
            # Send the command, give the terminal a moment to accept it, then
            # close; the command keeps running on the instance after we disconnect.
            await websocket.send(cmd)
            await asyncio.sleep(1)
            await websocket.close()
            print("WebSocket closed.")
    except Exception as e:
        print("Exception:")
        print(e)
        
def lambda_handler(event, context):
    print("About to execute `asyncio.run`...")
    # The payload is a Jupyter terminal WebSocket message: a JSON array of
    # ["stdin", <text>], here running nbconvert to execute the notebook in place.
    asyncio.run(remote_script(notebook="<My_Notebook_instance>", cmd="""[ "stdin", "jupyter nbconvert --execute --to notebook --inplace /home/ec2-user/SageMaker/automated-nb.ipynb --ExecutePreprocessor.kernel_name=python3 --ExecutePreprocessor.timeout=1500\\r" ]"""))
    return None

Also, you'll need to install the requests and websockets packages for the Lambda to work. You can follow the documentation here. Here is how I did it on my end:

# Install the dependencies into a local folder and bundle them with the handler:
pip install --target ./packages requests websockets
cd packages
zip -r ../my-deployment-packages.zip .
cd ..
zip -g my-deployment-packages.zip lambda_function.py

# Upload the deployment package to the Lambda function:
aws lambda update-function-code --function-name <My_Lambda_Function> --zip-file fileb://my-deployment-packages.zip
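Once the code is uploaded, you can smoke-test the function before wiring it to the EventBridge schedule; a minimal sketch with boto3 (the function name is a placeholder):

import boto3

# Invoke the Lambda synchronously with an empty event and check the HTTP status.
lambda_client = boto3.client('lambda')
response = lambda_client.invoke(FunctionName='<My_Lambda_Function>', Payload=b'{}')
print(response['StatusCode'])  # 200 means the synchronous invocation itself succeeded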
Sandman
  • Nice! Thanks for sharing @Sandman. One question though regarding the Studio option: I haven't been able to find the create job option from my notebook. Do you know how to get to that option through SageMaker Studio? – MSS May 03 '23 at 18:01
  • Hi @MSS, you need to click the "calendar" icon to the right of the "git" icon. I took a screenshot at this [link](https://imgur.com/a/zyEVVhq). Then you can check the notebook job runs and definitions from the left side menu item "Notebook jobs". – Sandman May 04 '23 at 00:17
  • Thanks a lot @Sandman for the detailed explanation. Weird though, I don't have this icon; potentially a version issue. – MSS May 08 '23 at 14:22

Adding to @Sandman's answer above, I was able to get to the cause of the problem by investigating the output notebooks. It turned out there was a problem with dependencies, which is why it is always good practice to pin dependencies to specific versions to mitigate such problems. Instead of manually checking output files for errors, though, there is a more efficient approach: set up a failed-job alert through SNS, so that you receive a notification whenever a job fails. That way you can be sure the job is running without having to check on it every now and then or dig through output notebooks after a failure. The following link shows how to set an SNS alert for failed jobs: https://docs.aws.amazon.com/batch/latest/userguide/batch_sns_tutorial.html
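As a rough sketch of that idea with boto3 (the topic name, email address, and rule name are placeholders; since scheduled notebook jobs run as SageMaker Training jobs, the event pattern below matches failed training jobs, and the SNS topic's access policy must also allow events.amazonaws.com to publish to it):

import json
import boto3

sns = boto3.client('sns')
events = boto3.client('events')

# Create the topic and subscribe an email endpoint (confirm the subscription from your inbox).
topic_arn = sns.create_topic(Name='notebook-job-failures')['TopicArn']
sns.subscribe(TopicArn=topic_arn, Protocol='email', Endpoint='you@example.com')

# EventBridge rule that matches failed SageMaker training jobs and notifies the topic.
events.put_rule(
    Name='notebook-job-failed',
    EventPattern=json.dumps({
        'source': ['aws.sagemaker'],
        'detail-type': ['SageMaker Training Job State Change'],
        'detail': {'TrainingJobStatus': ['Failed']},
    }),
)
events.put_targets(Rule='notebook-job-failed', Targets=[{'Id': 'sns-target', 'Arn': topic_arn}])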

MSS