Assume I have three AWS accounts:

  1. DEV (where data scientists use SageMaker notebooks/Studio to actively explore data and develop models)
  2. Test (where model monitoring happens)
  3. Prod (where the accepted model is hosted)

My question is, from an engineering perspective: because data scientists depend on the DEV environment for their day-to-day work, shouldn't it effectively be treated as a production-level account? If ML engineers are also actively trying out and testing new features or resources in this environment, they might disrupt the data scientists' work. Is it good practice to have separate dev/test accounts for this DEV environment?

I couldn't find an architecture design pattern like this online. Can someone advise, please?

wawawa

1 Answer

If you are referring to ML (platform) engineers who set up the environments and create or configure the resources that data scientists use to develop ML models, a good practice is:

  • ML platform engineers develop the infrastructure they need to provision using an IaC approach (e.g. CloudFormation or CDK) in a separate AWS account; let's call it the Governance account.
  • In that account, they also develop the automation needed to provision the platform's resources (e.g. IAM policies, roles, CI/CD pipelines) into the target accounts (Dev, Test, Prod, ...).
  • As a consequence, the target AWS accounts become a variable of this process. For example, if a business unit needs to work in a segregated environment, the engineers provision three (or more) accounts for it and run the provisioning automation against them. If engineers need to test a new setup (e.g. how to enable or configure a new SageMaker feature), they can run the same automation against separate accounts reserved for testing their own changes, so the data scientists' DEV account is not affected.
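
To make the "target accounts as a variable" idea concrete, here is a minimal sketch in Python. It only builds `aws cloudformation deploy` command lines and does not run them. The profile names, region, template file name `ml-platform.yaml`, stack name, and the `Stage` template parameter are all made up for illustration; a real setup might instead use CloudFormation StackSets or a CDK pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetAccount:
    """One AWS account the provisioning automation can be pointed at."""
    name: str     # logical stage name, e.g. "dev"
    profile: str  # hypothetical AWS CLI profile that can deploy into the account
    region: str


def deploy_command(template_file: str, stack_name: str, target: TargetAccount) -> list[str]:
    """Build (but do not run) an `aws cloudformation deploy` call for one target."""
    return [
        "aws", "cloudformation", "deploy",
        "--template-file", template_file,
        "--stack-name", f"{stack_name}-{target.name}",
        # "Stage" is an assumed parameter of the hypothetical template.
        "--parameter-overrides", f"Stage={target.name}",
        "--capabilities", "CAPABILITY_NAMED_IAM",
        "--region", target.region,
        "--profile", target.profile,
    ]


# Accounts used by the data scientists / business unit (placeholder profiles).
BUSINESS_TARGETS = [
    TargetAccount("dev", "bu-dev", "eu-west-1"),
    TargetAccount("test", "bu-test", "eu-west-1"),
    TargetAccount("prod", "bu-prod", "eu-west-1"),
]

# Separate account where platform engineers trial changes first (placeholder).
SANDBOX_TARGETS = [TargetAccount("sandbox", "platform-sandbox", "eu-west-1")]


def plan(targets: list[TargetAccount]) -> list[list[str]]:
    """Same template and automation; only the list of targets changes."""
    return [deploy_command("ml-platform.yaml", "ml-platform", t) for t in targets]


if __name__ == "__main__":
    # Engineers try a change in the sandbox before touching the DEV account.
    for cmd in plan(SANDBOX_TARGETS):
        print(" ".join(cmd))
```

The point of the sketch is that the sandbox run and the Dev/Test/Prod rollout share one code path, so a new SageMaker configuration can be tried in the sandbox without touching the account the data scientists rely on.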

You can find additional guidance in this blog post: https://aws.amazon.com/blogs/machine-learning/mlops-foundation-roadmap-for-enterprises-with-amazon-sagemaker/

Hope this helps.