
Can SageMaker Training keep training data on NVMe volumes on compatible instances (e.g. g4dn and p3dn)? If so, is there an appropriate way to programmatically access that data?

1 Answer


Yes. On all Nitro-based instances (which include g4dn and p3dn), EBS volumes are exposed as NVMe block devices.
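As a quick sanity check (not SageMaker-specific), you can list the NVMe block devices the instance exposes from inside the training container; the `/dev/nvme*n1` glob below is the standard Linux naming for NVMe namespace devices:

```python
import glob

# On Nitro-based instances, each EBS volume (and any local instance-store
# disk) shows up as an NVMe namespace device such as /dev/nvme0n1.
nvme_devices = sorted(glob.glob("/dev/nvme*n1"))
print(nvme_devices)
```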

In the SageMaker Python SDK, you can specify the size of that volume with `train_volume_size`. The training channel data is downloaded onto the EBS (NVMe-backed) volume, and the container exposes its path through the `SM_CHANNEL_TRAIN` environment variable; when you actually run, you pass that path to your training code (here as `--data_dir`).

Code example below:

from sagemaker.tensorflow import TensorFlow

def main(aws_region, s3_location, instance_count):
    estimator = TensorFlow(
        train_instance_type='ml.p3.16xlarge',
        train_volume_size=200,  # size (GB) of the NVMe-backed EBS volume
        train_instance_count=int(instance_count),
        framework_version='2.2',
        py_version='py3',
        image_name="231748552833.dkr.ecr.%s.amazonaws.com/sage-py3-tf-hvd:latest" % aws_region,
    )

And then in your entry script

import os
import subprocess

train_dir = os.environ.get('SM_CHANNEL_TRAIN')
subprocess.call(['python', '-W', 'ignore',
                 'deep-learning-models/legacy/models/resnet/tensorflow2/train_tf2_resnet.py',
                 "--data_dir=%s" % train_dir])
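A minimal sketch of reading the channel path on its own, with a fallback to the container's conventional mount point (assuming the default channel name `train`, which SageMaker maps to `SM_CHANNEL_TRAIN`):

```python
import os

# SageMaker exposes each input channel via SM_CHANNEL_<NAME>; the default
# channel "train" maps to SM_CHANNEL_TRAIN. Inside the training container
# the channel data is mounted under /opt/ml/input/data/<channel>, which
# lives on the EBS volume sized by train_volume_size.
train_dir = os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train")
print(train_dir)
```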
juvchan