
I have trained a model using the AWS-supplied semantic segmentation algorithm inside a notebook. Feeding 512x512 images to this network, which was trained on images of the same size, takes approximately 10 seconds per image. Feeding a 1024x512 image takes about double that time.

This felt like an absurd amount of time, so I dug deeper and loaded the model onto an EC2 instance using GluonCV and MXNet, which the AWS semantic segmentation algorithm is built upon.

There I found a ctx flag that declares whether to use CPU or GPU. I have found no such flag anywhere on AWS, so my assumption was that this must be handled in the background depending on which instance type I choose to run on.

However, when loading my model (trained on a notebook) onto an EC2 instance set up for GPU, I get the following error: "RuntimeError: Parameter 'fcn0_resnetv1s_conv0_weight' was not initialized on context gpu(0). It was only initialized on [cpu(0)]."

I interpret this as the network running solely on the CPU, which in turn explains why it takes 10 seconds to feed a 512x512 image through the network.

Am I missing something here? How do I get the AWS-supplied semantic segmentation algorithm to run on GPU?

Regards, C

1 Answer


According to its documentation, SageMaker Semantic Segmentation supports both CPU and GPU for inference.

SageMaker built-in algorithm containers cannot be deployed in notebooks; they deploy only via Hosting Endpoints or Batch Transform. So if you want to deploy to GPU, you need to specify a GPU-backed instance type in your model.deploy() call, or in the endpoint creation SDK call (if not using the Python SDK), as sketched below.
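For example, a minimal sketch with the SageMaker Python SDK might look like this; it assumes a trained estimator object named `ss_estimator` (hypothetical name), and the instance type shown is just one possible GPU-backed choice:

```python
# Deploy the trained semantic segmentation model to a GPU-backed hosting endpoint.
# `ss_estimator` is assumed to be an already-fitted SageMaker estimator.
predictor = ss_estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.p3.2xlarge",  # GPU instance; pick any supported GPU type
)
```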

Some algorithms have reasonably transparent internals (like the Semantic Segmentation algorithm), which may enable you to load their artifacts offline, for example in a notebook or in your own custom environment.

In that case, in order to run GPU inference yourself, you need to have both the model and the inputs in the GPU context. To move the model to the GPU you can use net.collect_params().reset_ctx(mxnet.gpu()).
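A minimal sketch of what that could look like, assuming the model artifact has already been deserialized into a GluonCV network object `net` and an input image is available as an MXNet NDArray `img` (both names are placeholders):

```python
import mxnet as mx

ctx = mx.gpu(0)

# Move every parameter of the network onto the GPU context
net.collect_params().reset_ctx(ctx)

# The input must live in the same context as the parameters,
# otherwise MXNet raises the context mismatch error seen in the question
img_gpu = img.as_in_context(ctx)

output = net(img_gpu)
```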

Olivier Cruchant