
I'm trying to host a PaddleOCR model on an AWS SageMaker endpoint, using two different configurations:

  1. Normal (large):

  2. Slim:

The Normal (1) model deploys successfully, and inference takes ~1.5 s. The Slim (2) model also deploys successfully, but when the endpoint is invoked, inference runs for a very long time and fails after 60 s.

System configuration:

  • ml.t2.medium SageMaker instance
  • container 763104351884.dkr.ecr.us-east-2.amazonaws.com/pytorch-inference:2.0.1-cpu-py310-ubuntu20.04-sagemaker
  • Paddle: 2.5.0
  • PaddleOCR: 2.6.0.1

Why isn't the Slim model able to run?
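For reference, this is roughly how the endpoint is invoked (a sketch: the endpoint name, content type, and payload format are placeholders for my actual setup, and the runtime client is passed in explicitly so the call can be exercised without AWS credentials):

```python
import json


def invoke_ocr(client, endpoint_name, image_bytes):
    """Invoke a SageMaker real-time endpoint and parse the JSON response.

    `client` is a boto3 "sagemaker-runtime" client (injected rather than
    created here). Real-time endpoints enforce a hard 60 s invocation
    timeout, which is where the Slim model's requests fail.
    """
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/octet-stream",
        Body=image_bytes,
    )
    return json.loads(response["Body"].read())


# Real usage (requires AWS credentials; names are illustrative):
# import boto3
# client = boto3.client("sagemaker-runtime", region_name="us-east-2")
# result = invoke_ocr(client, "my-ocr-endpoint", open("page.png", "rb").read())
```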

Alcibiades
  • 1. What is the local inference time before moving to SageMaker? 2. RealTime endpoints have a timeout of 60s; if your requirement is higher, check out Async Inference, whose timeout is 60 mins. 3. Try to use the latest DLCs here - https://github.com/aws/deep-learning-containers/blob/master/available_images.md – Raghu Ramesha Jul 12 '23 at 21:41
  • 1. < 1s 2. My requirement is that the model finishes in <2s. Therefore I'm using the larger model, as it somehow runs faster 3. I'm already using the latest DLC (as of 07/17/2023) – Alcibiades Jul 17 '23 at 16:32
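The "< 1s" local figure in the comment above can be measured with a small timing helper before deploying (a sketch; the callable passed in stands in for the actual PaddleOCR inference call, e.g. `lambda: ocr.ocr(img)`):

```python
import time


def measure_latency(fn, *args, warmup=1, runs=5):
    """Average wall-clock seconds per call of `fn(*args)`.

    Runs `warmup` untimed calls first so one-time costs (model load,
    JIT warm-up) don't skew the average, then times `runs` calls.
    """
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) / runs
```

Comparing this number between the Normal and Slim models locally, on a machine as small as a t2.medium, would show whether the 60 s failure is a model problem or an endpoint problem.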

0 Answers