Questions tagged [torchserve]
39 questions
3 votes · 2 answers
NVIDIA Triton vs TorchServe for SageMaker Inference
NVIDIA Triton vs TorchServe for SageMaker inference: when should each be recommended?
Both are modern, production-grade inference servers. TorchServe is the default DLC inference server for PyTorch models. Triton is also supported for PyTorch inference on…

juvchan · 6,113 · 2 · 22 · 35
1 vote · 1 answer
How to create a handler for Hugging Face model deployment using TorchServe
I'm attempting to serve a pretrained Hugging Face model with TorchServe, and I've managed to save the model as a TorchScript file (.pt). However, I do not know what the handler would look like for such a model. This seems to be a requirement for the…
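For context, a TorchServe custom handler follows a four-phase contract: initialize, preprocess, inference, postprocess. The sketch below is a standalone, hypothetical illustration of that contract — real handlers usually subclass `ts.torch_handler.base_handler.BaseHandler`, and the class name and the way the model is injected here are assumptions made so the example is self-contained.

```python
# Minimal sketch of a TorchServe-style custom handler (hypothetical names).
# Real handlers usually subclass ts.torch_handler.base_handler.BaseHandler;
# this standalone version only illustrates the four-phase contract.

class TextModelHandler:  # hypothetical name
    def __init__(self):
        self.model = None
        self.initialized = False

    def initialize(self, context):
        # In TorchServe, context.system_properties carries model_dir and
        # gpu_id, and the TorchScript file is loaded with torch.jit.load.
        # Here, for illustration, the context is assumed to BE the model.
        self.model = context
        self.initialized = True

    def preprocess(self, data):
        # TorchServe passes a list of request dicts with "data"/"body" keys.
        texts = []
        for row in data:
            body = row.get("data") or row.get("body")
            if isinstance(body, (bytes, bytearray)):
                body = body.decode("utf-8")
            texts.append(body)
        return texts

    def inference(self, inputs):
        return [self.model(x) for x in inputs]

    def postprocess(self, outputs):
        # Must return one response item per request in the batch.
        return outputs

    def handle(self, data, context):
        if not self.initialized:
            self.initialize(context)
        return self.postprocess(self.inference(self.preprocess(data)))
```

With a stand-in model such as `str.upper`, `TextModelHandler().handle([{"body": b"hello"}], str.upper)` returns one response per request.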

maxwellspi · 11 · 2
1 vote · 1 answer
What is the purpose of creating a Python class that inherits from `abc.ABC` but has no `abstractmethod`?
I've read the source of TorchServe's default handlers and found that BaseHandler inherits from abc.ABC but doesn't have any abstract methods. The same is true of VisionHandler.
What could be the reason, and when should I use abc.ABC without…
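A short sketch of the behaviour in question: inheriting from `abc.ABC` sets `ABCMeta` as the metaclass and documents "subclass me" intent, but with no abstract methods the class remains instantiable; only once a subclass (or a later version of the base) declares an `@abstractmethod` does instantiation start to be blocked. The class names below are invented for illustration.

```python
import abc

class BaseHandler(abc.ABC):
    """Inheriting abc.ABC installs ABCMeta as the metaclass, so abstract
    methods can be added later without changing the class hierarchy, and
    the class is clearly marked as a base meant for subclassing."""
    def handle(self, data):
        return data

# With no @abstractmethod, the ABC is still instantiable:
base = BaseHandler()
assert base.handle(1) == 1

class StrictHandler(BaseHandler):
    # Once a subclass declares an abstract method, *that* subclass can no
    # longer be instantiated until the method is overridden.
    @abc.abstractmethod
    def preprocess(self, data): ...

try:
    StrictHandler()
except TypeError as exc:
    print("cannot instantiate:", exc)
```

So an ABC without abstract methods costs nothing at runtime while keeping the option open to enforce an interface later.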

feeeper · 2,865 · 4 · 28 · 42
1 vote · 0 answers
TorchServe metrics on Prometheus using Kubernetes
I have a TorchServe service running on Kubernetes, and I am already able to track its metrics on port 8082. My problem is that from the Kubernetes pod I can see it logs hardware metrics like:
[INFO ] pool-3-thread-2 TS_METRICS -…
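For reference, TorchServe's metrics API (default port 8082) can expose Prometheus-format metrics at `/metrics`; the fragment below is a minimal scrape-config sketch, in which the job name, service DNS name, and namespace are assumptions. Newer TorchServe versions additionally require `metrics_mode=prometheus` in `config.properties` to report metrics there rather than only in the log.

```yaml
# Hypothetical Prometheus scrape config for TorchServe's metrics API.
# Service name/namespace are assumptions; in-cluster setups often use a
# ServiceMonitor instead when the Prometheus Operator is installed.
scrape_configs:
  - job_name: torchserve
    metrics_path: /metrics
    static_configs:
      - targets: ["torchserve.default.svc.cluster.local:8082"]
```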

Prosciutt0 · 55 · 8
1 vote · 1 answer
Is TorchServe best practice for Vertex AI, or unnecessary overhead?
Currently, I am working with a PyTorch model locally using the following code:
from transformers import pipeline
classify_model = pipeline("zero-shot-classification", model='models/zero_shot_4.7.0', device=device)
result = classify_model(text,…

Роман Сергеевич · 65 · 4 · 12
1 vote · 1 answer
TorchServe streaming of inference responses with gRPC
I am trying to send a single request to a TorchServe server and retrieve a stream of responses. Processing the request takes some time, and I would like to receive intermediate updates over the course of the run. I am quite new to…

P_Andre · 730 · 6 · 17
1 vote · 1 answer
Google Vertex AI Prediction: Why is TorchServe showing 0 GPUs?
I have deployed a trained PyTorch model to a Google Vertex AI Prediction endpoint. The endpoint is working fine, giving me predictions, but when I examine its logs in Logs Explorer, I see:
INFO 2023-01-11T10:34:53.270885171Z Number of GPUs: 0
INFO…

urig · 16,016 · 26 · 115 · 184
1 vote · 0 answers
Updating KServe from 0.7 to 0.9: my .mar file works on 0.7 but not on 0.9, though the example runs without issue on 0.9
I have been tasked with updating KServe from 0.7 to 0.9. Our company's .mar files run fine on 0.7, and when I update to KServe 0.9 the pods come up without issue. However, when a request is sent it returns a 500 error. The logs are given…

Waqas Shah · 11 · 1
1 vote · 1 answer
Extremely slow BERT inference on TorchServe for random requests
I have deployed BERT Hugging Face models via TorchServe on an AWS EC2 GPU instance.
There are enough resources provisioned; usage of everything is consistently below 50%.
TorchServe performs inference on the BERT models quickly, most of the time below…

sereneSentry · 29 · 6
1 vote · 1 answer
Why doesn't this python aiohttp requests code run asynchronously?
I'm trying to access an API with aiohttp, but something is causing this code to block on each iteration.
def main():
async with aiohttp.ClientSession() as session:
for i, (image, target) in enumerate(dataset_val):
image_bytes =…
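The usual cause of this symptom is awaiting each request inside the loop, so the next request only starts after the previous one finishes; coroutines overlap only when they are scheduled together (e.g. with `asyncio.gather`). The sketch below demonstrates the difference with `asyncio.sleep` standing in for the aiohttp call — the fetch function and timings are assumptions, not the asker's actual API.

```python
import asyncio
import time

async def fetch(i):
    # Stand-in for an aiohttp request; asyncio.sleep simulates network I/O.
    await asyncio.sleep(0.1)
    return i

async def sequential(n):
    # Awaiting inside the loop blocks each iteration until the previous
    # request completes -- the pattern in the question.
    return [await fetch(i) for i in range(n)]

async def concurrent(n):
    # Scheduling all coroutines first lets their waits overlap.
    return await asyncio.gather(*(fetch(i) for i in range(n)))

t0 = time.perf_counter()
asyncio.run(sequential(5))   # five 0.1 s waits back to back
t1 = time.perf_counter()
asyncio.run(concurrent(5))   # five 0.1 s waits overlapping
t2 = time.perf_counter()
print(f"sequential {t1 - t0:.2f}s, concurrent {t2 - t1:.2f}s")
```

`asyncio.gather` also preserves input order in its result list, so downstream indexing stays stable.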

Terv · 11 · 1
1 vote · 1 answer
How do I create a custom handler in TorchServe?
I am trying to create a custom handler in TorchServe.
The custom handler has been modified as follows:
# custom handler file
# model_handler.py
"""
ModelHandler defines a custom model handler.
"""
import os
import soundfile
from…

Takayama-Shin · 13 · 4
1 vote · 1 answer
Logging in Custom Handler for TorchServe
I have written a custom handler for a DL model using torch-serve and am trying to understand how to add manual log messages to the handler. I know that I can simply print any messages and they will show up in the MODEL_LOG logger at level…
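A common alternative to `print` is the stdlib `logging` module: a module-level logger in the handler file emits records that propagate into the worker's log configuration. The sketch below is standalone — the `LoggingHandler` class is a hypothetical stand-in for a real handler, and how the records surface in TorchServe's log files depends on the server's log4j configuration.

```python
import logging

# A module-level logger in the handler file; TorchServe workers configure
# logging for the worker process, so records emitted here propagate there.
logger = logging.getLogger(__name__)

class LoggingHandler:  # hypothetical stand-in for a TorchServe handler class
    def preprocess(self, data):
        logger.info("preprocess received %d request(s)", len(data))
        return data

    def postprocess(self, outputs):
        logger.warning("postprocess emitting %d item(s)", len(outputs))
        return outputs
```

Using `%`-style lazy formatting (rather than f-strings) defers string construction until a log level is actually enabled.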

cotrane · 149 · 4
1 vote · 0 answers
Containerized TorchServe worker downloads a new serialized file on start-up
I am trying to build a container running TorchServe with the pretrained fast-rcnn model for object detection in an all-in-one Dockerfile, based on this…
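The usual way to avoid a worker fetching weights at start-up is to bake a pre-built `.mar` archive (created with `torch-model-archiver`) into the image's model store at build time. The fragment below is a sketch under assumptions: it uses the official `pytorch/torchserve` base image, whose entrypoint script starts the server when given the `serve` command and loads archives from `/home/model-server/model-store`; the archive file name is hypothetical.

```dockerfile
# Hypothetical all-in-one image: the model archive is baked in at build
# time so the worker does not download weights on start-up.
FROM pytorch/torchserve:latest

# Archive built beforehand with torch-model-archiver (name is assumed).
COPY fastrcnn.mar /home/model-server/model-store/

# The base image's entrypoint runs TorchServe when CMD is "serve".
CMD ["serve"]
```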

VEHC · 11 · 3
1 vote · 1 answer
TorchServe: How to convert bytes output to tensors
I have a model that is served using TorchServe. I'm communicating with the TorchServe server using gRPC. The final postprocess method of the custom handler returns a list, which is converted into bytes for transfer over the network.
The post…
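One common pattern for this round trip is to fix a byte layout on the server side and rebuild the values on the client. The sketch below uses stdlib `struct` with an assumed little-endian float32 convention so it works even when the gRPC client has no torch installed; with torch available, `torch.frombuffer` (or `torch.load` on an `io.BytesIO`) would do the client-side reconstruction instead. Both function names are invented for illustration.

```python
import struct

def floats_to_bytes(values):
    # Server side (e.g. at the end of postprocess): pack the flattened
    # tensor values as little-endian float32.
    return struct.pack(f"<{len(values)}f", *values)

def bytes_to_floats(payload):
    # Client side: unpack the gRPC response body back into floats.
    count = len(payload) // 4  # float32 is 4 bytes wide
    return list(struct.unpack(f"<{count}f", payload))

blob = floats_to_bytes([0.5, 1.25, -2.0])
assert bytes_to_floats(blob) == [0.5, 1.25, -2.0]
```

Note the tensor's shape and dtype must travel separately (or be agreed in advance), since raw bytes carry neither.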

Mohit Motwani · 4,662 · 3 · 17 · 45
0 votes · 0 answers
Weird Python error occurring inside of the TorchServe library
I am using BASE_IMAGE=ubuntu:22.04, which comes with Python 3.10.
Then I install all my required dependencies and am able to start the application and listen on port 8080 correctly.
The PyTorch server starts the initialization phase…

gggggggggggggggg · 9 · 2