
I’m trying to deploy my backend on Heroku and am running into the 500 MB slug size limit because my code downloads two tokenizers from Hugging Face. For reference, the two tokenizers are BertTokenizerFast.from_pretrained('bert-base-uncased') and SentenceTransformer('multi-qa-MiniLM-L6-cos-v1').
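
For context, this is roughly how the tokenizers are loaded today (a minimal reproduction; the variable names are just placeholders, not my actual code):

from transformers import BertTokenizerFast
from sentence_transformers import SentenceTransformer

# Both downloads happen at import/startup time, pulling the files
# from the Hugging Face Hub into the dyno
bert_tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
sentence_model = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')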

My requirements.txt file contains the following packages:

fastapi
transformers[torch]
sentence-transformers
requests
uvicorn

Hugging Face has an Inference API for models, but it does not seem to work for tokenizers. What is a good way to structure my architecture to get around the slug size limit? One thing I thought of is creating two separate FastAPI apps to serve the output from each tokenizer (see the sketch below), but I am wondering if there is a better way to do it.
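
To make the idea concrete, here is a minimal sketch of what one of those tokenizer-only services might look like (the /tokenize endpoint and request shape are just illustrative, not something I have built):

from fastapi import FastAPI
from pydantic import BaseModel
from transformers import BertTokenizerFast

app = FastAPI()

# Downloaded once at startup; only this service carries the tokenizer weight
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')

class TokenizeRequest(BaseModel):
    text: str

@app.post('/tokenize')
def tokenize(req: TokenizeRequest):
    # Return the encoding so the main app never needs transformers installed
    encoding = tokenizer(req.text)
    return {'input_ids': encoding['input_ids'],
            'attention_mask': encoding['attention_mask']}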
