
I'm currently working on a text summarizer powered by the Huggingface transformers library. The summarization has to run on-premises, so I have the following code (close to the documentation):

from transformers import BartTokenizer, BartForConditionalGeneration

# Load the distilled BART summarization model and its tokenizer
model = BartForConditionalGeneration.from_pretrained('sshleifer/distilbart-cnn-6-6')
tokenizer = BartTokenizer.from_pretrained('sshleifer/distilbart-cnn-6-6')

# Tokenize the input text, truncating it to the model's 1024-token limit
inputs = tokenizer([myTextToSummarize], max_length=1024, truncation=True, return_tensors='pt')
# Generate the summary with beam search and decode it back to text
summary_ids = model.generate(inputs['input_ids'], num_beams=4, early_stopping=True)
summaries = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids]

My problem is that I cannot load the model in memory and have my server expose an API that directly uses the model and tokenizer. I would like both of them to be initialized in a first process and made available to a second one (the one that will expose the HTTP API). I saw that you can export the model to the filesystem, but again, I don't have access to it (locked k8s environment), and I'd need to store it in a specific database instead.
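For reference, the filesystem export I'm referring to is the standard save_pretrained / from_pretrained round trip (the path below is just an example), which is exactly what I can't use here:

# Standard on-disk export, which requires a writable filesystem
model.save_pretrained('/models/distilbart-cnn-6-6')
tokenizer.save_pretrained('/models/distilbart-cnn-6-6')

# Later, in another process, reload from the same path
model = BartForConditionalGeneration.from_pretrained('/models/distilbart-cnn-6-6')
tokenizer = BartTokenizer.from_pretrained('/models/distilbart-cnn-6-6')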

Is it possible to export both the model and the tokenizer as a string/buffer/something storable in a database?
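To make the question concrete, something along these lines is what I'm hoping for. This is only a minimal sketch: I haven't verified that either object survives pickling, and store_blob / load_blob are hypothetical database helpers.

import pickle

# Process 1: serialize both objects to bytes (untested assumption: they are picklable)
model_bytes = pickle.dumps(model)
tokenizer_bytes = pickle.dumps(tokenizer)
store_blob('summarizer/model', model_bytes)          # hypothetical DB helper
store_blob('summarizer/tokenizer', tokenizer_bytes)  # hypothetical DB helper

# Process 2 (the HTTP API): restore them from the database
model = pickle.loads(load_blob('summarizer/model'))
tokenizer = pickle.loads(load_blob('summarizer/tokenizer'))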

Thanks a lot

ovesco
  • I have never tried it, but maybe you can [pickle it](https://stackoverflow.com/questions/30469575/how-to-pickle-and-unpickle-to-portable-string-in-python-3). – cronoik Apr 26 '21 at 18:47
  • 1
    Didn't think about it I'm gonna give it a try – ovesco Apr 27 '21 at 10:57

0 Answers