
The problem:

I'm trying to deploy a machine-learning application to an infrastructure (namely, Deta Space [https://deta.space/]). However, their resource limit is around 250 MB, and this has been a problem for me. To run a digit classifier (trained on MNIST), I used the latest version of scikit-learn (an MLP network and an SGD model). When I created a virtual environment, I noticed that scikit-learn and its dependencies alone consumed about 248 MB. So when I try to deploy my application on Deta Space, it raises an error for exceeding the space limit, since on top of scikit-learn I also need Flask, among others. Would you have any suggestions on how to tackle this problem?
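To see where the 248 MB goes, you can measure the environment yourself before deploying. A minimal sketch (the function name `env_size_mb` is mine, not part of any library):

```python
import sysconfig
from pathlib import Path

def env_size_mb() -> float:
    """Total size (MB) of everything installed in the current
    environment's site-packages directory."""
    site_packages = Path(sysconfig.get_paths()["purelib"])
    return sum(
        p.stat().st_size for p in site_packages.rglob("*") if p.is_file()
    ) / 1_000_000

print(f"{env_size_mb():.1f} MB")
```

Running this in the virtual environment (or listing per-package sizes the same way) shows which dependencies dominate the 250 MB budget.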

Attempts made so far (and without success):

1. Make a version of the classifier model with TensorFlow and use TensorFlow Lite.

Results:

  • The amount of resources (i.e., the libraries and dependencies installed in the environment) falls below the 250 MB limit.
  • The Deta Space backend raises the following error:
    Error Type: ImportError
    Error Message: /lib64/libm.so.6: version GLIBC_2.27 not found (required by /opt/python/tflite_runtime/_pywrap_tensorflow_interpreter_wrapper.so)
      Note: I researched the subject and tried downgrading tflite_runtime, but the problem persists. I believe their runtime does not provide the glibc version that TensorFlow Lite needs.
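The error means the compiled `tflite_runtime` wheel was built against glibc 2.27 or newer, while the host's C library is older. You can confirm what glibc the target runtime actually ships by running a quick check there, using only the standard library:

```python
import platform

# platform.libc_ver() reports the C library the interpreter is linked
# against; on glibc systems it returns e.g. ("glibc", "2.26").
# If the reported version is below 2.27, no tflite_runtime wheel
# built against glibc >= 2.27 will import there.
libc_name, libc_version = platform.libc_ver()
print(libc_name or "unknown", libc_version or "unknown")
```

If the runtime's glibc is indeed older than 2.27, downgrading tflite_runtime only helps if you find a wheel built against that older glibc (e.g. a manylinux2014-tagged build).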

2. I tried to export/transpile the model from scikit-learn to JavaScript, using the sklearn-porter package [https://pypi.org/project/sklearn-porter/].

Results:   

  • The library is outdated and the generated model raises errors. The official GitHub repository has been unmaintained for quite some time.
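An alternative to transpiling is to export the trained model's parameters and re-implement only the prediction step. For the SGD (linear) model, multi-class prediction is a single matrix product. A sketch, assuming `coef` and `intercept` were saved from a fitted model's `coef_` and `intercept_` attributes (binary models store `coef_` with shape `(1, n_features)` and would need a sign test instead):

```python
import numpy as np

def linear_predict(x: np.ndarray, coef: np.ndarray,
                   intercept: np.ndarray) -> np.ndarray:
    """Replicate multi-class linear prediction from exported parameters.
    coef: (n_classes, n_features); intercept: (n_classes,)."""
    scores = x @ coef.T + intercept   # one decision score per class
    return np.argmax(scores, axis=1)  # highest-scoring class wins
```

The parameters can be shipped as a small .npz file, so the serving environment needs only NumPy rather than all of scikit-learn.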

Attempts in progress: 

  • Evaluate the possibility of a hand-rolled implementation of the classifier model (pure Python, no framework).

The challenge is to build a Python environment under 250 MB that can run models trained in Python with scikit-learn or TensorFlow.
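The same export-the-parameters idea extends to the MLP: save the fitted model's `coefs_` and `intercepts_` lists (e.g. with `np.savez`) at training time, then at serving time depend only on NumPy. A sketch of the forward pass, assuming the default `relu` hidden activation:

```python
import numpy as np

def mlp_predict(x, weights, biases):
    """Forward pass replicating MLP class prediction from exported
    parameters: weights = model.coefs_, biases = model.intercepts_."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(a @ W + b, 0.0)        # hidden layers: relu
    logits = a @ weights[-1] + biases[-1]     # output layer, pre-softmax
    return np.argmax(logits, axis=1)          # softmax is monotone, so
                                              # argmax on logits suffices
```

For the MNIST digit classifier this reduces the serving dependencies to NumPy plus a small weights file, which fits comfortably under the 250 MB limit alongside Flask.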

Any help is appreciated!

Student

1 Answer


250 MB simply isn't enough for almost any machine learning application. Try to add more resources.

Paul Brit
  • When we run ML applications on a cell phone, or in other environments with limited resources, the scarcity problem can be circumvented by changing the tools used (e.g., using TensorFlow Lite). Often, we just need to run "predict" on an already-trained model. The question is more along these lines: if you wanted to embed an ML model on a 250 MB device, what would you do? What architecture/tools would you propose? – Student Jul 21 '23 at 16:03