I am trying to build a simple neural network in MXNet and deploy it on a server using mxnet-model-server.

The main problem is deployment: the model server crashes after I upload the .mar file, and I have no idea what is causing it.

I used the following code to create a custom (but very simple) neural network for testing:

from __future__ import print_function
import numpy as np
import mxnet as mx
from mxnet import nd, autograd, gluon

data_ctx = mx.cpu()
model_ctx = mx.cpu()

# fix the seed
np.random.seed(42)
mx.random.seed(42)

num_examples = 1000

X = mx.random.uniform(shape=(num_examples, 49))
y = mx.random.uniform(shape=(num_examples, 1))
dataset_train = mx.gluon.data.dataset.ArrayDataset(X, y)

dataset_test = dataset_train

data_loader_train = mx.gluon.data.DataLoader(dataset_train, batch_size=25)
data_loader_test = mx.gluon.data.DataLoader(dataset_test, batch_size=25)

num_outputs = 2
net = gluon.nn.HybridSequential()
net.hybridize()
with net.name_scope():
    net.add(gluon.nn.Dense(49, activation="relu"))
    net.add(gluon.nn.Dense(64, activation="relu"))
    net.add(gluon.nn.Dense(num_outputs))

net.collect_params().initialize(mx.init.Normal(sigma=.1), ctx=model_ctx)
softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .01})

epochs = 1
smoothing_constant = .01

for e in range(epochs):
    cumulative_loss = 0
    for i, (data, label) in enumerate(data_loader_train):
        data = data.as_in_context(model_ctx).reshape((-1, 49))
        label = label.as_in_context(model_ctx)
        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label)
        loss.backward()
        trainer.step(data.shape[0])
        cumulative_loss += nd.sum(loss).asscalar()

Next, I exported the model using:

net.export("model_files/my_project")

The result is a .json file and a .params file.
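
A quick way to confirm the export before packaging, as a minimal sketch: it assumes Gluon's default naming for export, which writes <prefix>-symbol.json and <prefix>-0000.params when no epoch is passed. The helper name missing_export_files is hypothetical and only uses the standard library.

```python
from pathlib import Path


def missing_export_files(prefix, epoch=0):
    """Return the exported files that are absent for the given prefix.

    Assumes Gluon's default naming for HybridBlock.export():
    <prefix>-symbol.json and <prefix>-<epoch as 4 digits>.params.
    """
    expected = [
        Path(f"{prefix}-symbol.json"),
        Path(f"{prefix}-{epoch:04d}.params"),
    ]
    return [str(p) for p in expected if not p.is_file()]


if __name__ == "__main__":
    # Prefix taken from the export call in the question.
    missing = missing_export_files("model_files/my_project")
    print("missing files:", missing or "none")
```

If the check reports nothing missing, both files exist under the prefix given to export, which here is model_files/my_project rather than my_project.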

I created a signature.json:

{
  "inputs": [
    {
      "data_name": "data",
      "data_shape": [
        1,
        49
      ]
    }
  ]
}
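
The training data above has 49 features per example, so the last dimension of data_shape should be 49. A small sketch of that consistency check follows; the helper name check_signature is hypothetical, and it inspects only the keys that appear in the signature above.

```python
import json


def check_signature(signature_text, expected_features):
    """Return a list of problems found in a signature.json document.

    Only the keys used in the question's signature are inspected
    ("inputs", "data_name", "data_shape").
    """
    problems = []
    signature = json.loads(signature_text)
    inputs = signature.get("inputs", [])
    if not inputs:
        problems.append("no inputs declared")
    for entry in inputs:
        name = entry.get("data_name", "<unnamed>")
        shape = entry.get("data_shape", [])
        # The last dimension must match the number of input features.
        if not shape or shape[-1] != expected_features:
            problems.append(
                f"{name}: data_shape {shape} does not end in {expected_features}"
            )
    return problems


if __name__ == "__main__":
    text = '{"inputs": [{"data_name": "data", "data_shape": [1, 49]}]}'
    print(check_signature(text, expected_features=49))
```

An empty list means the declared shape agrees with the network input; it says nothing about whether the server accepts the rest of the archive.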

The model handler is the same as in the MXNet tutorial:

# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#     http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
ModelHandler defines a base model handler.
"""
import logging
import time


class ModelHandler(object):
    """
    A base Model handler implementation.
    """

    def __init__(self):
        self.error = None
        self._context = None
        self._batch_size = 0
        self.initialized = False

    def initialize(self, context):
        """
        Initialize model. This will be called during model loading time

        :param context: Initial context contains model server system properties.
        :return:
        """
        self._context = context
        self._batch_size = context.system_properties["batch_size"]
        self.initialized = True

    def preprocess(self, batch):
        """
        Transform raw input into model input data.

        :param batch: list of raw requests, should match batch size
        :return: list of preprocessed model input data
        """
        assert self._batch_size == len(batch), "Invalid input batch size: {}".format(len(batch))
        return None

    def inference(self, model_input):
        """
        Internal inference methods

        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """
        return None

    def postprocess(self, inference_output):
        """
        Return predict result in batch.

        :param inference_output: list of inference output
        :return: list of predict results
        """
        return ["OK"] * self._batch_size

    def handle(self, data, context):
        """
        Custom service entry point function.

        :param data: list of objects, raw input from request
        :param context: model server context
        :return: list of outputs to be send back to client
        """
        self.error = None  # reset earlier errors

        try:
            preprocess_start = time.time()
            data = self.preprocess(data)
            inference_start = time.time()
            data = self.inference(data)
            postprocess_start = time.time()
            data = self.postprocess(data)
            end_time = time.time()

            metrics = context.metrics
            metrics.add_time("PreprocessTime", round((inference_start - preprocess_start) * 1000, 2))
            metrics.add_time("InferenceTime", round((postprocess_start - inference_start) * 1000, 2))
            metrics.add_time("PostprocessTime", round((end_time - postprocess_start) * 1000, 2))

            return data
        except Exception as e:
            logging.error(e, exc_info=True)
            request_processor = context.request_processor
            request_processor.report_status(500, "Unknown inference error")
            return [str(e)] * self._batch_size

Next, I created the .mar file using:

model-archiver --model-name my_project --model-path my_project --handler ssd_service:handle

Then I started the model on the server:

mxnet-model-server --start --model_store my_project --models ssd=my_project.mar

I followed every tutorial at https://github.com/awslabs/mxnet-model-server

However, the server crashes. The output reports that the worker dies, the backend worker dies, and the workers are disconnected, ending with: Load model failed: ssd, error: worker died

I have absolutely no clue what to do so I would be very glad if you helped me out!

Best

Ralf Sürig

1 Answer

I tried out your code and it works fine on my laptop. If I run: curl -X POST http://127.0.0.1:8080/predictions/ssd -F "data=[0 1 2 3 4]", I get: OK%

I can only guess why it doesn't work on your machine:

  1. Note that the model-store argument must be written with a hyphen (-), not an underscore (_) as in your example. My command to run mxnet-model-server looks like this: mxnet-model-server --start --model-store ./ --models ssd=my_project.mar

  2. Which version of mxnet-model-server are you using? The latest is 1.0.2, but I have 1.0.1 installed, so you may want to downgrade and try that: pip install mxnet-model-server==1.0.1.

  3. The same question applies to the MXNet version. I use a nightly build, which I get via pip install mxnet --pre. Your model is very basic, so the version shouldn't matter much. Nevertheless, install 1.4.0 (the current release) just in case.
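
To compare versions for points 2 and 3, the installed packages can be queried from Python. This is a minimal sketch that assumes Python 3.8 or newer (for importlib.metadata) and that both packages were installed with pip under the names used above; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a pip package, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names as used in the pip commands above.
    for package in ("mxnet-model-server", "mxnet"):
        print(package, "->", installed_version(package) or "not installed")
```

Run it in the same environment that starts the server, since a different virtualenv may hold different versions.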

I'm not sure, but I hope this helps.

Sergei