I'm trying to translate a Python script that uses onnxruntime into Rust using tract_onnx. The specific POC I'm trying to implement is the rothe_vgg.py script from the ONNX Model Zoo, which uses three models.

For now, I'm working on just the first model, the face detector. I can get the example Python code to work:

import cv2
import numpy as np
import onnxruntime as ort

face_detector_onnx = "models/version-RFB-320.onnx"
face_detector = ort.InferenceSession(face_detector_onnx)

def faceDetector(orig_image, threshold=0.7):
    # Convert BGR -> RGB and resize to the model's 320x240 input size
    image = cv2.cvtColor(orig_image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (320, 240))
    # Normalize to roughly [-1, 1]
    image_mean = np.array([127, 127, 127])
    image = (image - image_mean) / 128
    # HWC -> CHW, then add a batch dimension: (1, 3, 240, 320)
    image = np.transpose(image, [2, 0, 1])
    image = np.expand_dims(image, axis=0)
    image = image.astype(np.float32)

    input_name = face_detector.get_inputs()[0].name
    confidences, boxes = face_detector.run(None, {input_name: image})
    boxes, labels, probs = predict(orig_image.shape[1], orig_image.shape[0], confidences, boxes, threshold)
    return boxes, labels, probs

I'm basing my tract_onnx translation on the onnx-mobilenet-v2 example. My version currently looks like this:

let model = onnx()
    .model_for_path("version-RFB-320.onnx")?
    .with_input_fact(
        0,
        InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 240, 320)),
    )?
    .into_optimized()?
    .into_runnable()?;

let image = image::open("bruce.jpg").unwrap().to_rgb8();
let resized = image::imageops::resize(&image, 240, 320, ::image::imageops::FilterType::Triangle);

let image: Tensor = tract_ndarray::Array4::from_shape_fn((1, 3, 240, 320), |(_, c, y, x)| {
    resized[(x as _, y as _)][c] as f32 / 255.0
}).into();

let result = model.run(tvec!(image))?;

I'm running into an issue with the translation of the resized image into a tensor:

thread 'main' panicked at 'Image index (240, 0) out of bounds (240, 320)'.

Is this an issue of not having the right dimensions or the right ordering of each dimension? Am I missing something?

I know I haven't yet implemented the other transformations, which leads to my next questions: how can I properly normalize with image_mean, transpose, and expand the dimensions?
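
For reference, the panic looks consistent with swapped resize arguments: image::imageops::resize takes (width, height), so a 320x240 input needs 320, 240 rather than 240, 320. Below is a minimal sketch of the full preprocessing under that assumption, in the same tract_onnx style as the mobilenet example; building the array directly in (1, 3, 240, 320) shape stands in for the Python transpose and expand_dims calls, and the closure applies the (pixel - 127) / 128 normalization:

use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("version-RFB-320.onnx")?
        .with_input_fact(
            0,
            InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 240, 320)),
        )?
        .into_optimized()?
        .into_runnable()?;

    let image = image::open("bruce.jpg").unwrap().to_rgb8();
    // resize() takes (width, height): 320 wide by 240 tall.
    let resized = image::imageops::resize(&image, 320, 240, image::imageops::FilterType::Triangle);

    // Filling a (1, 3, 240, 320) array directly yields the NCHW layout,
    // making the Python transpose and expand_dims implicit; the closure
    // body performs the (pixel - 127) / 128 normalization.
    let tensor: Tensor = tract_ndarray::Array4::from_shape_fn((1, 3, 240, 320), |(_, c, y, x)| {
        (resized[(x as u32, y as u32)][c] as f32 - 127.0) / 128.0
    })
    .into();

    // The two outputs should correspond to the confidences and boxes
    // returned by face_detector.run() in the Python version.
    let result = model.run(tvec!(tensor))?;
    println!("{:?}", result);
    Ok(())
}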

user655321
  • Are you sure that the order of the dimensions is the same when manipulating images and arrays? E.g. are you sure that images are not `width, height` and arrays `rows, columns`? – Jmb May 20 '21 at 07:07

1 Answer

Check out the project https://github.com/recoai/visual-search. It implements several deep learning models in Rust using ONNX. You can configure the image transformations for any deep learning model, or use one of the four preconfigured models.

Here is an example of how to configure a model. The project doesn't support segmentation models, but the image transformation pipeline works.

let model_config = ModelConfig {
    model_name: "SqueezeNet".into(),
    model_url: "https://github.com/onnx/models/raw/master/vision/classification/squeezenet/model/squeezenet1.1-7.onnx".into(),
    image_transformation: TransformationPipeline {
        steps: vec![
            ResizeRGBImageAspectRatio { image_size: ImageSize { width: 224, height: 224 }, scale: 87.5, filter: FilterType::Nearest }.into(),
            CenterCrop { crop_size: ImageSize {width: 224, height: 224} }.into(),
            ToArray {}.into(),
            Normalization { sub: [0.485, 0.456, 0.406], div: [0.229, 0.224, 0.225], zeroone: true }.into(),
            ToTensor {}.into(),
        ]
    },
    image_size: ImageSize { width: 224, height: 224 },
    layer_name: Some("squeezenet0_pool3_fwd".to_string()),
    channels: Channels::CWH
};
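
Loosely mapping this onto the Python code in the question: the resize and crop steps play the role of cv2.resize, Normalization covers the image_mean subtraction and division (the sub/div values shown here are the standard ImageNet statistics for SqueezeNet, not the 127/128 values the face detector expects), and ToTensor covers the transpose and expand_dims calls.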
Pawel