
I have been trying to understand how to load an ONNX model in C++ using Visual Studio, provide input to it, and inspect its output, but I cannot find any explanation of how to feed an input to an ONNX model.

This is the latest Code:

#include <GLFW/glfw3.h>
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

#include <iostream>
#include <onnxruntime_cxx_api.h>
#include <dml_provider_factory.h>
#include <vector>

std::vector<float> load_image_and_preprocess(const std::string& filename, int& x, int& y)
{
    std::vector<float> ret;
    int n;
    // stbi_load fills in x, y and n; the final argument forces 3 channels (RGB)
    unsigned char* data = stbi_load(filename.c_str(), &x, &y, &n, 3);
    ret.resize(x * y * 3);
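    // Split the interleaved RGB bytes into separate R, G and B planes (NCHW layout)
    // and apply the ImageNet mean/std normalization from the model documentation.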
    float* rptr = &ret[0 * x * y], * gptr = &ret[1 * x * y], * bptr = &ret[2 * x * y];
    for (int i = 0; i < x * y; i++) {
        *rptr++ = (float(data[3 * i + 0]) / 255 - 0.485) / 0.229;
        *gptr++ = (float(data[3 * i + 1]) / 255 - 0.456) / 0.224;
        *bptr++ = (float(data[3 * i + 2]) / 255 - 0.406) / 0.225;
    }
    stbi_image_free(data);
    return ret;
}


 
int main()
{ 
    const std::string model_s = "C:/Downloads/resnet18-v1-7.tar/resnet18-v1-7/resnet18-v1-7.onnx";
    std::basic_string<ORTCHAR_T> model = std::basic_string<ORTCHAR_T>(model_s.begin(), model_s.end());
    std::wcout << model.c_str() << std::endl; // ORTCHAR_T is wchar_t on Windows; std::cout would print a pointer
    // onnxruntime setup
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "batch-model-explorer");
    Ort::SessionOptions session_options;
    OrtSessionOptionsAppendExecutionProvider_DML(session_options, 0);
    Ort::Session session = Ort::Session(env, model.c_str(), session_options);

    std::cout << "number of model input:" << session.GetInputCount() << std::endl;
    std::cout << "number of model Output:" << session.GetOutputCount() << std::endl;

    Ort::AllocatorWithDefaultOptions allocator;

    auto val = session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape().size();
    auto val1 = session.GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape().size();
    std::cout << "number of model input:" << val << std::endl;
    std::cout << "number of model Output:" << val1 << std::endl;

    auto input_names = session.GetInputNameAllocated(0, allocator);
    auto output_names = session.GetOutputNameAllocated(0, allocator);
    std::cout << "Val:" << input_names << std::endl;
    std::cout << "Val:" << output_names << std::endl;


    std::vector<std::vector<int64_t>> input_node_dims(1);
    input_node_dims[0] = session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape();

    for (size_t j = 0; j < input_node_dims[0].size(); j++)
        printf("Input 0 : dim %zu=%jd\n", j, (intmax_t)input_node_dims[0][j]);
    
    int x = 0, y = 0;
    std::string sVal = "C:/Desktop/Image.png";
    std::vector<float> vecVal = load_image_and_preprocess(sVal, x, y);
    std::cout << vecVal.size() << std::endl;
    for (int j = 0; j < vecVal.size(); j++)
        std::cout << vecVal[j] << std::endl;
    
    constexpr size_t input_tensor_size = 1 * 3 * 224 * 224;
    auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);

    // The model may report a dynamic batch dimension (-1); pin it to 1 before creating the tensor.
    if (input_node_dims[0][0] < 0)
        input_node_dims[0][0] = 1;
    // Pass the inner shape vector's data(), not the vector-of-vectors itself,
    // otherwise no CreateTensor overload matches.
    auto input_tensor = Ort::Value::CreateTensor<float>(memory_info, vecVal.data(), input_tensor_size,
        input_node_dims[0].data(), input_node_dims[0].size());
    return 0;
}

Kindly help me out in understanding how to load and provide input to an ONNX model.

Ron
  • That entirely depends on the way the model is defined. Machine learning models tend to want a fixed-size image of reduced size (e.g. 200x200), often with color on separate planes (YUV? RGB?). You should study the `session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()` expression carefully along with any documentation or example code that came with the model – Botje Mar 16 '23 at 10:46
  • @Botje can you provide an example or some links where I can gain further knowledge about this? I tried to see the shape value but I couldn't – Ron Mar 16 '23 at 10:55
  • There is some documentation, [probably from the place where you got the model from](https://github.com/onnx/models/tree/main/vision/classification/resnet#inference). You should edit your question with a more detailed question once you understand the documentation and the steps you need to take. – Botje Mar 16 '23 at 10:59
  • @Botje the model I got is for Python, but I want to access it in C++, for which I can't find any documentation; correct me if I am wrong – Ron Mar 16 '23 at 11:29
  • @kiner_shah thanks for the link, but I was able to load the model. Now I am stuck at giving input to the model: as far as I understand I should provide a 2D image as input, but I am stuck on how to read the image and in which format I should pass the input to the model – Ron Mar 16 '23 at 11:54
  • The input format is basically the shape of the input layer and the data type (float vs uint32). So you have to read the image somehow and transform it as per the input format. You can read the image via OpenCV or some other library. Or maybe you can refer to that link for giving random input. – kiner_shah Mar 16 '23 at 11:57

1 Answer


The documentation that accompanies this model is clear about the input expectations:

Input

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N x 3 x H x W), where N is the batch size, and H and W are expected to be at least 224. The inference was done using a JPEG image.

Preprocessing

The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. The transformation should preferably happen at preprocessing.

You can use the venerable stb_image to load your file from disk:

#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

std::vector<float> load_image_and_preprocess(const std::string& filename, int& x, int& y) {
  std::vector<float> ret;
  int n;
  unsigned char *data = stbi_load(filename.c_str(), &x, &y, &n, 3);
  ret.resize(x * y * 3);
  float *rptr = &ret[0 * x * y], *gptr = &ret[1 * x * y], *bptr = &ret[2 * x * y];
  for (int i = 0; i < x * y; i++) {
    *rptr++ = (float(data[3 * i + 0])/255 - 0.485) / 0.229;
    *gptr++ = (float(data[3 * i + 1])/255 - 0.456) / 0.224;
    *bptr++ = (float(data[3 * i + 2])/255 - 0.406) / 0.225;
  }
  stbi_image_free(data);
  return ret;
}

This code separates the red, green and blue channels into planes, as your model expects. The result is a preprocessed vector<float> you can pass to Ort::Value::CreateTensor. From there you should probably crib from the squeezenet example.
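For completeness, here is a minimal sketch of that last step, modelled on the squeezenet example: wrap the preprocessed pixels in an `Ort::Value` and call `Session::Run`. It reuses the `session` and `allocator` from your code, assumes the image is already 224x224 (resize it beforehand otherwise), and `image.png` is a placeholder path:

#include <algorithm> // std::max_element
#include <array>

// ... after the session/allocator setup shown in the question ...
int x = 0, y = 0;
std::vector<float> pixels = load_image_and_preprocess("image.png", x, y);

std::array<int64_t, 4> shape{1, 3, y, x}; // NCHW, batch size 1

auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
    memory_info, pixels.data(), pixels.size(), shape.data(), shape.size());

// Run() wants arrays of input/output names
Ort::AllocatedStringPtr input_name = session.GetInputNameAllocated(0, allocator);
Ort::AllocatedStringPtr output_name = session.GetOutputNameAllocated(0, allocator);
const char* input_names[] = { input_name.get() };
const char* output_names[] = { output_name.get() };

// One Ort::Value per requested output; for this ResNet it is a 1x1000 float tensor of class scores
std::vector<Ort::Value> outputs = session.Run(Ort::RunOptions{nullptr},
    input_names, &input_tensor, 1, output_names, 1);

const float* scores = outputs[0].GetTensorData<float>();
size_t count = outputs[0].GetTensorTypeAndShapeInfo().GetElementCount();
size_t best = std::max_element(scores, scores + count) - scores;
std::cout << "predicted class index: " << best << std::endl;

The resulting index maps into the 1000 ImageNet classes; apply a softmax over the scores first if you need probabilities.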

Botje
  • Thanks for the code, can you please explain what values should be passed for x and y? I tried passing the height and width of the image but am still facing an issue – Ron Mar 17 '23 at 06:24
  • And I want to use DirectML as the provider; is there anything that can help me understand that? – Ron Mar 17 '23 at 06:42
  • `stbi_load` fills in `x` and `y` for you so you don't have to figure them out in advance. – Botje Mar 17 '23 at 08:34
  • Oh ok, got it, thanks for responding. I am facing another issue now: when I try to use `Ort::Value::CreateTensor` I get an error saying "no instance of overloaded function matches the argument list" – Ron Mar 17 '23 at 08:42
  • Impossible to answer without seeing the full code and error message. – Botje Mar 17 '23 at 08:44
  • Actually this question was automatically closed as a duplicate. I have requested that it be reopened, and because of this I couldn't add the code – Ron Mar 17 '23 at 09:04
  • Your follow-up question has diverged from the original question anyway. Ask a new question, include the code you have now, and reference this one. – Botje Mar 17 '23 at 09:05
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/252565/discussion-between-ron-and-botje). – Ron Mar 17 '23 at 09:06
  • I have replaced the code in my original question with the new one; kindly check it, as Stack Overflow has restricted me from asking any further questions – Ron Mar 17 '23 at 09:09