How do you run a half float ONNX model using ONNXRuntime C API?

Question

Since the C language doesn't have a half float implementation, how do you send data to the ONNXRuntime C API?

score 2 · Accepted Answer · answered Jun 03 '20 at 01:10

An example you may be able to follow is linked from this issue comment: https://github.com/microsoft/onnxruntime/issues/1173#issuecomment-501088662

You can create a buffer for the input data with CreateTensorAsOrtValue (using the float16 tensor element type), then get a pointer to that buffer inside the OrtValue with GetTensorMutableData and write the 16-bit values to it.

ONNXRuntime itself uses Eigen to convert a float into the 16-bit value, and you could use the same conversion to produce the values you write to that buffer.

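#include <cstdint>
#include <Eigen/Core>
// Convert a float to half-precision bits, rounding to nearest even (C++, requires Eigen).
uint16_t floatToHalf(float f) {
  return Eigen::half_impl::float_to_half_rtne(f).x;
}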

Alternatively, you could edit the model to add a Cast node from float32 to float16 at the input, so that the model takes float32 as input.

Scott McKay

Thank you very much for your suggestion. I will try to add a Cast node. Can you perhaps point me to an example or a tutorial on how to edit ONNX models? Thank you. – katrasnikj Jun 18 '20 at 13:50
Here's an example Python script that shows adding a Cast node to replace an original graph input: https://gist.github.com/skottmckay/32ea04dc0232c31d22a0eb80025e0dfe – Scott McKay Jun 25 '20 at 04:24

score 1 · Answer 2 · answered Apr 28 '20 at 12:51

"the C language doesn't have a half float implementation"

Yes, but there are language extensions, and you can also write your own library to handle the data.

For example, there is the _Float16 type, defined by ISO/IEC TS 18661-3:2015 and supported by gcc on some architectures.

You can also write or find a library that handles half-precision floating-point operations.
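As a rough illustration of what such a hand-written helper could look like, here is a minimal sketch in plain C. It assumes IEEE 754 binary32 floats and the standard binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits), and rounds to nearest even. The names float_to_half and half_to_float are hypothetical and are not part of ONNXRuntime or any library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: float -> binary16 bits, round to nearest even. */
uint16_t float_to_half(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);                 /* read the float's bit pattern */
    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t mag = x & 0x7FFFFFFFu;           /* magnitude without the sign */

    if (mag >= 0x7F800000u)                   /* Inf or NaN (keep NaN quiet) */
        return (uint16_t)(sign | 0x7C00u | (mag > 0x7F800000u ? 0x0200u : 0u));
    if (mag >= 0x47800000u)                   /* >= 65536: overflow to Inf */
        return (uint16_t)(sign | 0x7C00u);

    uint32_t half, rem, halfway;
    if (mag >= 0x38800000u) {                 /* normal half range (>= 2^-14) */
        uint32_t v = mag - 0x38000000u;       /* rebias exponent from 127 to 15 */
        half = v >> 13;
        rem = v & 0x1FFFu;
        halfway = 0x1000u;
    } else {                                  /* subnormal half or zero */
        uint32_t e = mag >> 23;
        if (e < 102u)                         /* below 2^-25: rounds to zero */
            return sign;
        uint32_t m = (mag & 0x007FFFFFu) | 0x00800000u; /* add implicit 1 */
        uint32_t shift = 126u - e;            /* 14..24 */
        half = m >> shift;
        rem = m & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    }
    /* Round to nearest, ties to even; a carry may bump the exponent. */
    if (rem > halfway || (rem == halfway && (half & 1u)))
        half++;
    return (uint16_t)(sign | half);
}

/* Hypothetical helper: binary16 bits -> float (exact, no rounding needed). */
float half_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t x;

    if (exp == 0x1Fu) {                       /* Inf or NaN */
        x = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0u) {                   /* normal number */
        x = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0u) {                   /* signed zero */
        x = sign;
    } else {                                  /* subnormal: normalize it */
        exp = 113u;
        while (!(man & 0x400u)) { man <<= 1; exp--; }
        x = sign | (exp << 23) | ((man & 0x3FFu) << 13);
    }
    float f;
    memcpy(&f, &x, sizeof f);
    return f;
}
```

With helpers like these, input values could be converted one by one and stored as uint16_t elements in the tensor buffer, and half_to_float could be applied to float16 outputs when reading them back.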

KamilCuk

Yes, I can try to use an implementation from GitHub. However, if ONNXRuntime is already using some kind of half float implementation, wouldn't it be simpler to use the one they are using? Due to poor ONNXRuntime documentation, I'm not able to figure out how to do this. – katrasnikj Apr 28 '20 at 13:00
