How to transform a byte[] (encoded as PNG or JPG) into a TensorFlow tensor

Question

I'm trying to use TensorFlowSharp in a Unity project.

The problem I'm facing is that you usually use a second graph to transform the input into a tensor. The functions used for this, DecodeJpeg and DecodePng, are not supported on Android, so how can I transform that input into a tensor?

private static void ConstructGraphToNormalizeImage(out TFGraph graph, out TFOutput input, out TFOutput output, TFDataType destinationDataType = TFDataType.Float)
{

    const int W = 224;
    const int H = 224;
    const float Mean = 117;
    const float Scale = 1;
    graph = new TFGraph();
    input = graph.Placeholder(TFDataType.String);
    output = graph.Cast(graph.Div(
        x: graph.Sub(
            x: graph.ResizeBilinear(
                images: graph.ExpandDims(
                    input: graph.Cast(
                        graph.DecodeJpeg(contents: input, channels: 3), DstT: TFDataType.Float),
                    dim: graph.Const(0, "make_batch")),
                size: graph.Const(new int[] { W, H }, "size")),
            y: graph.Const(Mean, "mean")),
        y: graph.Const(Scale, "scale")), destinationDataType);
}

Other solutions seem to produce inaccurate results.

Maybe somehow with a Mat object?

EDIT: I implemented something comparable in C# in Unity and it works partially, but it is not accurate at all. How do I find out the mean? I also could not find anything about the RGB order. I'm really new to this, so maybe I just overlooked it on tensorflow.org. I'm using a MobileNet trained with TensorFlow 1.4.

    public TFTensor transformInput(Color32[] pic, int texturewidth, int textureheight)
    {
        const int W = 224;
        const int H = 224;
        const float imageMean = 128;
        const float imageStd = 128;

        float[] floatValues = new float[texturewidth * textureheight * 3];

        for (int i = 0; i < pic.Length; ++i)
        {
            var color = pic[i];
            var index = i * 3;

            floatValues[index] = (color.r - imageMean) / imageStd;
            floatValues[index + 1] = (color.g - imageMean) / imageStd;
            floatValues[index + 2] = (color.b - imageMean) / imageStd;

        }
        TFShape shape = new TFShape(1, W, H, 3);
        return TFTensor.FromBuffer(shape, floatValues, 0, floatValues.Length);
    }

Hi, can you give some more information on the network? Did you train it yourself, or are you using a pre-trained network? I don't see any pre-trained MobileNets on tensorflow.org. – sladomic, Jan 10 '18 at 07:58
It is a net retrained with their script, essentially on the Flowers dataset (plus a few more pictures and a "nothing" category to reduce false positives a little). In Python it runs like a charm. When I test the same pictures in C# with that script, it is not accurate (it is not accurate in any script outside of Python). It has a 1.0 224 architecture. All TensorFlow instances seem to be some version of 1.4. – Robert Kaa Frank, Jan 10 '18 at 09:37

score 5 · Accepted Answer · answered Jan 06 '18 at 16:41

Instead of feeding the byte array and then using DecodeJpeg, you could feed the actual float array, which you can get like this:

https://github.com/tensorflow/tensorflow/blob/3f4662e7ca8724f760db4a5ea6e241c99e66e588/tensorflow/examples/android/src/org/tensorflow/demo/TensorFlowImageClassifier.java#L134

float[] floatValues = new float[inputSize * inputSize * 3];
int[] intValues = new int[inputSize * inputSize];

bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
}

Tensor<Float> input = Tensors.create(floatValues);

To use Tensors.create(), you need at least TensorFlow version 1.4.
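
The conversion loop can also be written without any Android dependency. The following is a minimal sketch, assuming the pixels arrive as packed ARGB ints (the layout Bitmap.getPixels produces); the class and method names are hypothetical and not part of any library. It only builds the flat float array. A float[] on its own carries no shape, so a model expecting a [1, height, width, 3] input will probably need the shape supplied explicitly, for example through Tensor.create(long[] shape, FloatBuffer data) in the TensorFlow Java API.

```java
/** Hypothetical helper: converts packed ARGB pixels to normalized RGB floats. */
final class PixelNormalizer {
    private PixelNormalizer() {}

    /**
     * Returns a flat array in R, G, B order per pixel, with each channel
     * mapped to (value - mean) / std. The alpha channel is ignored.
     */
    static float[] toNormalizedRgb(int[] argbPixels, float mean, float std) {
        float[] out = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; i++) {
            int p = argbPixels[i];
            out[i * 3]     = (((p >> 16) & 0xFF) - mean) / std; // red
            out[i * 3 + 1] = (((p >> 8) & 0xFF) - mean) / std;  // green
            out[i * 3 + 2] = ((p & 0xFF) - mean) / std;         // blue
        }
        return out;
    }
}
```

With mean = 128 and std = 128, each channel lands in roughly [-1, 1). The mean and std have to match the preprocessing that was used when the model was trained, otherwise the predictions degrade even though the tensor is well formed.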

First, thanks for the answer. I implemented something comparable, but the graph is pretty much predicting anything, without any real accuracy. It's a MobileNet, so I probably just have to find out which mean and std I need? – Robert Kaa Frank, Jan 08 '18 at 18:07
God, I had so many problems with DecodeJpeg; I can't believe that loading it manually was that simple. Thanks a lot!! – Gaspa79, Jul 28 '18 at 19:45

score 2 · Answer 2 · Andrei Ashikhmin · 2018-01-29T05:57:06.953

You probably didn't crop and scale your image before passing it to @sladomic's function.

I managed to hack together a sample of using TensorFlowSharp in Unity for object classification. It works with the model from the official TensorFlow Android example, and also with my self-trained MobileNet model. All you need to do is replace the model and set your mean and std, which in my case were both equal to 224.
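
To illustrate the crop-and-scale step, here is a minimal sketch, assuming a row-major array of packed pixels with the first row at the top. The class and method names are hypothetical, and the nearest-neighbor (floor) sampling is a simplification; it is not the method used in the linked sample, and a bilinear resize would be closer to what ResizeBilinear does in the question's graph.

```java
/** Hypothetical helper: center-crops a packed-pixel image to a square and scales it. */
final class CropAndScale {
    private CropAndScale() {}

    /**
     * Takes the largest centered square of a row-major image and resamples it
     * to size x size using nearest-neighbor (floor) lookup.
     */
    static int[] centerCropAndScale(int[] pixels, int width, int height, int size) {
        if (pixels.length != width * height) {
            throw new IllegalArgumentException("pixels.length must equal width * height");
        }
        int side = Math.min(width, height); // edge length of the centered square
        int left = (width - side) / 2;      // crop offset in x
        int top = (height - side) / 2;      // crop offset in y
        int[] out = new int[size * size];
        for (int y = 0; y < size; y++) {
            int srcY = top + (y * side) / size; // source row for this output row
            for (int x = 0; x < size; x++) {
                int srcX = left + (x * side) / size; // source column
                out[y * size + x] = pixels[srcY * width + srcX];
            }
        }
        return out;
    }
}
```

For a 224 x 224 network input, calling centerCropAndScale(pixels, width, height, 224) yields exactly 224 * 224 pixels, so the float array built from it matches the tensor shape regardless of the camera or texture resolution.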

edited Jan 29 '18 at 05:57

answered Jan 28 '18 at 17:45

Thanks, I will look into it. Later, I essentially cut out the middle of the picture at the resolution needed by the NN, but it still was not accurate. If I find the time, I will update my question with what I found out; maybe it works with what you did. – Robert Kaa Frank, Jan 28 '18 at 18:14
