
I'm trying to use TensorFlowSharp in a Unity project.

The problem I'm facing is that you usually use a second graph to transform the input image into a tensor. The ops used for that, DecodeJpeg and DecodePng, are not supported on Android, so how can I transform the input into a tensor?

private static void ConstructGraphToNormalizeImage(out TFGraph graph, out TFOutput input, out TFOutput output, TFDataType destinationDataType = TFDataType.Float)
{

    const int W = 224;
    const int H = 224;
    const float Mean = 117;
    const float Scale = 1;
    graph = new TFGraph();
    input = graph.Placeholder(TFDataType.String);
    output = graph.Cast(graph.Div(
        x: graph.Sub(
            x: graph.ResizeBilinear(
                images: graph.ExpandDims(
                    input: graph.Cast(
                        graph.DecodeJpeg(contents: input, channels: 3), DstT: TFDataType.Float),
                    dim: graph.Const(0, "make_batch")),
                size: graph.Const(new int[] { W, H }, "size")),
            y: graph.Const(Mean, "mean")),
        y: graph.Const(Scale, "scale")), destinationDataType);
}

Other solutions seem to produce inaccurate results.

Maybe somehow with a Mat object?

EDIT: I implemented something comparable in C# in Unity and it works partially, but it is not accurate at all. How do I find out the mean? I also could not find anything about the RGB order (on tensorflow.org). I'm really new to this, so maybe I have just overlooked it. I am using a MobileNet trained with TensorFlow 1.4.

public TFTensor transformInput(Color32[] pic, int texturewidth, int textureheight)
{
    const int W = 224;
    const int H = 224;
    const float imageMean = 128;
    const float imageStd = 128;

    // One float per channel (R, G, B) for every pixel of the texture.
    float[] floatValues = new float[texturewidth * textureheight * 3];

    for (int i = 0; i < pic.Length; ++i)
    {
        var color = pic[i];
        var index = i * 3;

        // Map each 0..255 channel value to roughly -1..1.
        floatValues[index] = (color.r - imageMean) / imageStd;
        floatValues[index + 1] = (color.g - imageMean) / imageStd;
        floatValues[index + 2] = (color.b - imageMean) / imageStd;
    }

    // Note: the shape assumes the texture is already W x H (224 x 224).
    TFShape shape = new TFShape(1, W, H, 3);
    return TFTensor.FromBuffer(shape, floatValues, 0, floatValues.Length);
}
Mike Wise
  • Hi, can you give some more information on the network? Have you trained it yourself or do you use a pre-trained network? I don't see any pre-trained MobileNets on tensorflow.org. – sladomic Jan 10 '18 at 07:58
  • It is a retrained net, made with their retraining script and essentially the Flowers dataset (plus a few more pictures and a "nothing" category to reduce false positives a little). In Python it runs like a charm. When I test the same pictures in C# with that script, it is not accurate (it is not accurate in any script outside of Python). It has the 1.0 224 architecture. All TensorFlow instances seem to be some version of 1.4. – Robert Kaa Frank Jan 10 '18 at 09:37

2 Answers


Instead of feeding the byte array and then using DecodeJpeg, you could feed the actual float array, which you can get like this:

https://github.com/tensorflow/tensorflow/blob/3f4662e7ca8724f760db4a5ea6e241c99e66e588/tensorflow/examples/android/src/org/tensorflow/demo/TensorFlowImageClassifier.java#L134

float[] floatValues = new float[inputSize * inputSize * 3];
int[] intValues = new int[inputSize * inputSize];

bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
      final int val = intValues[i];
      floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
      floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
      floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
}

Tensor<Float> input = Tensors.create(floatValues);

To use "Tensors.create()", you need at least TensorFlow version 1.4.
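
The loop above can also be tried outside Android. What follows is a minimal, self-contained sketch of the same unpacking and normalization step. The class name `PixelNormalizer`, the method name `toNormalizedRgb`, and passing mean and std as parameters are assumptions made for illustration; the right mean and std depend on how the model was trained.

```java
// Illustrative sketch; PixelNormalizer is a hypothetical helper, not a TensorFlow API.
final class PixelNormalizer {

    // Converts packed ARGB pixels (the layout Bitmap.getPixels produces) into
    // interleaved, normalized R, G, B floats. The alpha byte is ignored.
    static float[] toNormalizedRgb(int[] argbPixels, float mean, float std) {
        float[] out = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; ++i) {
            int val = argbPixels[i];
            out[i * 3]     = (((val >> 16) & 0xFF) - mean) / std; // red
            out[i * 3 + 1] = (((val >> 8) & 0xFF) - mean) / std;  // green
            out[i * 3 + 2] = ((val & 0xFF) - mean) / std;         // blue
        }
        return out;
    }

    public static void main(String[] args) {
        // One opaque pixel with R=16, G=32, B=48, normalized with mean = std = 128.
        float[] rgb = toNormalizedRgb(new int[] {0xFF102030}, 128f, 128f);
        System.out.println(rgb[0] + " " + rgb[1] + " " + rgb[2]);
    }
}
```

With mean and std both set to 128, channel values 0..255 map to the range -1 to about 0.99, and the output array holds the channels in R, G, B order for each pixel.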

sladomic
  • First, thanks for the answer. I implemented something comparable, but the graph predicts pretty much anything, without real accuracy. It's a MobileNet, so I probably just have to find out which mean and std I need? – Robert Kaa Frank Jan 08 '18 at 18:07
  • God I had so many problems with decodejpeg, I can't believe that loading it manually was that simple. Thanks a lot!! – Gaspa79 Jul 28 '18 at 19:45

You probably didn't crop and scale your image before passing it to @sladomic's function.

I managed to hack together a sample that uses TensorFlowSharp in Unity for object classification. It works with the model from the official TensorFlow Android example, and also with my self-trained MobileNet model. All you need to do is replace the model and set your mean and std, which in my case were both equal to 224.
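
The crop-and-scale step mentioned above could look roughly like the sketch below. It is not the code from the sample; it is a hedged illustration that takes the largest centered square of a row-major pixel array and resamples it to the network's input size with nearest-neighbor sampling. The class name `CenterCropResize` and its method are hypothetical, and a bilinear resize may match the Python pipeline more closely.

```java
// Illustrative sketch; CenterCropResize is a hypothetical helper.
final class CenterCropResize {

    // Crops the largest centered square from a row-major image and scales it
    // to target x target pixels using nearest-neighbor sampling.
    static int[] centerCropAndResize(int[] pixels, int width, int height, int target) {
        int side = Math.min(width, height);
        int x0 = (width - side) / 2;   // left edge of the crop
        int y0 = (height - side) / 2;  // top edge of the crop
        int[] out = new int[target * target];
        for (int ty = 0; ty < target; ++ty) {
            int sy = y0 + ty * side / target;      // source row
            for (int tx = 0; tx < target; ++tx) {
                int sx = x0 + tx * side / target;  // source column
                out[ty * target + tx] = pixels[sy * width + sx];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A 4x2 image whose pixel values equal their index, reduced to 2x2.
        int[] src = {0, 1, 2, 3, 4, 5, 6, 7};
        int[] dst = centerCropAndResize(src, 4, 2, 2);
        System.out.println(java.util.Arrays.toString(dst));
    }
}
```

The result can then be passed to a normalization loop like the one in @sladomic's answer. If the pixels come from Unity's Texture2D.GetPixels32, note that Unity returns rows from bottom to top, so a vertical flip may also be needed before feeding the network.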

Andrei Ashikhmin
  • Thanks, I will look into it. I (later) essentially cut out the middle of the picture at the resolution needed by the NN, but it still was not accurate. If I find the time, I will update my question with what I found out, and maybe it then works with what you did. – Robert Kaa Frank Jan 28 '18 at 18:14