I have a small problem with Java and machine learning. I have trained a model with Keras and it works as expected when I use Python to predict images.

The model was trained on inputs of shape [width, height, RGB].

But when I load an image in Java I get [RGB, width, height], so I tried to use .reshape() to change the shape. I am clearly messing something up there, because all predictions are wrong afterwards:

ResizeImageTransform rit = new ResizeImageTransform(128, 128);
NativeImageLoader loader = new NativeImageLoader(128, 128, 3, rit);

INDArray features = loader.asMatrix(f); // GIVES ME A SHAPE OF 1, 3, 128, 128
features = features.reshape(1, 128, 128, 3); // GIVES ME THE SHAPE 1, 128, 128, 3 AS NEEDED

INDArray[] prediction = model.output(features); // all predictions wrong

I am not a Java developer and I am trying to get along with the documentation, but I am clearly overlooking something here. Maybe someone can give me a tip about what I am doing wrong...

  • Could you print out the model summary? Normally we expect an NCHW image. Different frameworks will have different layouts. Keras uses NHWC by default. Edit: If there's a bug we're also happy to look at it over at https://github.com/deeplearning4j/deeplearning4j/issues - I highly doubt there is though. That workflow is fairly standard and has worked for many users for a while now. – Adam Gibson Mar 24 '23 at 10:39
  • Hi. Thanks - that helped me to find a solution... After adding features = features.permute(0, 2, 3, 1); the model works, but instead of flagging 195 images it only flags 136. So there is still an issue.... – Markus Bauer Mar 24 '23 at 11:05
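
For readers wondering why .permute() helps where .reshape() did not, here is a minimal sketch in plain Java. It does not use ND4J at all; the class and method names are made up for illustration, and a tensor is modelled as a flat row-major float array. A reshape only relabels the shape and leaves the element order alone, while a permute moves the data so that each pixel's channel values end up next to each other.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of ND4J.
class ReshapeVsPermute {
    // "Reshape" from [C, H, W] to [H, W, C]: the flat buffer is unchanged,
    // only the interpretation of the indices differs.
    static float[] reshapeChwToHwc(float[] chw) {
        return chw.clone();
    }

    // "Permute" from [C, H, W] to [H, W, C]: element (c, h, w) of the input
    // is moved to position (h, w, c) of the output.
    static float[] permuteChwToHwc(float[] chw, int channels, int height, int width) {
        float[] hwc = new float[chw.length];
        for (int c = 0; c < channels; c++) {
            for (int h = 0; h < height; h++) {
                for (int w = 0; w < width; w++) {
                    hwc[(h * width + w) * channels + c] = chw[(c * height + h) * width + w];
                }
            }
        }
        return hwc;
    }

    public static void main(String[] args) {
        // 3 channels, 1 x 2 pixels: channel 0 = {10, 11}, channel 1 = {20, 21}, channel 2 = {30, 31}
        float[] chw = {10, 11, 20, 21, 30, 31};
        System.out.println("reshape: " + Arrays.toString(reshapeChwToHwc(chw)));
        System.out.println("permute: " + Arrays.toString(permuteChwToHwc(chw, 3, 1, 2)));
    }
}
```

After the reshape, the first "pixel" is (10, 11, 20), which mixes two pixels of channel 0 with one of channel 1. After the permute it is (10, 20, 30), one value from each channel, which is what a [height, width, channel] model expects.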

4 Answers

So now I get at least 136 images of my test set flagged. The Python version flags 195 images...

So I guess the normalisation might be the problem. I train the model with:

train = ImageDataGenerator(rotation_range=5, horizontal_flip=True, vertical_flip=True, rescale=1/255)

And I use

X *= 1/255

before the prediction in the test script.

In Java I use

features = features.permute(0, 2, 3, 1);
DataNormalization scalar = new ImagePreProcessingScaler(0, 1);
scalar.transform(features);

but I am not sure whether the normalisation is the issue or whether I have screwed up the parameters for .permute()...

Any suggestions?
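
One way to rule the normalisation in or out is to compare the arithmetic directly. The sketch below is plain Java and does not call DL4J; it assumes that ImagePreProcessingScaler(0, 1) does a min-max scaling from the 0..255 pixel range to 0..1, which is my understanding but not something I have verified against the source. Under that assumption it should give the same numbers as multiplying by 1/255. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of DL4J.
class PixelScaling {
    // What rescale=1/255 and "X *= 1/255" do on the Python side.
    static float[] rescale(float[] pixels, float factor) {
        float[] out = new float[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] * factor;
        }
        return out;
    }

    // Min-max scaling from [0, maxPixel] to [lo, hi] (assumed scaler behaviour).
    static float[] minMaxScale(float[] pixels, float maxPixel, float lo, float hi) {
        float[] out = new float[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] / maxPixel * (hi - lo) + lo;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] pixel = {236f, 226f, 224f}; // example 8-bit values
        System.out.println(Arrays.toString(rescale(pixel, 1f / 255f)));
        System.out.println(Arrays.toString(minMaxScale(pixel, 255f, 0f, 1f)));
    }
}
```

If both lines print the same values up to rounding, the scaling step is unlikely to be the cause, and the layout or channel order of the input is the next thing to check.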

  • Your generator looks fine and the only scaling you should need to do is the rescaling for inference. It's kind of hard to know what else the issue is without looking at this closer. I also don't know what your *= 1/255 is supposed to be. The normalization should already be done with rescale. Could you please post your full pipeline? – Adam Gibson Mar 24 '23 at 11:31
  • see my 2nd answer. – Markus Bauer Mar 24 '23 at 12:10
  • What versions are you using? I'm trying to reproduce your issue here. – Adam Gibson Mar 24 '23 at 12:59
  • I need to get to bed soon but please feel free to file an issue: https://github.com/deeplearning4j/deeplearning4j/issues if there is something else going on. Please ensure you are upgraded to the latest version and give both your keras and dl4j versions. Thanks! – Adam Gibson Mar 24 '23 at 13:19
  • While I'm at it I'd preferably need to see the dataset. If you could give me a minibatch from each one or a reproducer dataset I can try to pin down the exact issue and see if there's anything with compatibility. I'm thinking it could either be your version or maybe the way the updater/optimizer is imported. – Adam Gibson Mar 24 '23 at 13:29
  • I am sorry - I can't hand out anything from the dataset. This is an image recognition plugin for forensics software... So that dataset contains a lot of stuff I could not hand out legally. I just see the order of values is off... – Markus Bauer Mar 24 '23 at 14:07
  • Oh I was thinking of Nd4j.rand for image sizes. I don't want your secret sauce just something that looks similar :) It looks like you solved your issue. Sorry about that. – Adam Gibson Mar 24 '23 at 22:03

This is how the model is trained:

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator 
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.applications import MobileNetV2

from tensorflow.keras.applications import ResNet152V2

# GENERAL WIDTH AND HEIGHT FOR THE IMAGES
WIDTH = 128
HEIGHT = 128

train = ImageDataGenerator(rotation_range=5, horizontal_flip=True, vertical_flip=True, rescale=1/255) 
valid = ImageDataGenerator(rescale=1/255)

train_set = train.flow_from_directory('images_train/', target_size=(WIDTH,HEIGHT), batch_size=64, class_mode='categorical')
test_set  = valid.flow_from_directory('images_test/', target_size=(WIDTH,HEIGHT), batch_size=64, class_mode='categorical')

resnet = ResNet50V2(include_top=False, weights='imagenet', input_shape=(WIDTH,HEIGHT,3))  
    
for layer in resnet.layers:
    layer.trainable = False

x = tf.keras.layers.Flatten()(resnet.output)
x = tf.keras.layers.Dense(512, activation='relu')(x)
n_classes = len(train_set.class_indices)
predictions = tf.keras.layers.Dense(n_classes, activation='softmax')(x)

model = tf.keras.Model(inputs=resnet.input, outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])

hist = model.fit(train_set, epochs=20, validation_data=test_set)

model.save('resnet50v2.h5')

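This is the code I use to test images in Python:

import os

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the model saved by the training script above
model = tf.keras.models.load_model('resnet50v2.h5')

# base_path must point to the root folder of the images to scan

ctr = 0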
for root, dirs, files in os.walk(base_path):
    for name in files:
        image_path = os.path.join(root, name)
        
        tmp = image_path.lower().split(".")
        if tmp[-1] in ["jpg", "jpeg", "png", "bmp"]:
            orig_image = Image.open(image_path)
            if orig_image.mode != "RGB":
                orig_image = orig_image.convert("RGB")
                
            image = orig_image.resize((128, 128))
            
            X = []
            X.append(np.array(image.getdata()).reshape((128,128,3)))
            X = np.array(X).astype('float64')
            X *= 1/255

            # PREDICT AND WRITE REPORT
            pred = model.predict(X)
            pred = np.rint(pred).astype("int32")

            if pred[0][1] != 1:
                ctr += 1
                print(f"{ctr} :: {image_path} == {pred[0]}")

And this is the code I use to test images in Java:

int ctr = 0;
for (File f : listOfFiles) {
    if (f.isFile()) {
        ResizeImageTransform rit = new ResizeImageTransform(128, 128);
        NativeImageLoader loader = new NativeImageLoader(128, 128, 3, rit);
        INDArray features = null;
        try{
            features = loader.asMatrix(f); // GIVES ME A SHAPE OF 1, 3, 128, 128
        } 
        catch(IOException ex){
            continue;
        }
        features = features.permute(0, 2, 3, 1);
        DataNormalization scalar = new ImagePreProcessingScaler(0, 1);
        scalar.transform(features);

        INDArray[] prediction = model.output(features);

        // Get Class
        double pred[] = prediction[0].toDoubleVector();
        int predClass = 0;
        for(int i = 0; i < pred.length; i++){
            predClass = pred[i] > pred[predClass] ? i : predClass;
        }

        if(predClass != 1){
            ctr++;
            System.out.println(f.getName());
            System.out.println(ctr + ") PORN FOUND :: " + predClass);
        }
    }
}
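
A side note on comparing the two counts: the Python script rounds the probabilities and checks whether class 1 rounds to 1, while the Java code takes the class with the highest probability. The sketch below puts both rules side by side in plain Java (hypothetical class and method names, made-up probabilities). With exactly two classes they should agree; with three or more classes they can differ, so it may be worth using the same rule on both sides before comparing numbers.

```java
// Hypothetical helper class for comparing the two decision rules.
class DecisionRules {
    // Rule used in the Python script: flag unless class 1 rounds to 1.
    static boolean flaggedByRounding(double[] probs) {
        return Math.rint(probs[1]) != 1.0;
    }

    // Rule used in the Java code: flag unless class 1 has the highest probability.
    static boolean flaggedByArgmax(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) {
                best = i;
            }
        }
        return best != 1;
    }

    public static void main(String[] args) {
        double[] threeClasses = {0.3, 0.4, 0.3}; // made-up softmax output
        System.out.println("rounding rule flags: " + flaggedByRounding(threeClasses));
        System.out.println("argmax rule flags:   " + flaggedByArgmax(threeClasses));
    }
}
```

For the made-up three-class output, class 1 wins the argmax but its probability of 0.4 rounds to 0, so only the rounding rule flags the image.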

Python

DSC_3767.jpg
    A          B          C
[[[[0.9254902  0.88627451 0.87843137]
   [0.9254902  0.88627451 0.87843137]
   [0.9254902  0.88627451 0.87843137]
   ...

Java

DSC_3767.jpg
        C          B          A
[[[[    0.8784,    0.8863,    0.9255], 
   [    0.8784,    0.8863,    0.9255], 
   [    0.8784,    0.8863,    0.9255],
   ...

All I have to do is swap C and A and the model will work fine. I just don't see how to do it.

  • Could you clarify what you did? You had to permute the final output? It might have something to do with the image layout again. If you get time I'd love to take a deeper look into this to make sure the developer experience is better. Please do file an issue so we can discuss it. That way if you use this for other models we make sure you have a good experience. As mentioned above no secret sauce :) just Nd4j.rand is fine. – Adam Gibson Mar 24 '23 at 22:04
  • The array under Python is the way the model was trained. That is what I get after reading the images in Python. After features = features.permute(0, 2, 3, 1); Java gives me the order you see above. So A and C are the wrong way around here. If I swap them I would input exactly the same data into the model in Python and Java. But I don't see how to swap A and C or how to change the permute() parameters to get the right order... – Markus Bauer Mar 25 '23 at 06:22

I can fix the issue with

for (int y = 0; y < 128; y++) {
    for (int x = 0; x < 128; x++) {
        double a = features.getDouble(0,y,x,0);
        double b = features.getDouble(0,y,x,2);
        features.putScalar(new int[] {0,y,x,0}, b);
        features.putScalar(new int[] {0,y,x,2}, a);
    }
}

... but there must be a nicer / better solution.
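
The swapped A and C columns look like a BGR versus RGB difference (OpenCV-based loaders commonly return BGR, and as far as I know NativeImageLoader is one of them). Changing the permute() parameters cannot fix that, because permute only reorders whole axes and not the entries along the channel axis. A tidier alternative to the per-pixel loop is to reverse the channel planes in the same pass that changes the layout. The sketch below shows the idea on a flat float array in plain Java; the class and method names are hypothetical and no ND4J calls are used. The same idea should carry over to ND4J by reordering the channel axis before the permute, but I have not checked which call is the cleanest there.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of ND4J or DataVec.
class BgrPlanarToRgbInterleaved {
    // Converts a planar [3, H, W] buffer with planes in B, G, R order into
    // an interleaved [H, W, 3] buffer with channels in R, G, B order.
    static float[] convert(float[] bgrChw, int height, int width) {
        final int channels = 3;
        float[] rgbHwc = new float[bgrChw.length];
        for (int c = 0; c < channels; c++) {
            int srcPlane = channels - 1 - c; // R comes from plane 2, G from 1, B from 0
            for (int h = 0; h < height; h++) {
                for (int w = 0; w < width; w++) {
                    rgbHwc[(h * width + w) * channels + c] =
                            bgrChw[(srcPlane * height + h) * width + w];
                }
            }
        }
        return rgbHwc;
    }

    public static void main(String[] args) {
        // 1 x 2 image: B plane = {0.8784, 0.1}, G plane = {0.8863, 0.2}, R plane = {0.9255, 0.3}
        float[] bgrChw = {0.8784f, 0.1f, 0.8863f, 0.2f, 0.9255f, 0.3f};
        System.out.println(Arrays.toString(convert(bgrChw, 1, 2)));
    }
}
```

For the example input, the first output pixel comes out as (0.9255, 0.8863, 0.8784), which matches the A, B, C order of the Python array.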