image preprocess function for image_dataset_from_directory

Question

With ImageDataGenerator, I have been using the function below to preprocess images, passing it through the 'preprocessing' keyword in .flow_from_dataframe().

However, I am now trying to use image_dataset_from_directory, which has no parameter for passing in a preprocessing function like this.

I have tried applying preprocess_image() with .map() after the dataset is created by image_dataset_from_directory, but that does not work either.

Please could anyone advise?

Many thanks, Tony

import os

import cv2
import numpy as np

train_Gen = dataGen.flow_from_dataframe(
    df, 
    x_col='id_code',
    y_col='diagnosis',
    directory=os.path.join(data_dir, 'train_images'),
    batch_size=BATCH_SIZE, 
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    subset='training',
    seed=123,
    class_mode='categorical',
    preprocessing=preprocess_image,
)

def crop_image_from_gray(img, tol=7):
    """
    Applies a mask to the original image and
    returns the cropped image, keeping all
    3 channels for RGB input
    
    :param img: A NumPy Array that will be cropped
    :param tol: The tolerance used for masking
    
    :return: A NumPy array containing the cropped image
    """
    # Grayscale image (two dimensions): mask it directly
    if img.ndim == 2:
        mask = img > tol
        return img[np.ix_(mask.any(1), mask.any(0))]
    # Regular RGB image (three dimensions)
    elif img.ndim == 3:
        gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
        mask = gray_img > tol

        check_shape = img[:, :, 0][np.ix_(mask.any(1), mask.any(0))].shape[0]
        if check_shape == 0:  # image is so dark that everything would be cropped out
            return img  # return the original image
        else:
            img1 = img[:, :, 0][np.ix_(mask.any(1), mask.any(0))]
            img2 = img[:, :, 1][np.ix_(mask.any(1), mask.any(0))]
            img3 = img[:, :, 2][np.ix_(mask.any(1), mask.any(0))]
            img = np.stack([img1, img2, img3], axis=-1)
        return img

def preprocess_image(image, sigmaX=10):
    """
    The whole preprocessing pipeline:
    1. Convert the image from BGR to RGB
    2. Crop away the dark border using a mask
    3. Resize the image to the desired size
    4. Subtract a Gaussian-blurred copy to enhance local contrast

    :param image: A NumPy array containing the image to preprocess
    :param sigmaX: Standard deviation passed to cv2.GaussianBlur

    :return: A NumPy array containing the preprocessed image
    """
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = crop_image_from_gray(image)
    image = cv2.resize(image, (IMG_WIDTH, IMG_HEIGHT))
    image = cv2.addWeighted(image, 4, cv2.GaussianBlur(image, (0, 0), sigmaX), -4, 128)
    return image
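
To make the masking step easier to follow, here is a minimal NumPy-only sketch of the same cropping idea on a synthetic array. The name crop_dark_border is hypothetical, and the grayscale conversion is approximated with a channel mean instead of cv2.cvtColor, so it is an illustration rather than a drop-in replacement for crop_image_from_gray.

```python
import numpy as np


def crop_dark_border(img, tol=7):
    """Crop rows and columns whose pixels are all at or below `tol`.

    Hypothetical NumPy-only variant: grayscale is approximated by the
    channel mean (an assumption) rather than cv2.cvtColor.
    """
    gray = img.mean(axis=-1) if img.ndim == 3 else img
    mask = gray > tol
    rows, cols = mask.any(axis=1), mask.any(axis=0)
    if not rows.any():  # image is entirely dark, so keep it unchanged
        return img
    # Boolean indexing on the first two axes leaves any channel axis intact.
    return img[rows][:, cols]


# 6x6 black RGB image with a bright 2x3 patch in the middle.
img = np.zeros((6, 6, 3), dtype=np.uint8)
img[2:4, 1:4] = 200
cropped = crop_dark_border(img)
print(cropped.shape)  # (2, 3, 3)
```

Only the rows and columns that contain at least one pixel above the tolerance survive, which is what the np.ix_ indexing in crop_image_from_gray does channel by channel.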

As the code snippet is part of a larger project, and space here is limited, I can only copy/paste these functions for visibility. - Tony, Sep 05 '22 at 13:50
I guess what I am asking for is a generic way to work with preprocess_image(), which takes a NumPy array as input and works with ImageDataGenerator.flow_from_dataframe. For image_dataset_from_directory, however, I seem to have to use .map(), which does not take a NumPy array as input. This is the part I don't really know how to handle. - Tony, Sep 05 '22 at 13:53
Please provide enough code so others can better understand or reproduce the problem. - Community, Sep 06 '22 at 06:43
What if you try to use a custom Dataset function? That should do the job; you will just have to write the get-item function manually. - Azhan Mohammed, Sep 09 '22 at 06:45
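
One possible way to bridge a NumPy-based function and .map() is to wrap it with tf.numpy_function. The sketch below assumes TensorFlow 2.x; preprocess_numpy is a hypothetical stand-in for preprocess_image(), the image sizes are made up, and a synthetic batched dataset replaces the output of image_dataset_from_directory, which yields batches of float32 images rather than single images.

```python
import numpy as np
import tensorflow as tf

IMG_HEIGHT, IMG_WIDTH = 64, 64  # hypothetical sizes, for illustration only


def preprocess_numpy(image):
    """Stand-in for preprocess_image(): takes and returns a NumPy array."""
    # Placeholder transform (inverts pixel values); the real OpenCV
    # pipeline would go here instead.
    return (255.0 - image).astype(np.float32)


def preprocess_batch(images, labels):
    """Apply the NumPy function to each image of a batch inside tf.data."""

    def _numpy_batch(batch):
        # `batch` arrives as a NumPy array of shape (batch, height, width, 3).
        return np.stack([preprocess_numpy(img) for img in batch]).astype(np.float32)

    out = tf.numpy_function(_numpy_batch, [images], tf.float32)
    # tf.numpy_function drops static shape information, so restore it.
    out.set_shape([None, IMG_HEIGHT, IMG_WIDTH, 3])
    return out, labels


# Synthetic batched dataset standing in for image_dataset_from_directory.
images = np.random.randint(0, 256, size=(8, IMG_HEIGHT, IMG_WIDTH, 3)).astype(np.float32)
labels = np.zeros((8,), dtype=np.int32)
ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)
ds = ds.map(preprocess_batch, num_parallel_calls=tf.data.AUTOTUNE)
```

Because the wrapped function runs as ordinary Python, it is not serialized with the graph and may be slower than native TensorFlow ops, so this is a convenience trade-off rather than the only option.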
