I am trying to preprocess the images in my dataset. The dataset is divided into 346 folders, organized as described below.

What I want to do is loop over the 346 folders, enter each one, and process the images inside it: convert each image to grayscale, resize it, and normalize it. These three steps should be applied to my whole dataset. I also want to keep the same folder and file names, so nothing is renamed when I run the preprocessing.

The folder and file names are as follows:

video_0001/ 00000.png, 00001.png, ...
video_0002/ 00000.png, 00001.png, ...

The number of files varies from folder to folder, and the last folder is video_0346.

P.S. When I try to normalize the images by dividing by 255, I get black images.

I am still new to Python and would appreciate your help. Here is what I have been able to accomplish so far:


Img_height = 512
Img_width = 512
srcdir = "C:\\Users\\Desktop\\_dataset"

for subdir, dirs, files in os.walk(srcdir): 
     for i in range(346):
          for file in os.listdir(subdir):
               print(file )
               img = cv2.imread(os.path.join(srcdir,file))
               print("image", img)
               gray_img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
               Image_sample_resized = resize(gray_img, (height, width))
               plt.imshow( Image_sample_resized, cmap = "gray")
               
     i = i+1 
Lolo

1 Answer

I am still not sure how you want to save the formatted images, but the code below saves them to a new directory that you specify, so you can batch them during training if you want.

import os

import cv2
import matplotlib.image as mpimg


def resize_image(image, size=(512, 512)):
    """
    Resize an image to a fixed size
    """
    return cv2.resize(image, size)


def convert_to_grayscale(image):
    """
    Convert an image to grayscale
    """
    return cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)


def normalize_image(image):
    """
    Normalize an image
    """
    return cv2.normalize(image, None, 0, 255, cv2.NORM_MINMAX)


def save_image(image, directory, filename):
    """
    Save an image to a new directory
    """
    cv2.imwrite(os.path.join(directory, filename), image)


def main():
    """
    Main function
    """
    # Get the directory to process
    directory = input("Enter the directory to process: ")

    # Get the directory to save the images
    save_directory = input("Enter the directory to save the images: ")

    # Size to resize the images
    img_width, img_height = 512, 512

    # Iterate through the directory
    for root, _, files in os.walk(directory):
        for file in files:
            # Get the filepath
            filepath = os.path.join(root, file)

            # Get the filename
            filename = os.path.basename(filepath)

            # Get the extension
            extension = os.path.splitext(filepath)[1]

            # Check if the file is an image
            if extension in [".jpg", ".jpeg", ".png"]:
                # Load the image
                image = mpimg.imread(filepath)

                # Resize the image
                image = resize_image(image, (img_width, img_height))

                # Convert the image to grayscale
                image = convert_to_grayscale(image)

                # Normalize the image
                image = normalize_image(image)

                # Save the image
                save_image(image, save_directory, filename)


if __name__ == "__main__":
    main()
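
On the P.S. in the question: note that normalize_image above stretches values to the 0..255 range rather than scaling them to [0, 1]. If you divide by 255 instead, the pixels become floats between 0 and 1, and one likely reason for the black images is that those floats are then written or viewed as 8-bit data, where every value ends up as 0 or 1. The sketch below illustrates this with a small NumPy array standing in for an image. The helper names to_unit_range and to_uint8 are made up for this example, and the plain cast is only a stand-in for what an 8-bit image writer does.

```python
import numpy as np


def to_unit_range(img_u8):
    """Scale 8-bit pixel values (0..255) to float32 values in [0.0, 1.0]."""
    return img_u8.astype(np.float32) / 255.0


def to_uint8(img_unit):
    """Map floats in [0.0, 1.0] back to 8-bit values for saving or viewing."""
    return np.clip(np.rint(img_unit * 255.0), 0, 255).astype(np.uint8)


# A tiny 2x2 array used as a stand-in for a grayscale image.
gray = np.array([[0, 64], [128, 255]], dtype=np.uint8)

unit = to_unit_range(gray)          # floats in [0, 1], suitable as model input
truncated = unit.astype(np.uint8)   # naive 8-bit cast: only 0s and 1s remain
restored = to_uint8(unit)           # scaled back up before saving as PNG

print(unit.min(), unit.max())
print(truncated.max())
print(np.array_equal(restored, gray))
```

In practice this usually means keeping the saved PNG files as 8-bit and dividing by 255 when the images are loaded for training, or saving the float arrays in a format that supports them.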
PCDSandwichMan
  • Thank you for your help. My main issue is that when I run the code, it only loops over one folder, video_0001, saves those files to the save folder, and does not loop over the other folders. Also, when checking whether the images were preprocessed, I find that the .shape attribute gives (512, 512, 3), even though the images in the folder are grayscale. When printing an image to check the normalization, I find that it was not applied either: I get a NumPy array with values such as [[[210 210 210] [210 210 210] [208 208 208] ..... – Lolo May 08 '22 at 00:33
  • What I want to do is: go to the dataset folder > video_0001 > process all images (I don't know how many files are in the folder, so I assume I need to use the len function). Once video_0001 is preprocessed, go to the next subfolder, video_0002, and do the same until all folders are done. In my save folder I want the same folders as in the source, with the same naming convention, so the save folder should contain the preprocessed images in a folder called video_0001, and so on. Any thoughts on this problem, @PCDSandwichMan? – Lolo May 08 '22 at 00:43
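
The folder-by-folder processing described in the comments can be sketched roughly as follows. The code in the answer saves every file into a single directory using only its base name, so same-named files such as 00000.png from later folders would overwrite earlier ones, which may be why only one folder seems to be processed. This sketch rebuilds each file's relative path under the destination instead, so no file count or len call is needed. The names mirror_tree and preprocess_file are hypothetical, and the 512x512 size is taken from the question. It reads with cv2.IMREAD_GRAYSCALE; reading a saved file back with the default cv2.imread flag returns three channels even for a grayscale file, which would explain the (512, 512, 3) shape.

```python
from pathlib import Path


def preprocess_file(src, dst, size=(512, 512)):
    """Read one image as grayscale, resize it, and write it to dst."""
    import cv2  # imported here so the folder logic below works without OpenCV

    img = cv2.imread(str(src), cv2.IMREAD_GRAYSCALE)  # 2-D array, or None
    if img is None:
        return False  # unreadable file: skip it
    img = cv2.resize(img, size, interpolation=cv2.INTER_AREA)
    return bool(cv2.imwrite(str(dst), img))


def mirror_tree(src_root, dst_root, process=preprocess_file,
                exts=(".png", ".jpg", ".jpeg")):
    """Apply `process` to every image under src_root, keeping the same
    sub-folder and file names under dst_root. Returns the number of
    files processed successfully."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    count = 0
    for src in sorted(src_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in exts:
            continue
        # e.g. video_0001/00000.png stays video_0001/00000.png
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        if process(src, dst):
            count += 1
    return count


# Example call (the destination path is a placeholder):
# mirror_tree(r"C:\Users\Desktop\_dataset", r"C:\path\to\preprocessed")
```

The saved files stay 8-bit PNGs here. Scaling to [0, 1] is left for load time, since PNG cannot store floating-point pixel values.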