I am currently using FaceNet to build a face detection and recognition application. The first part takes images from the webcam and detects the person's face using the MTCNN model, then stores the face images in a folder. I then decided to use ImageDataGenerator on that folder to create more images for the dataset, but datagen produces the resulting images in grayscale. Here's the code for it:

datagen = ImageDataGenerator(rotation_range=40, width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2, zoom_range=0.2, horizontal_flip=True, fill_mode='nearest', rescale=False)

This is the flow function:

for train_img in train_images:
    img = image.img_to_array(train_img)  # convert image to a NumPy array
    img = img.reshape((1,) + img.shape)  # add a leading batch dimension
    i = 0
    datagen.fit(img)
    for batch in datagen.flow(img, save_format='png', save_to_dir=train_path):  # this loop runs forever until we break, saving augmented images to train_path
        i += 1
        if i >= 10:  # stop after 10 augmentations of each image
            break

Please help.

  • Does `img` have three channels after `img = img.reshape((1,) + img.shape)` or `datagen.fit(img)`? It sounds like the channels are being dropped for some reason. Do you also have to use `flow()` if the image isn't changing before then? If not, couldn't you use something like OpenCV to save the images? – Djinn Jul 01 '22 at 05:13
  • It has two channels in the beginning and three channels after reshape. – Ishan Sharma Jul 01 '22 at 06:50
  • Two channels? It should be either one channel (gray), three channels (RGB), or four channels (CMYK). If it has two channels at the beginning, are you sure the images aren't already grayscale when loaded, or being loaded as grayscale? If the two channels are actually grayscale + alpha, alpha is most likely being discarded on load. – Djinn Jul 01 '22 at 21:17
  • At the start the array now has shape (160, 160, 1), and then (1, 160, 160, 1) after the reshape. Sorry for the late reply, I had food poisoning for two days. I also wanted to ask whether the face detection code would work on normal RGB images while I rotate my face, or whether data augmentation is the best way. – Ishan Sharma Jul 04 '22 at 10:12
  • I just don't see the purpose of using `reshape()` here. Unless you're changing the size of the image, just change the height and width. For a 2D image represented by an array with four dimensions, the first (leftmost) dimension is the number of images. It shouldn't explicitly be 1 per image, but rather implied. As in, it should just be (160, 160, 1). As for rotations, it depends on the face detection method used; some don't care about rotations. – Djinn Jul 04 '22 at 15:27
  • If the images are loaded as (160, 160, 1), then they're already grayscale or are being read as grayscale. – Djinn Jul 04 '22 at 15:29
  • I changed some things in my code and now the images are being read as (160, 160, 3). I also removed the reshape part, and now it is working pretty nicely. Thank you @Djinn – Ishan Sharma Jul 05 '22 at 03:28
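
For anyone landing here with the same symptom, the shape check discussed in the comments can be sketched roughly as follows. This is only a minimal illustration using NumPy, not the asker's actual fix: `channel_mode` and `to_batch` are hypothetical helper names, and the zero-filled arrays stand in for real face crops. The idea is to confirm that the array has three channels before it reaches `datagen.flow()`, since a single-channel array is written out as a grayscale image. Note that `flow()` expects a rank-4 array, so a single image still needs a leading batch axis.

```python
import numpy as np


def channel_mode(img):
    """Classify an image array of shape (H, W) or (H, W, C) by its channel count."""
    if img.ndim == 2:
        return "grayscale"
    if img.ndim == 3:
        modes = {1: "grayscale", 3: "rgb", 4: "rgba"}
        if img.shape[-1] in modes:
            return modes[img.shape[-1]]
    raise ValueError(f"unexpected image shape {img.shape}")


def to_batch(img):
    """Add a leading batch axis: (H, W, C) -> (1, H, W, C)."""
    if img.ndim != 3:
        raise ValueError(f"expected (H, W, C), got {img.shape}")
    return np.expand_dims(img, axis=0)


# Placeholder arrays standing in for real 160x160 face crops.
gray = np.zeros((160, 160, 1), dtype="float32")
color = np.zeros((160, 160, 3), dtype="float32")

print(channel_mode(gray), to_batch(gray).shape)    # grayscale (1, 160, 160, 1)
print(channel_mode(color), to_batch(color).shape)  # rgb (1, 160, 160, 3)
```

If `channel_mode` reports `grayscale` for an image that should be in color, the problem is most likely in how the file is loaded rather than in the augmentation step, which matches what the comments above concluded.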
