
Problem

I am looking for the best Python object-detection method for detecting professional headshot images, which typically have a solid background. I would expect professional headshots to be the best-case scenario for face detection, giving much better results than the roughly 95% success rate Haar cascades achieve on far less ideal images. Processing speed should not be significantly slower than 1000 images in 20 seconds; the code I provide below processes 1000 images in about 10 seconds. I would appreciate any suggestions.

EDIT: The poor performance comes from the Haar cascade finding two or more faces in a headshot; my code rejects any image that does not contain exactly one detected face. The extra "face" is usually found on the neck. A scaleFactor of 1.3 gives the best results I can get; raising or lowering it gives more false negatives.
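
Since the extra box usually lands on the neck, one possible tweak (only a sketch, and it assumes that in a genuine headshot the real face is the largest detection) would be to keep the largest box instead of rejecting the whole image:

def largest_face(faces):
    # Return the biggest (x, y, w, h) box, or None when there are no detections.
    if len(faces) == 0:
        return None
    return max(faces, key=lambda f: f[2] * f[3])

# Possible replacement for the len(faces) == 1 test in the loop below:
#     face = largest_face(faces)
#     if face is not None:
#         cv2.imwrite(positive_dir + img_file, img)
#     else:
#         cv2.imwrite(negative_dir + img_file, img)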

The headshots are from yearbooks and here's an example image:

Description of My Sample Collection

My sample set is approximately 1,000 images: 800 positive samples (headshots) and 200 negative samples (not headshots). The negative samples can be groups of people or full-body images of a single person. All headshots are around 160 x 200 pixels (aspect ratio 1.25). Negative images vary in size, up to 500 x 500 pixels.

My Code

import cv2

# sample_dir, positive_dir, negative_dir and img_files (a list of file names)
# are defined elsewhere.
face_cascade = cv2.CascadeClassifier('/haarcascade_frontalface_default.xml')
for img_file in img_files:
    img = cv2.imread(sample_dir + img_file)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    height, width = img.shape[:2]
    # min size of face w.r.t. image (using max was intentional)
    min_size = int(max(0.4*width, 0.3*height))
    faces = face_cascade.detectMultiScale(gray,
                                          scaleFactor=1.3,
                                          minNeighbors=3,
                                          minSize=(min_size, min_size))
    if len(faces) == 1:  # accept only images with exactly one detected face
        cv2.imwrite(positive_dir + img_file, img)
    else:
        cv2.imwrite(negative_dir + img_file, img)
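
For reference, a minimal way to check the throughput figures mentioned at the top is to take wall-clock time around the loop above (sketch; it assumes img_files is the full list of file names):

import time

start = time.perf_counter()
# ... run the detection loop above ...
elapsed = time.perf_counter() - start
print(f"{len(img_files)} images in {elapsed:.1f} s "
      f"({len(img_files) / elapsed:.1f} images/s)")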

Unexpected Behavior from Code

  1. The number of false negatives increased as I reduced the scaleFactor: roughly 2% at 1.3, 4% at 1.2, and 6% at 1.1. I thought higher values resulted in fewer detections (i.e. more false negatives). Have I misunderstood? (See the diagnostic sketch after this list.)
  2. When I ran only positive samples, the false-negative results were much better (only about 0.4%, which is an excellent result). It was as if detectMultiScale were training at run time. Additionally, when the negative samples were at the beginning of my loop (my typical situation), the number of false negatives and false positives dropped compared to having the negatives interspersed or at the end of the loop. Is detectMultiScale capable of run-time training? I could not reproduce this.
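
Diagnostic sketch for item 1 (it reuses face_cascade, sample_dir and img_files from my code above): count how many boxes each image gets at several scaleFactor values, to check whether lower values really produce more multi-face detections that the single-face rule then rejects.

from collections import Counter

for sf in (1.1, 1.2, 1.3):
    counts = Counter()
    for img_file in img_files:
        img = cv2.imread(sample_dir + img_file)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        height, width = img.shape[:2]
        min_size = int(max(0.4*width, 0.3*height))
        faces = face_cascade.detectMultiScale(gray, scaleFactor=sf,
                                              minNeighbors=3,
                                              minSize=(min_size, min_size))
        counts[len(faces)] += 1
    # prints, for each scaleFactor, how many images got 0, 1, 2, ... detections
    print(sf, dict(counts))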

Option of Training Haar or LBP

If I train my own Haar or LBP cascade, I would expect to beat haarcascade_frontalface_default.xml only if I train on the entire photo (taking advantage of the solid background and shoulders) rather than just the face. Does this make sense? Also, can training only be done with a 1:1 aspect ratio? 1.25 would be better, if possible.

Comments
  • There is no online training, and your detector should give sane results for a single image; the order of images shouldn't matter. During training you can choose any aspect ratio. The positive training samples have to be cropped to the region you want to detect. You should give deep learning a try; the tiny YOLO v3 detector is very fast, too. – Micka Jul 15 '20 at 04:41
  • Micka, I did some more troubleshooting after reading your comment. I could not reproduce the order issue. See my edits. I will try Tiny-YOLO as you suggested. Thanks. – Jakub Jul 15 '20 at 06:54
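
Following up on the deep-learning suggestion: before trying Tiny-YOLO, here is a sketch of a deep-learning face detector using OpenCV's dnn module. Note this is the res10 SSD face detector from OpenCV's face_detector sample, not tiny YOLO v3, and the model file names below are assumptions about files downloaded locally.

import cv2

# Assumed local copies of the model files from OpenCV's samples/dnn/face_detector.
net = cv2.dnn.readNetFromCaffe('deploy.prototxt',
                               'res10_300x300_ssd_iter_140000.caffemodel')

def detect_faces_dnn(img, conf_threshold=0.5):
    # Return a list of (x1, y1, x2, y2) boxes with confidence >= conf_threshold.
    h, w = img.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(img, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()  # shape: (1, 1, N, 7)
    boxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence >= conf_threshold:
            x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int)
            boxes.append((x1, y1, x2, y2))
    return boxes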
