Problem
I am looking for the best Python object detection method for detecting professional headshot images, which typically have a solid background. I would expect professional headshots to be the best-case scenario for face detection, giving much better results than the roughly 95% success rate I get from the Haar cascade on far less ideal images. Processing speed should not be much slower than 1000 images in 20 seconds; the code I provide below processes 1000 images in 10 seconds. I would appreciate any suggestions.
EDIT: The poor performance comes from the Haar cascade finding two or more faces in a headshot; my code rejects any image with anything other than exactly one face. One of the extra "faces" is usually found on the neck. A scaleFactor of 1.3 gives the best results I can get; raising or lowering it produces more false negatives.
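Regarding the extra "neck" detections just mentioned: one workaround I have been considering is to keep only the largest detected box instead of rejecting the image outright. This is only a sketch of that idea, not part of my pipeline below, and the pick_largest_face helper name is my own:

def pick_largest_face(faces):
    # detectMultiScale returns boxes as (x, y, w, h).  The spurious
    # detections on the neck are usually smaller than the real face,
    # so keeping only the largest box is one way to filter them out.
    if len(faces) == 0:
        return None
    return max(faces, key=lambda box: box[2] * box[3])

# usage: face = pick_largest_face(face_cascade.detectMultiScale(gray, 1.3, 3))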
The headshots are from yearbooks and here's an example image:

Description of My Sample Collection
My sample set is approximately 1,000 images: 800 positive samples (headshots) and 200 negative samples (not headshots). The negative samples may show groups of people or full-body shots of a single person. All headshots are around 160 x 200 pixels (a 1.25 aspect ratio); the negative images vary in size, up to 500 x 500 pixels.
My Code
import cv2

# sample_dir, img_files, positive_dir and negative_dir are set elsewhere.
face_cascade = cv2.CascadeClassifier('/haarcascade_frontalface_default.xml')

for img_file in img_files:
    img = cv2.imread(sample_dir + img_file)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    height, width = img.shape[:2]

    # Minimum face size relative to the image (using max was intentional).
    min_size = int(max(0.4 * width, 0.3 * height))

    faces = face_cascade.detectMultiScale(gray,
                                          scaleFactor=1.3,
                                          minNeighbors=3,
                                          minSize=(min_size, min_size))

    if len(faces) == 1:  # want exactly one face
        cv2.imwrite(positive_dir + img_file, img)
    else:
        cv2.imwrite(negative_dir + img_file, img)
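For reference, the 10-second figure quoted in the Problem section is just a wall-clock measurement of this loop; a minimal sketch of how such a throughput number can be measured (my own timing scaffold, not part of the original pipeline):

import time

start = time.perf_counter()
# ... run the detection loop above over img_files ...
elapsed = time.perf_counter() - start
print(f"{len(img_files)} images in {elapsed:.1f} s "
      f"({len(img_files) / elapsed:.0f} images/s)")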
Unexpected Behavior from Code
- The number of false negatives increased as I reduced scaleFactor: 1.3 gave roughly 2%, 1.2 gave 4%, and 1.1 gave 6%. I thought higher values resulted in fewer detections (i.e. more false negatives). Have I misunderstood?
- When I ran only positive samples, the false-negative results were much better (only 0.4%, which is an excellent result). It is as if detectMultiScale were training during run-time (edit: could not reproduce). Additionally, when the negative samples were at the beginning of my loop (my typical situation), the number of false negatives and false positives dropped compared to having the negatives interspersed or at the end of the loop. Is detectMultiScale capable of run-time training? (A check I have in mind is sketched after this list.)
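The check I have in mind for the run-time-training question is to run the detector over the same files in two different orders and compare the raw boxes; if detectMultiScale keeps no state between calls, the per-image results should be identical. A minimal sketch, reusing face_cascade, sample_dir and img_files from the code above:

import random

import cv2
import numpy as np

def detect(img_file):
    # Same preprocessing and parameters as the main loop (minSize omitted for brevity).
    img = cv2.imread(sample_dir + img_file)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=3)

ordered = {f: detect(f) for f in img_files}
shuffled = {f: detect(f) for f in random.sample(list(img_files), len(img_files))}

mismatches = [f for f in img_files if not np.array_equal(ordered[f], shuffled[f])]
print(len(mismatches), "images whose detections depend on processing order")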
Option of Training Haar or LBP
If I train a Haar or LBP cascade myself, I would expect to do better than haarcascade_frontalface_default.xml only if I train on the entire photo rather than just the face, because of the solid background and shoulders. Does this make sense? Can training only be done on a 1:1 aspect ratio? 1.25 would be better, if possible.
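To make the "entire photo" idea concrete: the way I picture it is an annotation ("info") file for the OpenCV cascade-training tools in which the single bounding box covers the whole 160 x 200 photo instead of just the face. A minimal sketch of generating such a file (positives.info and the directory name are placeholders of mine, and I am assuming the usual "path count x y w h" info-file format):

import os

import cv2

positive_dir = 'positives/'    # directory of accepted headshots (placeholder name)
info_path = 'positives.info'   # annotation file for the cascade-training tools

with open(info_path, 'w') as info:
    for name in os.listdir(positive_dir):
        img = cv2.imread(positive_dir + name)
        if img is None:
            continue
        h, w = img.shape[:2]
        # One object per image, and its bounding box is the entire photo, so the
        # solid background and shoulders become part of the positive window.
        info.write(f'{positive_dir}{name} 1 0 0 {w} {h}\n')

My hope is that the training window itself can then be non-square (e.g. something like 20 x 25 to keep the 1.25 ratio), which is part of what I am asking above.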