
I have a video of a patient doing cognitive tasks. The goal is to take each frame of the video, run face detection, landmark the mouth, and then calculate the area bounded by the mouth landmarks. I used dlib and its Python API to do this.
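
For context, here is a simplified sketch of the kind of per-frame step I mean. The shoelace formula over dlib's outer-lip points 48-59 is one way to get the bounded area; the predictor path refers to the standard 68-point model file:

    # Simplified sketch: detect the face with dlib's frontal detector, fit the
    # 68-point shape predictor, then take the area of the outer-lip polygon
    # (landmark indices 48-59) with the shoelace formula.
    import cv2
    import dlib
    import numpy as np

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def mouth_area(frame_bgr):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        faces = detector(gray, 1)  # upsample once; helps a bit with small faces
        if len(faces) == 0:
            return None            # no detection -> no landmarks for this frame
        shape = predictor(gray, faces[0])
        pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(48, 60)],
                       dtype=np.float64)
        x, y = pts[:, 0], pts[:, 1]
        return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))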

I ran into two problems. First, the patient is in a hospital bed and the camera view is angled slightly upward, looking up at the chin rather than directly at the face. The face isn't detected in a fair number of frames, and when no face is detected the landmarking step is skipped, so there are no mouth perimeter points for that frame. Is there a way I can improve the face detection (maybe by training the detector specifically on a few frames of this patient)?
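
By training on a few frames of the patient I mean something along the lines of dlib's simple HOG detector trainer. A rough sketch, where the frame filenames and box coordinates are made-up placeholders:

    # Sketch of training a patient-specific HOG detector on a few hand-labelled
    # frames. The frame filenames and box coordinates below are placeholders.
    import dlib

    frames = [dlib.load_rgb_image(p) for p in ["frame_010.png", "frame_120.png"]]
    # One list of face boxes per frame (left, top, right, bottom), drawn by hand
    boxes = [[dlib.rectangle(200, 150, 420, 400)],
             [dlib.rectangle(210, 160, 430, 410)]]

    options = dlib.simple_object_detector_training_options()
    options.add_left_right_image_flips = True  # cheap way to double the examples
    options.C = 5                              # SVM regularization, needs tuning

    detector = dlib.train_simple_object_detector(frames, boxes, options)
    detector.save("patient_face_detector.svm")

    # Later, load it and use it in place of the stock frontal detector
    custom = dlib.simple_object_detector("patient_face_detector.svm")
    dets = custom(frames[0])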

The second problem is that the mouth landmarks can vary quite a bit from frame to frame. I was hoping that at the end of this I could show a word being spoken in slow motion, with the mouth perimeter smoothly increasing and decreasing as the mouth opens and closes. But the result is quite noisy, with a good deal of frame-to-frame variation.
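
For reference, a simple post-hoc smoothing of the per-frame area series would look something like the sketch below, though I would rather the landmarks themselves were more stable (the window length here is arbitrary):

    # Sketch of post-hoc smoothing of the per-frame mouth areas. Frames where
    # detection failed (None) are filled by linear interpolation first; the
    # window length is arbitrary and would need tuning to the frame rate.
    import numpy as np

    def smooth_areas(areas, window=5):
        a = np.array([np.nan if v is None else v for v in areas], dtype=np.float64)
        idx = np.arange(len(a))
        good = ~np.isnan(a)
        a = np.interp(idx, idx[good], a[good])      # fill missed frames
        kernel = np.ones(window) / window
        return np.convolve(a, kernel, mode="same")  # simple moving average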

I'm definitely method/platform agnostic. If anyone knows of a better, more accurate, or more robust way to do this, maybe with MATLAB or OpenCV, I'm open to chasing the lead. Any guidance would be helpful.

Thanks, everyone.

Srdjn
  • Can you attach some sample input images? – ZdaR Sep 12 '17 at 07:24
  • Unfortunately I can't, for patient privacy. But the pose is reclined in a hospital bed, and if you imagine a camera resting about where the food tray table would be, that is the angle of the image. – Srdjn Sep 13 '17 at 03:46
  • Train your own face detector and then pass its detections to the landmark predictor. The landmarks will wiggle; to reduce that, you can experiment with a Kalman filter. – harshkn Sep 15 '17 at 07:19

0 Answers