
I implemented an addFaceLandmarksToImage function to crop the outer lips from an image. The function first detects the face in the image using Vision and converts the face bounding box size and origin into image coordinates. It then uses the outer-lips face landmark to get the normalized points of the outer lips and draws a line over them by connecting all of those points. Next comes the cropping logic: I convert the normalized outer-lips points into image coordinates, use the findPoint method to get the left-, right-, top- and bottom-most points, crop the outer lips out of the image, and show the result in the processed image view. The issue with this function is that the output is not as expected. Also, if I use an image other than the one in the following link, it crops something other than the outer lips. I could not figure out where I have gone wrong: is it the calculation of the cropping rect, or should I use another approach (OpenCV with a region of interest (ROI)) to extract the outer lips from the image? video of an application

func addFaceLandmarksToImage(_ face: VNFaceObservation) {
    UIGraphicsBeginImageContextWithOptions(image.size, true, 0.0)
    let context = UIGraphicsGetCurrentContext()

    // draw the image
    image.draw(in: CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height))

    context?.translateBy(x: 0, y: image.size.height)
    context?.scaleBy(x: 1.0, y: -1.0)

    // draw the face rect
    let w = face.boundingBox.size.width * image.size.width
    let h = face.boundingBox.size.height * image.size.height
    let x = face.boundingBox.origin.x * image.size.width
    let y = face.boundingBox.origin.y * image.size.height
    let cropFace = self.image.cgImage?.cropping(to: CGRect(x: x, y: y, width: w, height: h))
    let ii = UIImage(cgImage: cropFace!)

    // outer lips
    context?.saveGState()
    context?.setStrokeColor(UIColor.yellow.cgColor)

     if let landmark = face.landmarks?.outerLips {
         var actualCordinates = [CGPoint]()
         print(landmark.normalizedPoints)
         for i in 0...landmark.pointCount - 1 { 
           // last point is 0,0
           let point = landmark.normalizedPoints[i]
           actualCordinates.append(CGPoint(x: x + CGFloat(point.x) * w, y: y + CGFloat(point.y) * h))
           if i == 0 {
              context?.move(to: CGPoint(x: x + CGFloat(point.x) * w, y: y + CGFloat(point.y) * h))
           } else {
              context?.addLine(to: CGPoint(x: x + CGFloat(point.x) * w, y: y + CGFloat(point.y) * h))
           }
      }
     // Finding the left-, right-, top- and bottom-most points from actualCordinates [CGPoint]

     let leftMostPoint = self.findPoint(points: actualCordinates, position: .leftMost)
     let rightMostPoint = self.findPoint(points: actualCordinates, position: .rightMost)
     let topMostPoint = self.findPoint(points: actualCordinates, position: .topMost)
     let buttonMostPoint = self.findPoint(points: actualCordinates, position: .buttonMost)

     print("actualCordinates:",actualCordinates,
           "leftMostPoint:",leftMostPoint,
           "rightMostPoint:",rightMostPoint,
           "topMostPoint:",topMostPoint,
           "buttonMostPoint:",buttonMostPoint)

     let widthDistance = -(leftMostPoint.x - rightMostPoint.x)
     let heightDistance = -(topMostPoint.y - buttonMostPoint.y)

     //Cropping the image.
     // self.image is actual image 
     let cgCroppedImage = self.image.cgImage?.cropping(to: CGRect(x: leftMostPoint.x,y: leftMostPoint.x - heightDistance,width:1000,height: topMostPoint.y + heightDistance + 500))
     let jj = UIImage(cgImage: cgCroppedImage!)
     self.processedImageView.image = jj    
   }
   context?.closePath()
   context?.setLineWidth(8.0)
   context?.drawPath(using: .stroke)
   context?.saveGState()

   // get the final image
   let finalImage = UIGraphicsGetImageFromCurrentImageContext()

   // end drawing context
   UIGraphicsEndImageContext()

   imageView.image = finalImage
}
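
The findPoint(points:position:) helper is not shown here; presumably it just scans the converted points for the extreme value in one direction. A minimal sketch of what such a helper could look like (the Position case names are taken from the calls above; whether "top" means the smallest or the largest y depends on which coordinate convention the real helper uses, so the choice below is only an assumption):

enum Position { case leftMost, rightMost, topMost, buttonMost }

func findPoint(points: [CGPoint], position: Position) -> CGPoint {
    guard let first = points.first else { return .zero }
    switch position {
    case .leftMost:   return points.min(by: { $0.x < $1.x }) ?? first
    case .rightMost:  return points.max(by: { $0.x < $1.x }) ?? first
    case .topMost:    return points.min(by: { $0.y < $1.y }) ?? first  // smallest y (assumed)
    case .buttonMost: return points.max(by: { $0.y < $1.y }) ?? first  // largest y (assumed)
    }
}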

Normalized points of the outer lips of the image:

[(0.397705078125, 0.3818359375), 
(0.455322265625, 0.390625), 
(0.5029296875, 0.38916015625), 
(0.548828125, 0.40087890625), 
(0.61279296875, 0.3984375), 
(0.703125, 0.37890625), 
(0.61474609375, 0.21875), 
(0.52294921875, 0.1884765625), 
(0.431640625, 0.20166015625), 
(0.33203125, 0.34423828125)]

Actual coordinate points (in image space) of the outer lips of the image:

[(3025.379819973372, 1344.4951847679913),
 (3207.3986613331363, 1372.2607707381248),
 (3357.7955853380263, 1367.633173076436),
 (3502.7936454042792, 1404.6539543699473),
 (3704.8654099646956, 1396.9412916004658),
 (3990.2339324355125, 1335.2399894446135),
 (3711.035540180281, 829.2893117666245),
 (3421.039420047775, 733.6522934250534),
 (3132.5858324691653, 775.3006723802537),
 (2817.9091914743185, 1225.7201781179756)]

I also tried the following method, which uses CIDetector to get the mouth position and extracts the outer lips by cropping. The output wasn't good either (a sketch of one possible adjustment follows the code below).

func focusonMouth() {
        let ciimage = CIImage(cgImage: image.cgImage!)
        let options = [CIDetectorAccuracy: CIDetectorAccuracyHigh]
        let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil, options: options)!
        let faces = faceDetector.features(in: ciimage)

        if let face = faces.first as? CIFaceFeature {
            if face.hasMouthPosition {
                let crop = image.cgImage?.cropping(to: CGRect(x: face.mouthPosition.x, y: face.mouthPosition.y, width: face.bounds.width - face.mouthPosition.x , height: 200))
                processedImageView.image = imageRotatedByDegrees(oldImage: UIImage(cgImage: crop!), deg: 90)
            }
        }

    }
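
One likely reason the output isn't good is that CIDetector, like Vision, reports coordinates with a bottom-left origin, while CGImage cropping uses a top-left origin, and the rect above starts at mouthPosition rather than being centered on it. A sketch of a centered, y-flipped crop (the cropAroundMouth name and the 0.6/0.35 fractions of the face bounds are assumptions, not values from this code):

import UIKit
import CoreImage

func cropAroundMouth(in image: UIImage, face: CIFaceFeature) -> UIImage? {
    guard face.hasMouthPosition, let cgImage = image.cgImage else { return nil }

    let cropWidth = face.bounds.width * 0.6     // assumed fraction of the face width
    let cropHeight = face.bounds.height * 0.35  // assumed fraction of the face height
    // CIDetector's y axis points up; flip it into CGImage's top-left space.
    let flippedY = CGFloat(cgImage.height) - face.mouthPosition.y

    let rect = CGRect(x: face.mouthPosition.x - cropWidth / 2,
                      y: flippedY - cropHeight / 2,
                      width: cropWidth,
                      height: cropHeight)
    return cgImage.cropping(to: rect).map { UIImage(cgImage: $0) }
}
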
  • The cropping rectangle you're using seems wrong. Why does `y` use the `leftMostPoint.x`? Why is the width 1000? My suggestion is to draw the cropping rect as an actual rectangle on the image first so you can see what you're actually computing. – Matthijs Hollemans Nov 20 '19 at 09:39
  • I tried using different values but I could not figure out the correct rectangle for cropping. So I used a static width, which I know is very wrong. – Roman Nov 20 '19 at 11:48
  • Keep in mind that Vision's coordinates have (0,0) in the bottom-left corner and (1,1) in the upper-right corner. – Matthijs Hollemans Nov 20 '19 at 11:53
  • @MatthijsHollemans should I use OpenCV, region of interest (ROI)) to extract the outer lips from the image or use proper mathematical calculation to find the cropping rectangle? – Roman Nov 20 '19 at 12:10
  • I think using the proper calculation is always a good idea. ;-) No need to use OpenCV. You have everything you require already, you just need to make sure to convert the Vision coordinates to image coordinates (which already happens in the drawing code). – Matthijs Hollemans Nov 20 '19 at 17:43
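
Following the suggestion in the comments to visualize the crop before cropping, a quick debugging sketch (context here is the question's current CGContext, and rect stands for whatever candidate crop rect is being computed, expressed in the same coordinates the landmarks were drawn in; both names are assumptions about where this would be pasted):

context?.saveGState()
context?.setStrokeColor(UIColor.red.cgColor)
context?.setLineWidth(6.0)
context?.stroke(rect)   // shows exactly where the crop would land
context?.restoreGState()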

1 Answer


There are two problems:

  1. An input image on iOS can be rotated, with the orientation property marking how it is rotated. The Vision framework will still do the job, but the coordinates it returns will be rotated as well. The simplest solution is to supply an image that is up-oriented (normal rotation).

  2. The positions and sizes of the located landmarks are relative to the position and size of the detected face. So the found points should be scaled by the size, and offset by the origin, of the detected face bounding box, not of the whole image (see the sketch below).
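
A minimal sketch of both points (assuming an up-oriented image; the outerLipsRect helper and the use of Vision's VNImagePointForFaceLandmarkPoint are one possible way to do the conversion, not the only one):

import UIKit
import Vision
import simd

// Point 1: either supply an up-oriented image, or pass the orientation explicitly,
// e.g. VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:]).
// Point 2: let Vision scale each landmark point (normalized to the face box)
// into image coordinates via VNImagePointForFaceLandmarkPoint.
func outerLipsRect(for face: VNFaceObservation, in cgImage: CGImage) -> CGRect? {
    guard let lips = face.landmarks?.outerLips else { return nil }

    let points = lips.normalizedPoints.map { p in
        VNImagePointForFaceLandmarkPoint(vector_float2(Float(p.x), Float(p.y)),
                                         face.boundingBox,
                                         cgImage.width,
                                         cgImage.height)
    }
    guard let minX = points.map({ $0.x }).min(),
          let maxX = points.map({ $0.x }).max(),
          let minY = points.map({ $0.y }).min(),
          let maxY = points.map({ $0.y }).max() else { return nil }

    // Vision's origin is bottom-left; CGImage cropping expects a top-left origin.
    return CGRect(x: minX,
                  y: CGFloat(cgImage.height) - maxY,
                  width: maxX - minX,
                  height: maxY - minY)
}

// Hypothetical usage:
// if let rect = outerLipsRect(for: face, in: cgImage),
//    let lips = cgImage.cropping(to: rect) {
//     processedImageView.image = UIImage(cgImage: lips)
// }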

vedrano