Cut out face parts with Vision face recognition (eyebrows, eyes, nose, mouth)

MLBoy
Jul 25, 2020

You can use this technique for beauty apps or to build datasets of facial parts.

With face landmark detection in Vision, you can get points such as eyebrows, eyes, nose, and mouth. This time, we will use those points to crop each part.
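As a reference, here is a minimal sketch of how you might run the request and receive those landmarks; the helper function name and its completion handling are my own illustration, not part of the original post.

import Vision
import CoreImage

// Run VNDetectFaceLandmarksRequest on a CIImage and return the face observations.
func detectFaceLandmarks(in ciImage: CIImage, completion: @escaping ([VNFaceObservation]) -> Void) {
    let request = VNDetectFaceLandmarksRequest { request, _ in
        completion(request.results as? [VNFaceObservation] ?? [])
    }
    let handler = VNImageRequestHandler(ciImage: ciImage, options: [:])
    do {
        try handler.perform([request])
    } catch {
        print("Face landmark detection failed: \(error)")
        completion([])
    }
}

Each VNFaceObservation then exposes regions such as landmarks?.leftEye, landmarks?.nose, and landmarks?.outerLips, whose normalizedPoints we convert below.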

Convert landmark points to the coordinates of the original image

The landmarks’ normalized CGPoints are coordinates relative to the face’s bounding box, so we use VNImagePointForFaceLandmarkPoint to convert them to coordinates in the whole image.

import CoreImage
import Vision
import simd

extension CGPoint {
    // Convert a normalized landmark point (relative to the face bounding box)
    // into pixel coordinates of the original image.
    func convertToImagePoint(_ originalImage: CIImage, _ boundingBox: CGRect) -> CGPoint {
        let imageWidth = originalImage.extent.width
        let imageHeight = originalImage.extent.height
        let vectoredPoint = vector2(Float(self.x), Float(self.y))
        return VNImagePointForFaceLandmarkPoint(vectoredPoint,
                                                boundingBox,
                                                Int(imageWidth),
                                                Int(imageHeight))
    }
}
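For example, the extension can be applied to one region of a VNFaceObservation like this (observation and ciImage are illustrative variable names):

// Convert the left eye’s normalized landmark points into image coordinates.
if let leftEyePoints = observation.landmarks?.leftEye?.normalizedPoints {
    let imagePoints = leftEyePoints.map { $0.convertToImagePoint(ciImage, observation.boundingBox) }
    // imagePoints are now in the original image’s pixel coordinate space.
}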

After that, all you have to do is find the minimum and maximum x and y values among the converted points, build a CGRect from them, and crop the original image to that rect.
If you crop exactly to those points, the result is hard to recognize, so adding some padding makes it easier to see which part was cut out.

import UIKit
import CoreImage

let ciContext = CIContext() // shared CIContext (a property of the containing class in the original)

func cropParts(partsPoints points: [CGPoint],
               horizontalSpacing hPadding: CGFloat,
               verticalSpacing vPadding: CGFloat,
               originalImage image: CIImage) -> UIImage? {
    // Bounding box of the landmark points in image coordinates.
    guard let minX = points.map({ $0.x }).min(),
          let minY = points.map({ $0.y }).min(),
          let maxX = points.map({ $0.x }).max(),
          let maxY = points.map({ $0.y }).max() else { return nil }

    let partsWidth = maxX - minX
    let partsHeight = maxY - minY
    // Expand the box by the padding ratios so the cropped part is easier to recognize.
    let partsBox = CGRect(x: minX - partsWidth * hPadding,
                          y: minY - partsHeight * vPadding,
                          width: partsWidth + partsWidth * hPadding * 2,
                          height: partsHeight + partsHeight * vPadding * 2)

    let croppedImage = image.cropped(to: partsBox)
    guard let cgImage = ciContext.createCGImage(croppedImage, from: croppedImage.extent) else { return nil }
    return UIImage(cgImage: cgImage)
}
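Putting it together, you might crop the mouth like this (the 0.2 padding ratios are just example values):

// Crop the outer lips with 20% padding on each axis.
if let mouthPoints = observation.landmarks?.outerLips?.normalizedPoints {
    let imagePoints = mouthPoints.map { $0.convertToImagePoint(ciImage, observation.boundingBox) }
    let mouthImage = cropParts(partsPoints: imagePoints,
                               horizontalSpacing: 0.2,
                               verticalSpacing: 0.2,
                               originalImage: ciImage)
}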

Be careful with images of multiple people

Vision’s VNDetectFaceLandmarksRequest detects multiple faces in a single image at once. I have only tried it with up to three people, but even then it returns clear landmarks for each face. However, faces in the background may also be detected; to prevent this, you will need to filter the results by the size of the bounding box.
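One way to do that filtering, given the [VNFaceObservation] results from the request (the 5% area threshold is just an example value):

// Keep only faces whose normalized bounding box covers at least 5% of the image.
// `observations` is the [VNFaceObservation] array returned by VNDetectFaceLandmarksRequest.
let minAreaRatio: CGFloat = 0.05
let mainFaces = observations.filter { face in
    face.boundingBox.width * face.boundingBox.height >= minAreaRatio
}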

My homebrew app using this technology

BeautyCaption

Follow me on Twitter. Please.

I am making applications that use Core ML, Create ML, and Vision.
If you would like to use machine learning models on edge devices (iOS), please contact me by email.
rockyshikoku@gmail.com

And clap your hands👏

Ciao!
