Exploring Vision — Automatic Face Detection and Cropping for Profile Pictures (Swift)
Practical Application of Vision
[2024/08/13 Update]
- Refer to the new article and API: “iOS Vision framework x WWDC 24 Discover Swift enhancements in the Vision framework Session”
Without further ado, here is a comparison image:
Before Optimization vs. After Optimization — Marry Me APP
With the recent iOS 12 update, I noticed the new CoreML machine learning framework and found it quite interesting. I began to think about how to incorporate it into our current products.
The article on trying out CoreML is now available: Automatically Predict Article Categories Using Machine Learning, Even Train the Model Yourself
CoreML lets an app integrate and run machine learning models for text and images. My first thought was to use it for face detection to fix the head-cropping problem shown on the left of the comparison image above: when a face sits near the edge of a photo, scaling and cropping can easily cut it off.
After some research online, I realized how little I knew — this capability has been available since iOS 11 through the “Vision” framework, which supports text detection, face detection, image comparison, QR code detection, object tracking, and more.
In this case, I used Vision’s face detection and produced the optimized result shown on the right of the image: find the face, then crop around it.
Let’s get started with the practical implementation:
First, let’s create a feature that can mark the position of faces and get familiar with how to use Vision.
Demo APP
As shown in the completed image above, it can mark the positions of faces in the photo.
P.S. It can only mark “faces,” not the entire head including hair 😅
This program mainly consists of two parts. The first part deals with the white space that appears when the original image is resized to fit the ImageView; in short, we want the ImageView’s size to match the image’s size. Assigning the image directly causes the misalignment shown below.
You might think of changing the contentMode to fill, fit, or redraw, but those can distort or crop the image.
let ratio = UIScreen.main.bounds.size.width
// My UIImageView is pinned to both edges with a 1:1 aspect ratio, so its width equals the screen width
let sourceImage = UIImage(named: "Demo2")?.kf.resize(to: CGSize(width: ratio, height: CGFloat.leastNonzeroMagnitude), for: .aspectFill)
// Use Kingfisher's resize, constrained by width, letting the height follow the aspect ratio
imageView.contentMode = .redraw
// Use redraw so the content fills the view
imageView.image = sourceImage
// Assign the resized image
imageViewConstraints.constant = (ratio - (sourceImage?.size.height ?? 0))
imageView.layoutIfNeeded()
imageView.sizeToFit()
// Adjust the imageView's constraint to match the image height; see the complete example at the end for details
That covers the image processing.
The resizing and cropping rely on Kingfisher, but you could just as well use another library or roll your own method, as in the sketch below.
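As a side note, here is a minimal, Kingfisher-free sketch of the same “resize to fill, then center-crop” idea using UIGraphicsImageRenderer; the extension method name and its center-crop behavior are my own assumptions for illustration, not part of the original project.
import UIKit

extension UIImage {
    // A minimal sketch (not from the original project): scale the image so it fills
    // targetSize, then center-crop whatever overflows.
    func aspectFillCropped(to targetSize: CGSize) -> UIImage {
        let scale = max(targetSize.width / size.width, targetSize.height / size.height)
        let scaledSize = CGSize(width: size.width * scale, height: size.height * scale)
        // Offset the drawing so the scaled image's center lands in the middle of the canvas
        let origin = CGPoint(x: (targetSize.width - scaledSize.width) / 2,
                             y: (targetSize.height - scaledSize.height) / 2)
        return UIGraphicsImageRenderer(size: targetSize).image { _ in
            draw(in: CGRect(origin: origin, size: scaledSize))
        }
    }
}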
Next, let’s focus on the code directly.
if #available(iOS 11.0, *) {
    // Vision is available from iOS 11 onward
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if let faceObservations = request.results as? [VNFaceObservation] {
            // Faces were detected
            DispatchQueue.main.async {
                // UIView work must happen on the main thread
                let size = self.imageView.frame.size
                faceObservations.forEach({ (faceObservation) in
                    // Convert from Vision's normalized, bottom-left coordinates to the imageView's coordinates
                    let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
                    let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
                    let transRect = faceObservation.boundingBox.applying(translate).applying(transform)
                    // Overlay a translucent green marker on the detected face
                    let markerView = UIView(frame: transRect)
                    markerView.backgroundColor = UIColor(red: 0/255, green: 255/255, blue: 0/255, alpha: 0.3)
                    self.imageView.addSubview(markerView)
                })
            }
        } else {
            print("No faces detected")
        }
    }
    // Build the detection request; ciImage is a CIImage created from the source photo (see the complete example)
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        // Detection takes time, so run it on a background queue to avoid blocking the UI
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Throws: \(error)")
        }
    }
} else {
    print("Vision is not supported on this iOS version")
}
The main thing to watch is the coordinate conversion: Vision returns results in the image's normalized coordinate space with the origin at the bottom-left, so we have to scale and flip them into the ImageView's actual coordinates before we can use them.
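Incidentally, Vision also provides VNImageRectForNormalizedRect to handle the scaling step. Here is a small sketch that combines it with the y-axis flip; the helper name uikitRect(for:in:) is my own, not from the original project.
import UIKit
import Vision

// A minimal sketch: convert a Vision boundingBox (normalized, origin at the bottom-left)
// into a UIKit rect for a view or image of the given size.
func uikitRect(for boundingBox: CGRect, in size: CGSize) -> CGRect {
    // Scale the normalized rect up to the target size
    let scaled = VNImageRectForNormalizedRect(boundingBox, Int(size.width), Int(size.height))
    // Vision's origin is at the bottom-left, UIKit's at the top-left, so flip the y-axis
    return CGRect(x: scaled.origin.x,
                  y: size.height - scaled.origin.y - scaled.height,
                  width: scaled.width,
                  height: scaled.height)
}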
Next, let’s get to today’s highlight: cropping the profile picture at the correct position based on where the face is.
let ratio = UIScreen.main.bounds.size.width
// My UIImageView is pinned to both edges with a 1:1 aspect ratio, so its width equals the screen width; see the complete example at the end
let sourceImage = UIImage(named: "Demo")
imageView.contentMode = .scaleAspectFill
// Use scaleAspectFill so the view is filled
imageView.image = sourceImage
// Assign the original image first; we will replace it after processing
if let image = sourceImage, #available(iOS 11.0, *), let ciImage = CIImage(image: image) {
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if request.results?.count == 1, let faceObservation = request.results?.first as? VNFaceObservation {
            // Exactly one face was detected
            let size = CGSize(width: ratio, height: ratio)
            // Convert from Vision's normalized, bottom-left coordinates to UIKit coordinates
            let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
            let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
            let finalRect = faceObservation.boundingBox.applying(translate).applying(transform)
            // Offset of the face's center from the image's center, used as the crop anchor
            let center = CGPoint(x: (finalRect.origin.x + finalRect.width/2 - size.width/2), y: (finalRect.origin.y + finalRect.height/2 - size.height/2))
            // Crop the image around that center point
            let newImage = image.kf.resize(to: size, for: .aspectFill).kf.crop(to: size, anchorOn: center)
            DispatchQueue.main.async {
                // UIView work must happen on the main thread
                self.imageView.image = newImage
            }
        } else {
            print("Multiple faces or no face detected")
        }
    }
    // Build the detection request and run it on a background queue to avoid blocking the UI
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Throws: \(error)")
        }
    }
} else {
    print("Vision is not supported on this iOS version")
}
The logic is the same as marking the face position. The difference is that the avatar has a fixed size (e.g. 300x300), so we can skip the first step of making the image fit the ImageView.
The other difference is that we compute the center point of the face area and use it as the anchor for cropping the image, as in the worked example below.
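To make the anchor math concrete, here is a hypothetical worked example with made-up numbers; the 300x300 size and the face rect are assumptions for illustration, not values from the original project.
import UIKit

// Hypothetical numbers for illustration only
let cropSize = CGSize(width: 300, height: 300)              // image already resized to 300x300
let faceRect = CGRect(x: 150, y: 60, width: 90, height: 90) // face rect after coordinate conversion

// Offset of the face's center from the image's center
let anchor = CGPoint(x: faceRect.midX - cropSize.width / 2,   // 195 - 150 = 45
                     y: faceRect.midY - cropSize.height / 2)  // 105 - 150 = -45

// The face sits 45 pt to the right of and 45 pt above the image's center,
// so the crop window shifts accordingly when the anchor is passed to kf.crop(to:anchorOn:).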
The red dot is the center point of the face area.
Final effect image:
The split second before the transition shows the original image position.
Complete app example:
The code has been uploaded to GitHub: Click here
For any questions or suggestions, feel free to contact me.
===
This article was first published in Traditional Chinese on Medium ➡️ View Here