
Exploring Vision — APP Avatar Upload Automatic Face Detection and Cropping (Swift)

Practical application of Vision

Without further ado, here’s a finished product image:

Before Optimization vs. After Optimization — [Wedding App](https://itunes.apple.com/tw/app/%E7%B5%90%E5%A9%9A%E5%90%A7-%E4%B8%8D%E6%89%BE%E6%9C%80%E8%B2%B4-%E5%8F%AA%E6%89%BE%E6%9C%80%E5%B0%8D/id1356057329?ls=1&mt=8){:target="_blank"}

Recently, with the release of iOS 12, I noticed the newly available CoreML machine learning framework; it looked interesting, so I started thinking about where it could be applied in our current product.

CoreML introductory article now published: Automatically Predict Article Categories Using Machine Learning, Even Training the Model Yourself

CoreML provides interfaces for training text and image machine learning models and running them in an app. My initial idea was to use CoreML for face recognition to solve the problem of heads or faces being cropped out in the app, as shown on the left in the image above: if a face appears near the edges, it is easily cut off by the scaling and cropping.

After some online research, I realized I was behind the times: this capability had already shipped in iOS 11 as the Vision framework, which supports text detection, face detection, image matching, QR code detection, object tracking, and more.
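For orientation, each of those capabilities corresponds to its own VNRequest subclass. A quick, non-exhaustive sampling for reference:

```swift
import Vision

// A few of the request types Vision offers (all available since iOS 11):
let faceRequest      = VNDetectFaceRectanglesRequest() // face detection (used in this article)
let landmarksRequest = VNDetectFaceLandmarksRequest()  // facial landmarks (eyes, nose, mouth)
let textRequest      = VNDetectTextRectanglesRequest() // text detection
let barcodeRequest   = VNDetectBarcodesRequest()       // QR code / barcode detection
```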

Here we use the face detection feature; the optimized result is shown on the right: it finds the face and crops the image centered on it.

Let’s Get Started:

First, let’s create a feature that can mark the face position to get a preliminary understanding of how to use Vision.

Demo APP

The completed image above shows the ability to mark the position of faces in a photo.

P.S. It can only mark "faces," not the entire head including the hair 😅

This code is divided into two main parts. The first part deals with the blank space left over when the original image is scaled to fit the ImageView; simply put, we want the ImageView to end up exactly the same size as the image. If you assign the image directly, things will be misaligned, as shown below.

You might think of simply changing the contentMode to fill, fit, or redraw, but that will distort or crop the image.

```swift
let ratio = UIScreen.main.bounds.size.width
// Because the UIImageView is pinned to the left and right edges with 0 margin and has a 1:1 aspect ratio

let sourceImage = UIImage(named: "Demo2")?.kf.resize(to: CGSize(width: ratio, height: CGFloat.leastNonzeroMagnitude), for: .aspectFill)
// Use Kingfisher's image resizing, fixing the width and letting the height follow

imageView.contentMode = .redraw
// Set contentMode to redraw so the image fills the view

imageView.image = sourceImage
// Assign the image

imageViewConstraints.constant = (ratio - (sourceImage?.size.height ?? 0))
imageView.layoutIfNeeded()
imageView.sizeToFit()
// Adjust the imageView's constraints; for details, see the complete example at the end of the article
```

That's it for the image handling.

The cropping part uses Kingfisher to help us, but you can replace it with other libraries or custom methods.
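If you'd rather not pull in Kingfisher just for the cropping, a minimal sketch of an equivalent center crop using only UIKit might look like the following; the helper name crop(image:to:centeredOn:) is mine for illustration (it is not part of the article's project), and it assumes the image has already been resized to fill the target size:

```swift
import UIKit

// Illustrative Kingfisher-free crop: draw the image so that `center`
// (a point in the image's own coordinate space) lands in the middle
// of a canvas of the requested `size`.
func crop(image: UIImage, to size: CGSize, centeredOn center: CGPoint) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: size)
    return renderer.image { _ in
        // Shift the drawing origin so the requested center ends up at size/2
        let origin = CGPoint(x: size.width / 2 - center.x,
                             y: size.height / 2 - center.y)
        image.draw(at: origin)
    }
}
```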

For the second part, let's go straight to the code:

```swift
// ciImage here is a CIImage created from the image shown above (see the complete example at the end of the article)
if #available(iOS 11.0, *) {
    // Supported only on iOS 11 and later
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if let faceObservations = request.results as? [VNFaceObservation] {
            // Faces were detected

            DispatchQueue.main.async {
                // UIView operations must happen on the main thread
                let size = self.imageView.frame.size

                faceObservations.forEach({ (faceObservation) in
                    // Coordinate-system conversion
                    let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
                    let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
                    let transRect = faceObservation.boundingBox.applying(translate).applying(transform)

                    let markerView = UIView(frame: transRect)
                    markerView.backgroundColor = UIColor.init(red: 0/255, green: 255/255, blue: 0/255, alpha: 0.3)
                    self.imageView.addSubview(markerView)
                })
            }
        } else {
            print("No faces detected")
        }
    }

    // Detection request
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        // Detection takes time, so run it on a background thread to avoid blocking the UI
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Throws: \(error)")
        }
    }

} else {
    // Vision is unavailable before iOS 11
    print("Not supported")
}
```

The main thing to note is the coordinate-system conversion: the detection results are given in the image's normalized coordinates (with the origin at the bottom-left), so we have to convert them into the coordinate space of the enclosing ImageView before we can use them.
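As a side note, Vision also provides a helper for scaling the normalized rectangle. If you prefer not to build the affine transforms by hand, something along these lines (reusing the faceObservation and size from the snippet above) should produce the same rectangle; the vertical flip still has to be done manually because Vision's origin is at the bottom-left:

```swift
// Scale the normalized boundingBox up to the view's size...
let imageRect = VNImageRectForNormalizedRect(faceObservation.boundingBox,
                                             Int(size.width), Int(size.height))
// ...then flip it vertically so the origin matches UIKit's top-left convention
let viewRect = CGRect(x: imageRect.origin.x,
                      y: size.height - imageRect.origin.y - imageRect.height,
                      width: imageRect.width,
                      height: imageRect.height)
```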

Next, on to today's main task: cropping the avatar in the right place based on the face position.

```swift
let ratio = UIScreen.main.bounds.size.width
// Because the UIImageView is pinned to the left and right edges with 0 margin and has a 1:1 aspect ratio; see the complete example at the end of the article

let sourceImage = UIImage(named: "Demo")

imageView.contentMode = .scaleAspectFill
// Use scaleAspectFill so the image fills the view

imageView.image = sourceImage
// Assign the original image directly; we will replace it later

if let image = sourceImage, #available(iOS 11.0, *), let ciImage = CIImage(image: image) {
    let completionHandle: VNRequestCompletionHandler = { request, error in
        if request.results?.count == 1, let faceObservation = request.results?.first as? VNFaceObservation {
            // Exactly one face
            let size = CGSize(width: ratio, height: ratio)

            let translate = CGAffineTransform.identity.scaledBy(x: size.width, y: size.height)
            let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -size.height)
            let finalRect = faceObservation.boundingBox.applying(translate).applying(transform)

            let center = CGPoint(x: (finalRect.origin.x + finalRect.width/2 - size.width/2), y: (finalRect.origin.y + finalRect.height/2 - size.height/2))
            // Compute the face midpoint, measured from the center of the crop canvas

            let newImage = image.kf.resize(to: size, for: .aspectFill).kf.crop(to: size, anchorOn: center)
            // Crop the image around that midpoint

            DispatchQueue.main.async {
                // UIView operations must happen on the main thread
                self.imageView.image = newImage
            }
        } else {
            print("Multiple faces or no face detected")
        }
    }
    let baseRequest = VNDetectFaceRectanglesRequest(completionHandler: completionHandle)
    let faceHandle = VNImageRequestHandler(ciImage: ciImage, options: [:])
    DispatchQueue.global().async {
        do {
            try faceHandle.perform([baseRequest])
        } catch {
            print("Throws: \(error)")
        }
    }
} else {
    print("Not supported")
}
```

The principle is the same as marking the face position; the difference is that the profile picture has a fixed size (e.g., 300x300), so we can skip the first part where the image had to be fitted to the ImageView.

Another difference is that we need to calculate the center point of the face area and use this center point as the basis for cropping the image.
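As a quick sanity check with made-up numbers: on a hypothetical 300x300 crop canvas, a face rect of (x: 180, y: 40, width: 90, height: 90) has its midpoint at (225, 85), so the anchor handed to the crop is that midpoint measured from the canvas center:

```swift
let size = CGSize(width: 300, height: 300)                       // hypothetical crop size
let finalRect = CGRect(x: 180, y: 40, width: 90, height: 90)     // hypothetical face rect
let faceMidpoint = CGPoint(x: finalRect.midX, y: finalRect.midY) // (225, 85)
let center = CGPoint(x: faceMidpoint.x - size.width / 2,         // 225 - 150 =  75
                     y: faceMidpoint.y - size.height / 2)        //  85 - 150 = -65
```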

The red dot is the center point of the face area

The final result:

The moment just before the pause is the original image position

Complete APP example:

The code has been uploaded to GitHub: Click here

If you have any questions or comments, feel free to contact me.

===

Chinese version of this article

===

This article was first published in Traditional Chinese on Medium ➡️ View Here


This post is licensed under CC BY 4.0 by the author.
