High-precision segmentation model on mobile devices
U2Net is a machine learning model that separates prominent objects in images from the background.
There are various segmentation models that target specific kinds of objects, such as people, but U2Net has a wide range of uses because it segments whatever object is most prominent in the image.
It is used with high accuracy in a variety of apps.
The model was originally written in Python.
By converting it to Core ML, it can run on-device on iOS, using nothing but the iPhone's chip.
Moreover, it is quite fast, and its accuracy is the same as the original model's.
Converted model
It is available in CoreML Models on GitHub.
Google Colab notebook demo of the conversion code
Original project
Paper
Applications that use U2Net
Versions
There is a 176.3 MB full version and a 4.7 MB lightweight version.
The following shows the conversion procedure for the 176.3 MB version;
the places to change for the lightweight version are noted in comments.
Conversion procedure
Clone the original project.
git clone https://github.com/xuebinqin/U-2-Net.git
Install CoreML Tools.
pip install coremltools
Move to the project directory.
cd U-2-Net/
Import the required modules.
from model import U2NET
# For the lightweight version:
# from model import U2NETP
import coremltools as ct
from coremltools.proto import FeatureTypes_pb2 as ft
import torch
import os
from PIL import Image
from torchvision import transforms
Get the pre-trained weights. You can download them from the Google Drive links of the original project:
176.3 MB (u2net.pth)
4.7 MB (u2netp.pth, lightweight version)
a version for portraits
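The weight path used below points at Google Drive, so if you run the conversion in Google Colab (as in the notebook demo above), upload the weights to your Drive and mount it first. A minimal sketch, assuming a standard Colab environment:
from google.colab import drive
# Makes the uploaded weights visible under /content/drive/MyDrive/.
drive.mount('/content/drive')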
Initialize the U2Net model and load the pre-trained weights.
net = U2NET(3,1)
# Lightweight version:
# net = U2NETP(3,1)
device = torch.device('cpu')
# Path to the pre-trained weights (here, on Google Drive).
model_dir = os.path.join(os.getcwd(), '/content/drive/MyDrive', "u2net" + '.pth')
# Lightweight version: u2netp.pth
net.load_state_dict(torch.load(model_dir, map_location=device))
net.cpu()
net.eval()
Make a dummy input for tracing. An actual image tensor also works.
example_input = torch.rand(1,3,320,320)
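If you prefer to trace with an actual image rather than random noise, you can build a tensor of the same shape with PIL and torchvision (both imported above). The file name "sample.jpg" is only a placeholder for this sketch:
# Optional: use a real image as the tracing input.
img = Image.open("sample.jpg").convert("RGB")
preprocess = transforms.Compose([
    transforms.Resize((320, 320)),
    transforms.ToTensor(),  # scales pixels to [0, 1], shape (3, 320, 320)
])
example_input = preprocess(img).unsqueeze(0)  # shape (1, 3, 320, 320)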
Convert the model.
Trace the model with TorchScript and convert it, setting a per-channel bias and a scale on the image input so that Core ML applies the same normalization as the original preprocessing.
(I referred to an existing GitHub issue for this part of the conversion.)
traced_model = torch.jit.trace(net, example_input)
model = ct.convert(
    traced_model,
    inputs=[ct.ImageType(
        name="input",
        shape=example_input.shape,
        bias=[-0.485/0.229, -0.456/0.224, -0.406/0.225],
        scale=1.0/255.0/0.226)])
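For reference, here is where those numbers come from. Core ML computes out = scale * pixel + bias per channel, while the original U2Net preprocessing normalizes with the ImageNet mean and standard deviation; ImageType accepts only a single scale, so 0.226 is used as an approximation of the three per-channel stds. A quick check of the arithmetic:
# (pixel / 255 - mean) / std  ==  pixel * (1 / 255 / std) + (-mean / std)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
scale = 1.0 / 255.0 / 0.226        # single approximate scale
bias = [-m / s for m, s in zip(mean, std)]
print(scale)  # about 0.01736
print(bias)   # about [-2.118, -2.036, -1.804]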
Add model metadata.
model.short_description = "U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection"
model.license = "Apache 2.0"
model.author = "Qin, Xuebin and Zhang, Zichen and Huang, Chenyang and Dehghan, Masood and Zaiane, Osmar and Jagersand, Martin"
Add new activation layers to the end of the model and turn the outputs of the Core ML model into grayscale images.
(The linear activation with alpha = 255 rescales the 0-1 saliency output to 0-255 pixel values.)
# Add activation layers.
spec = model.get_spec()
spec_layers = getattr(spec, spec.WhichOneof("Type")).layers

output_layers = []
for layer in spec_layers:
    if layer.name[:2] == "25":
        print("name: %s input: %s output: %s" % (layer.name, layer.input, layer.output))
        output_layers.append(layer)

new_layers = []
layernum = 0
for layer in output_layers:
    new_layer = spec_layers.add()
    new_layer.name = 'out_p' + str(layernum)
    new_layers.append('out_p' + str(layernum))
    new_layer.activation.linear.alpha = 255
    new_layer.activation.linear.beta = 0
    new_layer.input.append('var_' + layer.name)
    new_layer.output.append('out_p' + str(layernum))
    output_description = next(x for x in spec.description.output if x.name == output_layers[layernum].output[0])
    output_description.name = new_layer.name
    layernum = layernum + 1

# Make the outputs grayscale images.
for output in spec.description.output:
    if output.name not in new_layers:
        continue
    if output.type.WhichOneof('Type') != 'multiArrayType':
        raise ValueError("%s is not a multiarray type" % output.name)
    output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('GRAYSCALE')
    output.type.imageType.width = 320
    output.type.imageType.height = 320

# Save the updated model.
updated_model = ct.models.MLModel(spec)
updated_model.save("u2net.mlmodel")
Up to this point, you have the U2Net model in Core ML format.
When you open the model in Xcode, you can check its inputs and outputs.
There are seven outputs, out_p0 through out_p6.
out_p1 can be used as the mask image.
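If you want to verify the converted model outside of Xcode, coremltools can run a prediction directly on macOS (prediction is not supported on Linux, so this will not run inside Colab). "sample.jpg" is again just a placeholder name for this sketch:
# Quick sanity check of the saved model (macOS only).
mlmodel = ct.models.MLModel("u2net.mlmodel")
img = Image.open("sample.jpg").convert("RGB").resize((320, 320))
result = mlmodel.predict({"input": img})
mask = result["out_p1"]  # PIL grayscale image, 320 x 320
mask.save("mask.png")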
Use the model in your Xcode project
There are two options for using it in a project.
1. Using the Core ML framework directly
2. Using the Vision framework
Vision is recommended because it conveniently resizes the input image to the model's expected size automatically.
In both cases, the model's input and output are 320 × 320 CVPixelBuffers.
🐣
I’m a freelance engineer.
Work requests
Please feel free to contact me with a brief description of the development you have in mind.
rockyshikoku@gmail.com
I make apps that use Core ML and ARKit.
I post information related to machine learning and AR.