New Features of Create ML: Style Transfer

Trying out the new features from WWDC 2020

Style Transfer has been added to Create ML. It’s the first image-output model in Create ML.
I recommend that anyone who has worked on style transfer try this handy tool, because it offers some glorious conveniences!

Style Images.
Stylized Videos.

A fast model for video style transfer has been added.

The procedure is simple.

Prepare

・One style image
・One validation image to check the training progress
・Multiple content images

Start training with the Train button, and it completes in a few minutes.

Conveniences

・Number of iterations… The default is 500, which I think is enough for common cases.
Both style loss and content loss mostly settle before 100 iterations, but more iterations seem to give better results. Each GIF sample here used 1,200 iterations, and even so training took only about 10 minutes.
・If the strength is increased, the effect of the style image on colors and edges increases.
・When the density is increased, the patterns from the fine areas of the style image are applied; when it is decreased, the patterns from the broad areas are applied. (A rough sketch of how these options map to the CreateML framework follows this list.)
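For reference, the same training flow is also exposed programmatically through the CreateML framework on macOS. The sketch below is only my rough recollection of the MLStyleTransfer API from the WWDC 2020 session, so treat the exact parameter names and their order as assumptions and check the current documentation; the file paths and model name are placeholders.

import Combine
import CreateML
import Foundation

// Placeholder URLs: swap in your own style image, validation image, and content folder.
let styleImageURL = URL(fileURLWithPath: "/path/to/style.jpg")
let validationImageURL = URL(fileURLWithPath: "/path/to/validation.jpg")
let contentDirectoryURL = URL(fileURLWithPath: "/path/to/content-images/")

// One style image plus a directory of content images.
let dataSource = MLStyleTransfer.DataSource.images(
    styleImage: styleImageURL,
    contentDirectory: contentDirectoryURL,
    processingOption: nil)

// The GUI options map to these parameters (style density is called textelDensity here);
// .cnnLite is the small, fast video model and .cnn is the image model.
let parameters = MLStyleTransfer.ModelParameters(
    algorithm: .cnnLite,
    validation: .content(validationImageURL),
    maxIterations: 500,
    textelDensity: 256,
    styleStrength: 5)

// Training runs as an asynchronous job that publishes progress and the final model.
let job = try MLStyleTransfer.train(trainingData: dataSource, parameters: parameters)

// Export the trained model as an .mlmodel once the job finishes.
let cancellable = job.result.sink(
    receiveCompletion: { completion in print(completion) },
    receiveValue: { model in
        try? model.write(to: URL(fileURLWithPath: "/path/to/MyStyle.mlmodel"))
    })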

(Strength 5, density 5: default) / (strength 10, density 5) / (strength 5, density 10) *Video model

The first GIF samples use the default settings.

An intermediate image is displayed every 5 iterations during training.

You can save the model mid-training by pressing the Snapshot button.
A snapshot model can be previewed with any video or image, and can be exported as an .mlmodel.

You can download a set of 600 landscape images called the Natural Content Dataset, which is what I trained on.

The video model is 595 KB.
The image model is 6.5 MB.

Differences between the video model and the image model

The image model takes about 10 times longer to converge (roughly 100 minutes, versus a few minutes for the video model).
I couldn’t see a difference in the results.

Left: video model / Right: image model

・When I made a flip-book video with UIView, the video model was still smoother.
CPU usage while feeding every capture output through the model was 10–20% for the video model and under 10% for the image model.
Memory usage was 60 MB for the video model and 250 MB for the image model, about 4 times larger.
(Measured on an iPod touch.)

Impression

It’s easy to operate through the GUI and you can see the results immediately, so training was a stress-free experience no matter how many times I repeated it, adjusting options and swapping images. It’s fun to try different styles. I think this kind of trial and error is important when building apps around image-based models.

Use models in iOS projects

The model’s input and output are 512 × 512 color images.
It can be processed with Vision’s VNCoreMLRequest as usual.
You can feed AVFoundation capture output to the model as a CVPixelBuffer (via CMSampleBufferGetImageBuffer).

// Build the Vision request once, wrapping the Core ML style transfer model.
lazy var coreMLRequest: VNCoreMLRequest = {
    let model = try! VNCoreMLModel(for: windStyle().model)
    let request = VNCoreMLRequest(model: model, completionHandler: self.coreMLCompletionHandler0)
    return request
}()

// Called for every frame delivered by the AVFoundation capture output.
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
        return
    }
    currentBuffer = pixelBuffer

    // Run the style transfer request on the captured pixel buffer.
    let exifOrientation = exifOrientationFromDeviceOrientation()
    let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: currentBuffer!, orientation: exifOrientation, options: [:])
    do {
        try imageRequestHandler.perform([self.coreMLRequest])
    } catch {
        print(error)
    }
}

The result is returned as a pixel buffer (a VNPixelBufferObservation).
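For example, here is a minimal sketch of what the completion handler referenced above (coreMLCompletionHandler0) could look like; imageView is a hypothetical UIImageView used for preview.

func coreMLCompletionHandler0(request: VNRequest, error: Error?) {
    // A style transfer model returns its output image as a pixel buffer observation.
    guard let observation = request.results?.first as? VNPixelBufferObservation else {
        return
    }

    // Wrap the 512 x 512 output buffer in a CIImage / UIImage for display.
    let ciImage = CIImage(cvPixelBuffer: observation.pixelBuffer)
    let stylized = UIImage(ciImage: ciImage)

    // The request runs on the capture queue, so hop to the main thread to update UI.
    DispatchQueue.main.async {
        self.imageView.image = stylized   // imageView is a hypothetical preview UIImageView
    }
}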

To render the output as video at high speed, you need Metal or something similar.
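A common pattern is to render each stylized CIImage into an MTKView through a Metal-backed CIContext. Here is a rough, untested sketch of that idea (StylizedFrameView is a made-up class name):

import CoreImage
import MetalKit

// A Metal-backed view that draws each stylized frame without going through UIImage.
final class StylizedFrameView: MTKView {
    private lazy var commandQueue = device!.makeCommandQueue()!
    private lazy var ciContext = CIContext(mtlDevice: device!)

    // Set this from the Vision completion handler for every frame.
    var image: CIImage? {
        didSet { draw() }
    }

    override init(frame: CGRect, device: MTLDevice?) {
        super.init(frame: frame, device: device ?? MTLCreateSystemDefaultDevice())
        framebufferOnly = false        // allow Core Image to write into the drawable
        isPaused = true                // we drive drawing manually, once per frame
        enableSetNeedsDisplay = false
    }

    required init(coder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }

    override func draw(_ rect: CGRect) {
        guard let image = image,
              let drawable = currentDrawable,
              let commandBuffer = commandQueue.makeCommandBuffer() else { return }

        // Scale the 512 x 512 model output up to the drawable size.
        let scaleX = drawableSize.width / image.extent.width
        let scaleY = drawableSize.height / image.extent.height
        let scaled = image.transformed(by: CGAffineTransform(scaleX: scaleX, y: scaleY))

        ciContext.render(scaled,
                         to: drawable.texture,
                         commandBuffer: commandBuffer,
                         bounds: CGRect(origin: .zero, size: drawableSize),
                         colorSpace: CGColorSpaceCreateDeviceRGB())
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}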

I didn’t study Metal myself, so I made a flip-book video with UIView instead.
Even feeding every capture output through the model, CPU usage stayed around 10%, so I think the model processing is lightweight.

Please follow me on Twitter:
https://twitter.com/JackdeS11

I am making apps that use Core ML, Create ML, and Vision.
If you would like to use a machine learning model on an edge device (iOS), please contact me by email.
rockyshikoku@gmail.com

And clap your hands.

Ciao!
