Reinventing the wheel by disassembling and combining libraries: Document Scanner

Create a function to detect placed documents and straighten them

I want to automatically detect placed documents and correct their angle

If you photograph a document lying on a desk, it often comes out slanted.
It would be nice to have a function that fixes this automatically.

On iOS, this can already be done with a framework called VisionKit.

For that matter, the iPhone’s default camera also comes with a document scanner with OCR.

Reinventing it by combining existing pieces

By combining an image transformation library with Vision’s rectangle detection, it seems possible to reinvent something like a document scanner.

Find the core of the library

The core of the image transformation library above is a function that receives four points in the image and returns a straightened rectangular image.
Reading the code, you can see that the image is shaped from the four points with a Core Image filter called CIPerspectiveCorrection.
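
To make the idea concrete, here is a minimal sketch of such a four-point function. It is not the library's actual code: the name perspectiveCorrected and its parameter labels are my own assumptions. Only the filter name CIPerspectiveCorrection and its input keys come from Core Image. The points are expected in Core Image coordinates (pixels, origin at the bottom left).

```swift
import CoreImage
import CoreGraphics

/// Hypothetical helper (not the library's API): straightens the quadrilateral
/// defined by four corner points into a rectangular image.
/// Points are in Core Image coordinates: pixels, origin at the bottom left.
func perspectiveCorrected(_ image: CIImage,
                          topLeft: CGPoint,
                          topRight: CGPoint,
                          bottomLeft: CGPoint,
                          bottomRight: CGPoint) -> CIImage? {
    guard let filter = CIFilter(name: "CIPerspectiveCorrection") else { return nil }
    filter.setValue(image, forKey: kCIInputImageKey)
    // Each corner is passed to the filter as a two-component CIVector.
    filter.setValue(CIVector(cgPoint: topLeft), forKey: "inputTopLeft")
    filter.setValue(CIVector(cgPoint: topRight), forKey: "inputTopRight")
    filter.setValue(CIVector(cgPoint: bottomLeft), forKey: "inputBottomLeft")
    filter.setValue(CIVector(cgPoint: bottomRight), forKey: "inputBottomRight")
    // The output is cropped to the quadrilateral and warped to a rectangle.
    return filter.outputImage
}
```

The result is still a lazy CIImage, so to display or save it you would render it with a CIContext, for example via createCGImage(_:from:).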

If you pass the corner coordinates of the quadrangle detected by Vision to this function, it seems the document can be detected and cut out automatically.
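
The detection half might look roughly like the sketch below. It is an illustration under assumptions, not the code from my repository: DetectedQuad, pixelPoint, and detectDocumentQuad are hypothetical names, and the confidence threshold of 0.6 is an arbitrary value. Vision returns corners in normalized coordinates (0 to 1, origin at the bottom left), so they have to be scaled to the image extent before Core Image can use them.

```swift
import Vision
import CoreImage
import CoreGraphics

/// Hypothetical container for the four detected corners, in pixel coordinates.
struct DetectedQuad {
    var topLeft: CGPoint
    var topRight: CGPoint
    var bottomLeft: CGPoint
    var bottomRight: CGPoint
}

/// Converts a normalized Vision point (0...1, origin bottom left) into
/// pixel coordinates inside the given image extent.
func pixelPoint(_ normalized: CGPoint, in extent: CGRect) -> CGPoint {
    CGPoint(x: extent.origin.x + normalized.x * extent.width,
            y: extent.origin.y + normalized.y * extent.height)
}

/// Hypothetical helper: runs Vision's rectangle detection and returns the
/// corners of the first rectangle found, or nil if none was detected.
/// Assumes the image is already oriented upright.
func detectDocumentQuad(in image: CIImage) throws -> DetectedQuad? {
    let request = VNDetectRectanglesRequest()
    request.maximumObservations = 1
    request.minimumConfidence = 0.6  // assumed threshold, tune for your images

    let handler = VNImageRequestHandler(ciImage: image, options: [:])
    try handler.perform([request])

    guard let rect = request.results?.first as? VNRectangleObservation else {
        return nil
    }
    let extent = image.extent
    return DetectedQuad(topLeft: pixelPoint(rect.topLeft, in: extent),
                        topRight: pixelPoint(rect.topRight, in: extent),
                        bottomLeft: pixelPoint(rect.bottomLeft, in: extent),
                        bottomRight: pixelPoint(rect.bottomRight, in: extent))
}
```

Because Vision's normalized coordinates and Core Image both put the origin at the bottom left, no vertical flip is needed here. The four returned points can be handed to CIPerspectiveCorrection as its inputTopLeft, inputTopRight, inputBottomLeft, and inputBottomRight values.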

I was able to do it.

If you look at the code in the library, you will find something

You may discover something by reading the code inside a library, or by combining it with other pieces.

I uploaded the code to GitHub.

I also put it in a collection of image filters called Semantic Image.


I’m a freelance engineer.
Work inquiries
Please feel free to contact me with a brief description of what you want to develop.

I am making an app that uses Core ML and ARKit.
I share machine learning and AR related information.




