How to automatically deskew (straighten) a text image using OpenCV

Today I would like to share a simple solution to the image deskewing problem (straightening a rotated image). If you are working on anything that involves extracting text from images, you will have to deal with deskewing in one form or another. From camera pictures to scanned documents, deskewing is a mandatory pre-processing step before feeding the cleaned-up image to an OCR tool.

As I was learning and experimenting with image processing in OpenCV myself, I found that the majority of tutorials just give you a copy-pasted code solution, with barely any explanation of the logic behind it. That’s just not right. We need to understand the algorithms and how various image transformations can be combined to solve a given problem; otherwise we won’t make any progress as software engineers. So in this tutorial I will keep the code snippets to a bare minimum and concentrate on explaining the ideas that make them work. But don’t worry, you can always find the complete code in my GitHub repo via the link at the end of this article.

Deskewing algorithm

Let’s start by discussing the general idea of the deskewing algorithm. Our main goal will be to split the rotated image into text blocks and determine the angle from them. Here is a detailed break-down of the approach I’ll use:

  1. As usual, convert the image to grayscale.
  2. Apply a slight blur to decrease noise in the image.
  3. Now our goal is to find areas with text, i.e. the text blocks of the image. To make text block detection easier we will invert and maximize the colors of our image via thresholding, so that the text becomes exactly white (255, 255, 255) and the background exactly black (0, 0, 0).
  4. To find text blocks we need to merge all the printed characters of each block. We achieve this via dilation (expansion of white pixels), using a larger kernel on the X axis to get rid of all the spaces between words, and a smaller kernel on the Y axis to blend the lines of one block together while keeping the larger spaces between text blocks intact.
  5. Now simple contour detection, with a min-area rectangle enclosing each contour, gives us all the text blocks we need.
  6. There are various approaches to determining the skew angle, but we’ll stick to the simple one: take the largest text block and use its angle.

And now, switching to Python code:

import cv2

# Calculate skew angle of an image
def getSkewAngle(cvImage) -> float:
    # Prep image: copy, convert to gray scale, blur, and threshold
    newImage = cvImage.copy()
    gray = cv2.cvtColor(newImage, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (9, 9), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Apply dilate to merge text into meaningful lines/paragraphs.
    # Use a larger kernel on the X axis to merge characters into a single line,
    # cancelling out any spaces, but a smaller kernel on the Y axis to keep
    # separate blocks of text apart.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (30, 5))
    dilate = cv2.dilate(thresh, kernel, iterations=5)

    # Find all contours, largest first
    contours, hierarchy = cv2.findContours(dilate, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    contours = sorted(contours, key=cv2.contourArea, reverse=True)

    # Surround the largest contour in a min area box
    largestContour = contours[0]
    minAreaRect = cv2.minAreaRect(largestContour)

    # Determine the angle and convert it to the value that was originally used
    # to obtain the skewed image. Note: minAreaRect returns angles in [-90, 0)
    # in OpenCV versions before 4.5; newer versions use (0, 90] and would need
    # a different conversion here.
    angle = minAreaRect[-1]
    if angle < -45:
        angle = 90 + angle
    return -1.0 * angle

After the skew angle is obtained, we just need to rotate the image back:

# Rotate the image around its center
def rotateImage(cvImage, angle: float):
    newImage = cvImage.copy()
    (h, w) = newImage.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    newImage = cv2.warpAffine(newImage, M, (w, h), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)
    return newImage

# Deskew image
def deskew(cvImage):
    angle = getSkewAngle(cvImage)
    return rotateImage(cvImage, -1.0 * angle)
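
To put the pieces together, here is a minimal usage sketch (the file names are placeholders for illustration, not taken from the repo):

# Minimal usage sketch: file names are hypothetical
image = cv2.imread('skewed_document.jpg')
deskewed = deskew(image)
cv2.imwrite('deskewed_document.jpg', deskewed)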

Visualizing the steps

  • Blur and threshold applied to the image
  • Dilation and contour detection of text blocks
  • Largest text block determined, and wrapped in a min-area rectangle
  • Original, skewed image (on the left) compared to the deskewed result (on the right)

Side note on angle calculation

Your case may require a more advanced calculation than just taking the largest block, and there are a few alternative strategies you can start experimenting with.

1 — You can use the average angle of all text blocks:

allContourAngles = [cv2.minAreaRect(c)[-1] for c in contours]
angle = sum(allContourAngles) / len(allContourAngles)
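
Note that raw minAreaRect angles flip convention around -45 degrees, so it is safer to normalize each angle the same way getSkewAngle does before averaging. A minimal sketch with a hypothetical helper:

# Hypothetical helper, mirroring the -45 degree conversion in getSkewAngle
def normalizeAngle(a: float) -> float:
    return 90 + a if a < -45 else a

allContourAngles = [normalizeAngle(cv2.minAreaRect(c)[-1]) for c in contours]
angle = sum(allContourAngles) / len(allContourAngles)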

2 — You can take the angle of the middle block:

middleContour = contours[len(contours) // 2]
angle = cv2.minAreaRect(middleContour)[-1]

3 — You can try the average angle of the largest, smallest and middle blocks:

largestContour = contours[0]
middleContour = contours[len(contours) // 2]
smallestContour = contours[-1]
angle = sum([cv2.minAreaRect(largestContour)[-1],
             cv2.minAreaRect(middleContour)[-1],
             cv2.minAreaRect(smallestContour)[-1]]) / 3

Those are just a few of the alternatives that come to mind. Keep experimenting and find what works best for your case!

Testing

To test this approach I used a newly generated PDF file with Lorem Ipsum text in it. The first page of this document was rendered at 300 DPI (the most common setting when working with PDF documents). A testing dataset of 20 sample images was then generated by taking the original image and randomly rotating it in the range from -10 to +10 degrees, saving each image together with its skew angle. You can find all the code used to generate these sample images in my GitHub repo; I won’t go over it in detail here.
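
For reference, here is a minimal sketch of such a test harness, reusing rotateImage and getSkewAngle from above (the file name and exact loop are illustrative; the actual generator lives in the repo):

import random

original = cv2.imread('lorem_ipsum_300dpi.png')  # hypothetical rendered page
errors = []
for i in range(20):
    angle = random.uniform(-10, 10)
    rotated = rotateImage(original, angle)
    calculated = getSkewAngle(rotated)
    # Relative error in percent (assumes angle is not exactly zero)
    difference = abs((calculated - angle) / angle) * 100
    errors.append(difference)
    print(f'Item #{i}, with angle={round(angle, 2)}, calculated={round(calculated, 2)}, difference={round(difference, 2)}%')
print(f'Min Error: {round(min(errors), 2)}%')
print(f'Max Error: {round(max(errors), 2)}%')
print(f'Avg Error: {round(sum(errors) / len(errors), 2)}%')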

Sample statistics from the test run:

Item #0, with angle=1.77, calculated=1.77, difference=0.0%
Item #1, with angle=-1.2, calculated=-1.19, difference=0.83%
Item #2, with angle=8.92, calculated=8.92, difference=0.0%
Item #3, with angle=8.68, calculated=8.68, difference=0.0%
Item #4, with angle=4.83, calculated=4.82, difference=0.21%
Item #5, with angle=4.41, calculated=4.4, difference=0.23%
Item #6, with angle=-5.93, calculated=-5.91, difference=0.34%
Item #7, with angle=-3.32, calculated=-3.33, difference=0.3%
Item #8, with angle=6.53, calculated=6.54, difference=0.15%
Item #9, with angle=-2.66, calculated=-2.65, difference=0.38%
Item #10, with angle=-2.2, calculated=-2.19, difference=0.45%
Item #11, with angle=-1.42, calculated=-1.4, difference=1.41%
Item #12, with angle=-6.77, calculated=-6.77, difference=0.0%
Item #13, with angle=-9.26, calculated=-9.25, difference=0.11%
Item #14, with angle=4.36, calculated=4.35, difference=0.23%
Item #15, with angle=5.49, calculated=5.48, difference=0.18%
Item #16, with angle=-4.54, calculated=-4.55, difference=0.22%
Item #17, with angle=-2.54, calculated=-2.54, difference=0.0%
Item #18, with angle=4.65, calculated=4.66, difference=0.22%
Item #19, with angle=-4.33, calculated=-4.32, difference=0.23%
Min Error: 0.0%
Max Error: 1.41%
Avg Error: 0.27%

As you can see, this approach works quite well, resulting in only minor deviations from the real skew angle. Errors of this magnitude are not noticeable to the human eye or to OCR engines.

Test case 1
Test case 2

That’s it for today! You can apply the solution described here to most deskewing cases, especially ones dealing with scanned document processing. But again, every problem is unique, so take this as a starting point and improve upon these basic ideas.

Thank you all for reading this tutorial, I hope you found something useful in it. Good luck out there!

GitHub repo with source code:

JPLeoRX/opencv-text-deskew

Feature Engineering for Numerical Data

Data feeds machine learning models, and the more the better, right? Well, sometimes numerical data isn’t quite right for ingestion, so a variety of methods, detailed in this article, are available to transform raw numbers into something a bit more palatable.

Originally from KDnuggets https://ift.tt/2RhQYiU

An Introduction to NLP and 5 Tips for Raising Your Game

This article is a collection of things the author would like to have known when they started out in NLP. Perhaps it will be useful for you.

Originally from KDnuggets https://ift.tt/2GH8Y4b

YOLO: Object Detection in Images and Videos

In another post, we explained how to apply object detection in TensorFlow. In this post, we will provide some examples of applying object detection to images and videos using the YOLO algorithm. For our examples, we will use the ImageAI Python library, which lets us apply object detection with just a few lines of code.

Object Detection in Images

Below we present the code for object detection in images.

from imageai.Detection import ObjectDetection
import os

execution_path = os.getcwd()

detector = ObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath(os.path.join(execution_path, "yolo.h5"))
detector.loadModel()

detections = detector.detectObjectsFromImage(
    input_image=os.path.join(execution_path, "cycling001.jpg"),
    output_image_path=os.path.join(execution_path, "new_cycling001.jpg"),
    minimum_percentage_probability=30)

# Print each detection followed by a separator (matching the output below)
for eachObject in detections:
    print(eachObject["name"], " : ", eachObject["percentage_probability"], " : ", eachObject["box_points"])
    print("--------------------------------")

And we get:

car  :  99.66793060302734  :  [395, 248, 701, 405]
--------------------------------
bicycle : 66.10226035118103 : [81, 270, 128, 324]
--------------------------------
bicycle : 99.86441731452942 : [242, 351, 481, 570]
--------------------------------
person : 99.92108345031738 : [269, 186, 424, 540]
--------------------------------
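
The detections are plain dictionaries, so you can filter them directly. Here is a small sketch, using the same fields printed above, that keeps only high-confidence hits:

# Keep only detections above 90% confidence
confident = [d for d in detections if d["percentage_probability"] > 90]
for d in confident:
    print(d["name"], d["box_points"])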

We also show the original and the detected image.

Notice that it was able to detect the bicycle behind. A-M-A-Z-I-N-G!!!

Let’s look at another example of an original and a detected image.

Notice that it detected the bed, the laptop and the two people!

Object Detection in Videos

Assume that you have a video on your PC called “Traffic.mp4”. By running this code, you will be able to get the detected objects:

from imageai.Detection import VideoObjectDetection
import os

execution_path = os.getcwd()

detector = VideoObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath(os.path.join(execution_path, "yolo.h5"))
detector.loadModel()

video_path = detector.detectObjectsFromVideo(
    input_file_path=os.path.join(execution_path, "Traffic.mp4"),
    output_file_path=os.path.join(execution_path, "New_Traffic"),
    frames_per_second=20, log_progress=True)
print(video_path)

And the detected video is here:

Let’s look at another video example:

Object Detection using your Camera

The following example shows how we can use our USB camera for object detection:

from imageai.Detection import VideoObjectDetection
import os
import cv2

execution_path = os.getcwd()
camera = cv2.VideoCapture(0)

detector = VideoObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath(os.path.join(execution_path, "yolo.h5"))
detector.loadModel()

video_path = detector.detectObjectsFromVideo(
    camera_input=camera,
    output_file_path=os.path.join(execution_path, "camera_detected_video"),
    frames_per_second=20, log_progress=True, minimum_percentage_probability=30)
print(video_path)

Below is just a snapshot from a video recorded in my office while I was coding. As you can see, it was able to detect the books in the library behind me!

Image Stream Processing in Flutter application by TFLite Neural Networks

Camera image stream processing problem

Camera is a nice plugin for accessing hardware cameras, taking pictures and saving them to memory. Camera streaming, however, is very heavy and only reasonably efficient at medium quality levels. That might be enough in some cases, but when resolution matters, streams can lag and slow down until the app dies. The reason is very simple: streams push huge amounts of data through the main thread with every frame. How do we move this work to a separate thread?

  • You can register separate isolates, register the plugins in those isolates, and somehow handle the memory yourself. I find this tricky.
  • Another solution is to write your own plugin that accesses the camera, handles the camera streams on the Android and iOS threads, and pushes the results back up. This is also not the easiest solution.

So I decided to work on a third solution, one that is easier and less time-consuming.

Possible solution, or my own ‘wheel’

Depending on the task and requirements, there is a solution which might cause some ‘freezing’ issues but whose overall result generally satisfies the requirements. When resolution matters and the image update frequency is less important (for example, three to five frames per second is acceptable), you can pick frames from the camera, process them, show the results, and repeat all these steps in a cycle.

Some common camera configurations

To make this app work with a camera, the camera first needs to be configured. A camera controller is configured as shown in the camera plugin samples, or however your requirements dictate.

Image capturing is also simple: provide an image path, then take a picture (frame) and save it. The BLoC events that trigger this, along with the main frame-picking logic, can be found in the sample application.

I have also added a kind of cache that keeps the 10 latest frames for processing, with cleanup performed in a separate isolate. You can find it in the sample application as well.

Flutter widgets, building prerequisites

To work properly with the camera and memory, some more configuration is required. I used the permission handler plugin, wrapped in a Permissions BLoC, to handle all of this. It is also connected to a Lifecycle BLoC, to stop processing when the app is in the background (collapsed), and to a Cameras BLoC, for requesting the list of available cameras and updating the UI depending on the data. The main detector component builds on these BLoCs.

More detailed functionality can be found in the BLoC files.

The Detector widget is wrapped in a MultiBlocProvider with two main BLoCs: Detector and Cropper. They are quite local and should not be used across the application outside this particular part.

More business logic is here

For this application I used a BLoC architecture. While I am a fan of the Redux pattern’s approach to splitting UI, state storage, state changes and business logic, BLoC is still event-driven and, being based on streams, quite simple to understand. Some plugins implementing BLoC provide simpler solutions, wrapping all the stream handling inside, so I settled on flutter_bloc. Thanks to Felix Angelov, whose talk at Flutter Europe inspired me.

The general flow of image detection is implemented in the Detector’s BLoC code.

A detailed explanation of each method follows.

The Cropper BLoC is responsible for image cropping.

For better understanding, the next section contains code examples of all the internally used methods from the utils.dart class.

Object detection on ‘image stream’

My task was to run object detection on a camera stream, draw frames around the detected objects on the screen, crop those objects (if needed), and send them to another component for post-processing.

For this task I checked a few solutions. Firebase ML does not, for now, allow using custom-trained models locally. Yes, there is a great article with Firestore, which I have not checked yet.

Model loading

For my first implementation I went with an easier solution based on the existing tflite plugin. The pre-trained models for this article’s application were also taken from its sample application. The plugin allows loading custom-trained neural networks right from the device. And it worked.

Detection logic

Detection is quite straightforward: you load the image path into TFLite and, as a result, get the detected objects as a list of coordinate boxes with recognized object labels.

This method is also wrapped in the Exif image rotation function from the plugin, to get a proper image angle; as I found out (after some painful debugging), on most devices images are saved with a rotation of 90 degrees.

Object cropping

A simple method to copy bytes from the original image matrix to the resulting image matrix. Pixel by pixel. One by one.

The _copyCropp method performs the image data copying from the original source to the cropped destination.

Some pitfalls

While working on the plugin, I found out that cropped images were always the same size. I had configured the camera controller to get the maximum image size, but it stayed constant. After some research I found an issue in the camera plugin, which actually returns the high-quality size as an upper bound. Surprise )

To get a proper image size, for later drawing of the detection boxes on top of the camera stream, I had to work around this (see the sample application).

All these pieces are used one by one, cycled and connected to the application lifecycle.

Results

Everything described above leads to the following results:

It works with some lag, but the image stream itself is even worse.

The cropped images look like the ones below. The quality is good enough for some purposes, even if the objects are not perfectly detected/cropped.

Original cropped images

Overall scheme

Please do not be scared by the scheme below. It is quite complex, but I tried to show as much as possible to give you an understanding of the full flow of events.

Hope it is clear

Code sample

All sample application code is provided and could be found here:

VadPinchuk/flutter_detector

I highly recommend checking that all plugin requirements are implemented, to be on the safe side. It may not be configured for iOS, but it will work there, as my original project showed.

Summary

As you can see from the gif (video), or from the application you can build from the provided code sample, this solution is not perfect. It still requires some optimisation and simplification. Some improvements could also be made by moving separate pieces of logic into isolates. In general, however, this solution works much better than the camera stream itself, and for now it can be applied for some purposes.

I hope you liked this article and that it proves useful for you.

Thank you for your time.

6 Common Mistakes in Data Science and How To Avoid Them

As a novice or seasoned Data Scientist, your work depends on the data, which is rarely perfect. Properly handling the typical issues with data quality and completeness is crucial, and we review how to avoid six of these common scenarios.

Originally from KDnuggets https://ift.tt/2FllZji
