Text extraction from image using Optical character recognition

Abstract

Many of us have wanted to copy text out of a YouTube video, a PDF, or an image. This text extraction program does exactly that: you supply an image, and the program returns whatever text is present in it. Optical Character Recognition (OCR) makes this straightforward.

Algorithm Description

EasyOCR:

As the name suggests, EasyOCR is an easy way to read text from an image. OCR stands for Optical Character Recognition, a technique that reads text directly from an image. For example, YouTube offers no option to copy the text shown in a video. With OCR, you can take a screenshot, pass it to the reader, and retrieve the text. EasyOCR supports more than 50 languages, including Hindi and Russian.
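As a rough illustration of the idea above, here is a minimal sketch of reading text with EasyOCR. It is not the project's own code: the helper names keep_confident_text and extract_text, the 0.5 confidence threshold, and the CPU-only setting are assumptions made for this example. It relies on readtext returning a list of (bounding box, text, confidence) entries.

```python
from typing import List, Sequence, Tuple

# One entry from reader.readtext(): (bounding box, text, confidence).
Detection = Tuple[Sequence, str, float]


def keep_confident_text(detections: List[Detection], min_confidence: float = 0.5) -> str:
    """Join the recognized strings whose confidence meets the threshold.

    The 0.5 default is an arbitrary choice for this sketch.
    """
    return "\n".join(text for _box, text, conf in detections if conf >= min_confidence)


def extract_text(image_path: str, languages=("en",), min_confidence: float = 0.5) -> str:
    """Run EasyOCR on one image file and return the recognized text."""
    import easyocr  # imported here so the helper above works without EasyOCR installed

    # gpu=False is an assumption so the sketch also runs on machines without CUDA.
    reader = easyocr.Reader(list(languages), gpu=False)
    return keep_confident_text(reader.readtext(image_path), min_confidence)
```

Calling extract_text("screenshot.png") (a hypothetical file name) would return the detected lines as one string. The first call may take a while because EasyOCR downloads its model files.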

To read more about EasyOCR, you can refer to this link https://www.analyticsvidhya.com/blog/2021/06/text-detection-from-images-using-easyocr-hands-on-guide/

DOWNLOAD BASE PAPER

http://ceur-ws.org/Vol-2870/paper15.pdf

How to Execute?

Make sure you tick the "Add to PATH" checkboxes while installing Python and Anaconda.

Refer to this link, if you are just starting and want to know how to install anaconda.

If you already have Anaconda and want to see how to create an Anaconda environment, refer to this article on setting up Jupyter Notebook. You can skip the article if you already know how to install Anaconda, set up an environment, and install from requirements.txt.

Install the necessary libraries from the requirements.txt file provided.

Go to the directory where your requirements.txt file is present.

cd <directory of your file>. For example, if the file is on the D drive, first switch to that drive and then change directory:

d:
cd d:\License-Plate-Recognition-main #CHANGE PATH AS PER YOUR PROJECT, THIS IS JUST AN EXAMPLE

If your project is on the C drive, you can skip the first command and run only the cd command.

E.g. cd C:\Users\Hi\License-Plate-Recognition-main #CHANGE PATH AS PER YOUR PROJECT, THIS IS JUST AN EXAMPLE

Run the command pip install -r requirements.txt or conda install --file requirements.txt (requirements.txt is a text file listing all the libraries required to run this Python project. If an error occurs while installing the libraries, you might need to install them individually.)

All the necessary libraries will be downloaded. To run the code, open Anaconda Prompt, activate your virtual environment if you created one (or work from the base environment), start Jupyter Notebook, and open the folder where your code is present.

Open “easyocr.ipynb” to get the results.

Data Description

No dataset is used. Some sample images are provided for testing the accuracy of the OCR reader.

Final Results

You can also run the “main.py” file to get the results. A tkinter window opens in which you select the file you want to perform OCR on, and you then get the results.

Note: In the file selection dialog, choose the “All files” option so that image files are visible.
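The note above comes down to the file-type filter that the tkinter dialog is given. The sketch below shows one way such a dialog could be set up so that images are listed without switching filters. It is not taken from main.py: the name choose_image and the list of extensions are assumptions for illustration.

```python
# Hypothetical filter list; the extensions are an assumption, not copied from main.py.
IMAGE_FILETYPES = [
    ("Image files", "*.png *.jpg *.jpeg *.bmp"),
    ("All files", "*.*"),
]


def choose_image() -> str:
    """Open a file dialog and return the chosen path (empty if cancelled)."""
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window, show only the dialog
    try:
        return filedialog.askopenfilename(
            title="Select an image", filetypes=IMAGE_FILETYPES
        )
    finally:
        root.destroy()
```

With a filter list like this, the dialog offers image files first and still keeps an "All files" entry as a fallback.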

Issues you may face while executing the code

Change to your project directory (the path of your project) before running main.py.
Ensure you have all the libraries installed.
EasyOCR can sometimes run into version mismatches with other libraries, and most errors you see will come from such a mismatch. Search the web for the error message and install compatible versions accordingly.
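To narrow down a missing library or a version mismatch, it can help to list what is actually installed. The following is a small sketch using only the Python standard library (Python 3.8 or later). The function names are made up for this example, and the parsing is deliberately simple, so it assumes plain "name==version" style lines in requirements.txt.

```python
import re
from importlib import metadata
from typing import Dict, Iterable, List, Optional


def requirement_names(lines: Iterable[str]) -> List[str]:
    """Extract package names from requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

For example, reading requirements.txt with open("requirements.txt") and passing its lines through both functions gives a dictionary in which any library with the value None is not installed in the current environment.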

Note:

All the required data has been provided over here.

Click Here For The Source Code And Associated Files.

TechieYan Technologies
