
OCR API Tutorial (OCR Text Extractor)

February 1, 2020 By Shyam Purkayastha

We have already witnessed the power of APIs to harness the capabilities of machine learning for face detection. Now it’s time for yet another interesting application of machine learning. In this post, we are going to cover the OCR Text Extractor API which can recognize and extract characters and words present in a printed document.

The OCR Text Extractor API is integrated with Rakuten RapidAPI. You can open the API console and follow along with this tutorial to explore the features and give it a try.


If you don’t have an account on Rakuten RapidAPI, sign up now and get your universal API key to access the OCR Text Extractor API and thousands of other APIs hosted on Rakuten RapidAPI.


Table of Contents

  • 1 Use Cases of OCR (Optical Character Recognition)
    • 1.1 Automation
    • 1.2 Data Archiving
    • 1.3 Security
  • 2 API Overview
    • 2.1 API Endpoints
    • 2.2 Pricing
  • 3 Time for Some OCR Magic
  • 4 Power Up Your OCR Use Cases

Use Cases of OCR (Optical Character Recognition)

OCR finds many applications where data records have to be converted from physical form to digital storage. Almost all businesses across the world rely on some form of pen-and-paper-based data entry for recording their processes. OCR technology spares them the manual effort of transforming paper records into digital ledgers.

Here are some broader use cases of OCR that can be applied to any business.

Automation


Organizations are increasingly looking at automating all aspects of their business processes. The biggest hurdle in automating a process is figuring out how to perform an activity without human intervention. OCR is one way to mimic the manual data entry (text extraction) operation performed by a human operator. It works much faster than a typical human user and, with enough constraints in place, it can avoid the data entry errors that humans are prone to making.

Data Archiving


Data archiving is closely related to automation as there is a recurring need to digitize a huge pile of paper invoices, receipts and other documents for future reference. This is also a tedious chore for human operators. It consumes time and money in equal proportions and hence there is a lot of room for OCR API based automation to expedite the archival process.

Security


OCR is also used for enforcement in specific applications such as license plate recognition. OCR processing of images from security cameras recognizes the license plate numbers of vehicles, which helps security teams track down unknown vehicles entering the premises.

API Overview

The OCR Text Extractor API can recognize characters from images of printed documents such as invoices and receipts. Although it can also recognize handwritten text, it is best suited for structured documents.

Take a look at the API console.

OCR API Console

API Endpoints

The endpoints supported by this API are categorized into “Reference Information” & “OCR and Text Extraction”.

GET List Ocr Engine Options

The “GET List Ocr Engine Options” endpoint is for informational purposes. It returns a few different versions of the OCR engine supported by this API. These versions primarily differ in terms of speed and accuracy of the result.

GET List Language Options

The “GET List Language Options” endpoint returns a list of languages supported by this API. The OCR Text Extractor API claims to support over 20 different languages. Each supported language is represented by its three-letter ISO 639-2 code.
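Since the languages are identified by three-letter codes, it can help to check the shape of a code before sending a request. The following is a minimal sketch; the helper name `normalize_language_code` is hypothetical, and it only checks the format of the code, not whether this particular API supports the language.

```python
import re

# ISO 639-2 codes are three lowercase ASCII letters, e.g. "eng", "jpn", "fra".
_ISO_639_2_SHAPE = re.compile(r"[a-z]{3}")


def normalize_language_code(code: str) -> str:
    """Return the code stripped and lowercased, or raise ValueError if malformed.

    This is a format check only. The authoritative list of supported
    languages comes from the "GET List Language Options" endpoint.
    """
    candidate = code.strip().lower()
    if not _ISO_639_2_SHAPE.fullmatch(candidate):
        raise ValueError(f"not a three-letter language code: {code!r}")
    return candidate
```

A code that passes this check should still be compared against the list returned by the endpoint.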

POST Extract Text From Image URI

The “POST Extract Text From Image URI” endpoint performs OCR extraction on an image pointed to by a URI. Currently it supports the JPG, PNG and GIF image formats and file sizes up to 5 MB.
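The format and size limits can be checked on the client side before spending an API call. Here is a rough sketch of such a pre-flight check. The function names are hypothetical, treating ".jpeg" as equivalent to JPG is an assumption, and so is reading "5 MB" as 5 × 1024 × 1024 bytes.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: "5 MB" means binary megabytes; adjust if the API counts differently.
MAX_IMAGE_BYTES = 5 * 1024 * 1024
# Assumption: ".jpeg" is accepted alongside ".jpg".
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def has_supported_extension(uri: str) -> bool:
    """Check the file extension in the URI path, ignoring any query string."""
    path = urlparse(uri).path
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS


def is_acceptable_image(uri: str, size_in_bytes: int) -> bool:
    """Return True if the image looks like it fits the documented limits."""
    return has_supported_extension(uri) and 0 < size_in_bytes <= MAX_IMAGE_BYTES
```

The extension is only a hint about the real file type, so the API may still reject an image that passes this check.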

POST Extract Text From Image File

The “POST Extract Text From Image File” endpoint performs OCR extraction on an image submitted as base64-encoded data. All other constraints are the same as for the “POST Extract Text From Image URI” endpoint.
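Base64 encoding of a local image can be done with the Python standard library. The sketch below shows one way to do it. The function names are hypothetical, and the name of the request field that carries the encoded string is not shown here because it has to be taken from the API console.

```python
import base64
from pathlib import Path


def encode_image_bytes(raw: bytes) -> str:
    """Encode raw image bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")


def encode_image_file(path: str) -> str:
    """Read an image file from disk and return its base64 representation."""
    return encode_image_bytes(Path(path).read_bytes())
```

Keep in mind that base64 output is roughly one third larger than the input, so it is safer to apply the size limit to the original file.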


Pricing

The OCR Text Extractor API is available under four subscription tiers.

OCR Text Extractor API Pricing

You can choose the BASIC option, which gives you 50 free API calls per month.

Subscribe to the API now, and you are all set to explore the OCR Text Extractor API in the next section.

Time for Some OCR Magic

To witness this OCR magic, we need an image of a printed bill.

Text Extraction Sample Image

This image is available on the Internet.

Imagine that you are the owner of this EYE OF THAI-GER restaurant. You have a traditional billing printer that generates hundreds of these bills every week, and you need a way to feed them into your cloud-based accounting system. Sounds like a tedious job!

With the OCR Text Extractor API you can extract the clustered characters representing the items and their prices. Let’s see how the API interprets this bill.

Select the “POST Extract Text From Image URI” endpoint in the API console and feed in the JSON input as shown below.

Text Extractor JSON Input

The JSON body contains all the input parameters for the API. We keep every parameter at its default value except “Uri”, which we replace with the URI of the bill image from the Internet, and then trigger the API.
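Outside the console, the same call can be prepared from a script. The following is only a sketch under several assumptions: the host and path are placeholders, not the real endpoint, and the body carries just the “Uri” field, which presumes the API applies its defaults to omitted parameters. The header names follow the usual RapidAPI convention. Copy the actual URL, host and full default body from the API console.

```python
import json
import urllib.request

# Hypothetical placeholders: take the real host and path from the API console.
API_HOST = "ocr-text-extractor.example.com"
API_URL = f"https://{API_HOST}/extract-from-uri"


def build_ocr_request(image_uri: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for the image URI endpoint."""
    # Assumption: omitted parameters fall back to their defaults on the server.
    body = {"Uri": image_uri}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-RapidAPI-Key": api_key,
            "X-RapidAPI-Host": API_HOST,
        },
        method="POST",
    )
```

Once the placeholders are replaced, the prepared request can be sent with `urllib.request.urlopen(request)` and the response body decoded with `json.loads`.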


OCR Endpoint Setup

If you examine the value of “parseText” in the API response, you will notice that the data is parsed column by column and stored in a single long sequence. In this way you can identify the items in the DESC column and their prices in the AMT column on the bill.

Text Extractor Parse Text

However, a closer look at the API response will reveal an anomaly.

OCR Anomaly

The amount “$13.00” got caught in the middle of the sequence that contains the DESC column values. This happened because of the close proximity between the DESC text and the adjacent AMT column on the printed bill.

The API did a decent job of extracting the items and their prices printed on the bill, except for that one glitch. This problem can be solved by keeping a clear separation between the columns of printed text. That’s why effective use of OCR requires a well-defined structure in the printed document. Otherwise, the parsing and interpretation of the data goes haywire.
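One way to cope with a price that lands inside the description sequence is to pull out everything that looks like a dollar amount, wherever it appears. The sketch below assumes that “parseText” is a plain string of whitespace-separated tokens and that prices are printed with a leading “$”. The function name and the sample text are made up for illustration and are not the actual API output.

```python
import re
from typing import List, Tuple

# Matches amounts such as "$13.00" or "$9"; assumes a leading dollar sign.
PRICE_PATTERN = re.compile(r"\$\d+(?:\.\d{2})?")


def split_prices(parse_text: str) -> Tuple[str, List[str]]:
    """Separate price tokens from the rest of the extracted text.

    Returns the text with prices removed and the prices in order of appearance.
    """
    prices = PRICE_PATTERN.findall(parse_text)
    remaining = PRICE_PATTERN.sub(" ", parse_text)
    description_text = " ".join(remaining.split())
    return description_text, prices


# Made-up sample in which one price sits between two descriptions.
sample = "PAD THAI $13.00 GREEN CURRY $11.50 $9.00"
descriptions, prices = split_prices(sample)
```

This recovers the prices but does not tell you which item each price belongs to. That mapping still depends on a predictable layout of the printed bill.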

You now have the basic idea of OCR-based data extraction. You can run through all your bills and extract the data in a matter of minutes.


Power Up Your OCR Use Cases

As a next-level challenge, you can write a script that parses the API response, performs some ETL (Extract, Transform and Load) operations and makes the data available in a database. You are likely to face more obstacles, as the API returns data in a linear sequence with no delimiters. Therefore, to build a real-world application, you will have to design the structure of the printed document in a way that makes parsing the API response easier.
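As a starting point for the load step, here is a minimal sketch that stores already-parsed line items in SQLite. It assumes the extract and transform steps have produced (description, amount) pairs with amounts such as "$13.00". The table name, column names and function name are hypothetical.

```python
import sqlite3
from decimal import Decimal
from typing import Iterable, Tuple


def load_line_items(
    conn: sqlite3.Connection, items: Iterable[Tuple[str, str]]
) -> int:
    """Insert (description, amount) pairs and return the number of rows added.

    Amounts are stored as integer cents to avoid floating point rounding.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS line_items ("
        "description TEXT NOT NULL, amount_cents INTEGER NOT NULL)"
    )
    rows = [
        (description.strip(), int(Decimal(amount.lstrip("$")) * 100))
        for description, amount in items
    ]
    conn.executemany("INSERT INTO line_items VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)
```

For example, calling `load_line_items(sqlite3.connect(":memory:"), [("Pad Thai", "$13.00")])` creates the table in an in-memory database and inserts one row with 1300 cents.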

In case you want to explore similar APIs, take a look at our APIs in the Visual Recognition category. You can also check out our Machine Learning API Collection to know more about the other APIs that support broader ML applications, including image analysis.
