How to Extract Handwritten & Printed Text Accurately with DEEPREAD Free Form: Part 2

DoubleYard
6 min read · Jul 7, 2022

Welcome back! 👋

In this article, we will be further exploring the DEEPREAD Free Form (DRFF) integration with RapidAPI and making the most of the sample code and command line interface that we have prepared for you.

For more information on what DRFF is, check out our previous article (Part 1 of How to Extract Handwritten & Printed Text Accurately with DEEPREAD Free Form).

Given its AI OCR capabilities, DRFF can identify both handwritten and printed text in any document whose file type is on its supported list:

  • Single Page PDF
  • JPG
  • JPEG
  • PNG
  • TIF
  • TIFF
  • GIF
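
If you plan to process files in bulk, it can help to filter out unsupported formats before sending anything to the API. The snippet below is a minimal sketch based only on the list above. The helper name `is_supported` is hypothetical and not part of the sample repo, and it only looks at the file extension, so it does not verify that a PDF has a single page.

```python
from pathlib import Path

# Extensions taken from the supported file list above.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".gif"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is on the supported list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("scan.TIFF")` is true, while `is_supported("notes.docx")` is false.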

What types of documents are suited for DEEPREAD Free Form?

Well…we’re pretty limitless if we can say so ourselves!

However, we know it can be daunting to figure out where to start, so we have listed a few use cases below. We hope they help you start your journey with DRFF.

DRFF works best with unstructured documents such as emails, handwritten notes, and any other document you want to extract text from, whether that text is handwritten or printed.

Sample Code

While RapidAPI provides extensive documentation and self-help guides on using its APIs across a wide range of programming languages, we have prepared a simple Python-based DRFF command line tool as a sample project to help you kick off your DRFF journey!

Hopefully this will clarify how DRFF can be used in your own projects and organization, and how best to navigate the response data it returns. As an added bonus, we have also included basic visualization functionality to make it easier to visually inspect and review the text extracted by DRFF.

The sample images below will be used in our in-depth walkthrough:

Sample English Document With Mixed Medium
Japanese handwritten texts on a white background
Sample Japanese Handwritten Note

To begin, visit our GitHub repository (repo) with your GitHub account. If you have not yet created a GitHub account, you can sign up here: https://github.com/signup

Within the repo itself, you will see a few folders and files:

  • fonts
  • .gitignore
  • README.md
  • requirements.txt
  • run_freeform_samples.py

In the README, you will find detailed instructions on installing and running the Python script. We recommend using Python 3 to run these samples.

To start, create a virtual environment and install the requirements with the commands below:

python3 -m venv .env
source .env/bin/activate
pip install -r requirements.txt

The core functionality of the sample is contained in the run_freeform_samples.py file, and we encourage you to read through it if you want more detail on how to use DRFF within your own project. (As always, you are welcome to use the code directly as a command line tool.)

Here are the commands that you can run at this stage:

  • To process a specific image: python run_freeform_samples.py -k <X-RapidAPI-Key> -f samples/form-sample-en.png
  • To process a specific image with a visualization output created: python run_freeform_samples.py -k <X-RapidAPI-Key> -f samples/form-sample-en.png --vis
  • To process all images in samples/ folder: python run_freeform_samples.py -k <X-RapidAPI-Key> --all
  • To process a specific image in Japanese: python run_freeform_samples.py -k <X-RapidAPI-Key> -f samples/sample-handwritten-ja.png -l ja

A description of the command line parameters supported and how they can be used can be found below (and in the repo itself):

Required params:

-k|--key: X-RapidAPI-Key header required to access RapidAPI.

One of these is required:

--all: when selected, all images in samples/ folder will be processed.

-f|--file: to specify a specific file you want to process.

Other args:

--vis: when selected, a visualisation output will be created.

-l|--language: ACCEPT-LANGUAGE value passed to rapidapi/deepread. (default en)

-h|--help: output details of command line inputs.
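
To make the parameter list concrete, here is a minimal sketch of how such an interface could be declared with Python's standard `argparse` module. It is reconstructed from the descriptions above and is not copied from run_freeform_samples.py, so the real script may be organized differently. In particular, treating --all and -f as mutually exclusive is an assumption based on "one of these is required".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (a sketch, not the repo's code)."""
    parser = argparse.ArgumentParser(
        description="Run DEEPREAD Free Form on sample images via RapidAPI."
    )
    parser.add_argument("-k", "--key", required=True, help="X-RapidAPI-Key header value")

    # Assumption: exactly one of --all / --file must be given.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--all", action="store_true", help="process every image in samples/")
    target.add_argument("-f", "--file", help="process one specific file")

    parser.add_argument("--vis", action="store_true", help="also create a visualization output")
    parser.add_argument("-l", "--language", default="en", help="Accept-Language value (default: en)")
    # -h/--help is added automatically by argparse.
    return parser


args = build_parser().parse_args(["-k", "KEY", "-f", "samples/form-sample-en.png", "--vis"])
print(args.file, args.vis, args.language)
```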

What are the sample outputs?

The script's outputs are written to the outputs/ folder on your machine.

For every file processed, up to two outputs are generated:

  1. outputs/<filename>.json: JSON output returned by DEEPREAD Free Form via RapidAPI.
  2. outputs/<filename>.<image extension>: an optional visualization of the processed image (created only when --vis is passed), with a side-by-side comparison of the input and the extracted text.
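
If you want to locate these outputs from your own code, the naming pattern above can be sketched as follows. This is an illustration only: the helper `output_paths` is hypothetical, and it assumes that <filename> means the input file name without its extension, which you should confirm against the files the script actually writes.

```python
from pathlib import Path
from typing import Optional, Tuple


def output_paths(
    input_file: str, vis: bool = False, out_dir: str = "outputs"
) -> Tuple[Path, Optional[Path]]:
    """Return (json_path, vis_path); vis_path is None when no visualization is requested."""
    src = Path(input_file)
    json_path = Path(out_dir) / f"{src.stem}.json"
    # Assumption: the visualization keeps the input image's name and extension.
    vis_path = Path(out_dir) / src.name if vis else None
    return json_path, vis_path


json_path, vis_path = output_paths("samples/form-sample-en.png", vis=True)
print(json_path.as_posix(), vis_path.as_posix())
```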

Walkthrough & Sample Output

As a special little bonus, we’ve included some sample images in the samples/ directory. These include both Japanese and English samples, which you can also see below.

Let’s walk through processing the samples/sample_form_2-en.jpg file (in English) which looks like the one below:

Sample English Document With Mixed Medium
Sample Japanese Handwritten Note

Step 1: Code check and environment set up

If you haven’t already cloned the repo and installed the dependencies, here are the commands again:

git clone git@github.com:Edulab-NLP/deepread-rapidapi-samples.git
cd deepread-rapidapi-samples
python3 -m venv .env
source .env/bin/activate
pip install -r requirements.txt

Before you can process the sample files with RapidAPI, you will need your X-RapidAPI-Key, which can be found on the documentation page at RapidAPI (refer to the image below).

X-RapidAPI-Key (the key has been redacted for security purposes)
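
If you call DRFF from your own code rather than through the command line tool, the key is sent as a request header. The sketch below shows one way to assemble the headers while keeping the key out of your source files. The environment variable name `RAPIDAPI_KEY` and the helper `build_headers` are our own assumptions, and the host value is a placeholder that you would need to copy from the DRFF page on RapidAPI.

```python
import os
from typing import Dict, Optional

# Placeholder only: copy the real host value from the DRFF page on RapidAPI.
PLACEHOLDER_HOST = "<your-drff-host>.p.rapidapi.com"


def build_headers(
    api_key: Optional[str] = None,
    language: str = "en",
    host: str = PLACEHOLDER_HOST,
) -> Dict[str, str]:
    """Assemble request headers, falling back to the RAPIDAPI_KEY environment variable."""
    key = api_key or os.environ.get("RAPIDAPI_KEY")
    if not key:
        raise ValueError("No API key given and RAPIDAPI_KEY is not set")
    return {
        "X-RapidAPI-Key": key,
        "X-RapidAPI-Host": host,
        "Accept-Language": language,  # same role as the -l/--language option
    }
```

Reading the key from the environment also makes it harder to commit it to a public repo by accident.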

Step 2: Running the command

Once you have your key, run the command below, substituting your actual key for <X-RapidAPI-Key>:

python run_freeform_samples.py -f samples/form-sample-en.jpg --vis -k <X-RapidAPI-Key>

Step 3: Receiving your results

From here, you will see that two outputs are produced and returned:

1. outputs/samplefilename.json: JSON output returned by DRFF via RapidAPI, which you can see in the screenshot below.

Screenshot of a portion of the JSON output from DEEPREAD Free Form (outputs/samplefilename.json)

In the JSON output above, the results are broken down into a hierarchy of distinct blocks at the page, line, and word levels.

Each of these text blocks contains a number of distinct entities, such as:

  • Text: the text extracted by DEEPREAD Free Form
  • Confidence: the measure of confidence in each individual character extracted. Note: Confidence scores are only provided at the character level.
  • Coordinates: a bounding box describing the location of the extracted text in the original document/image file. Note: The bounding box is defined as [x, y, width, height], where the x and y coordinates refer to the top left corner of the bounding box.
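
Many drawing and cropping libraries expect corner coordinates rather than [x, y, width, height], so a small conversion is often the first thing you will write. The sketch below shows that conversion, plus a simple way to flag characters with low confidence for manual review. The key names (`text`, `bbox`, `chars`, `confidence`), the sample values, and the 0 to 1 confidence scale are all hypothetical, so check them against the actual JSON in your outputs/ folder.

```python
from typing import Dict, List, Sequence, Tuple


def bbox_to_corners(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert [x, y, width, height] to (left, top, right, bottom)."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


def low_confidence_chars(chars: List[Dict], threshold: float) -> List[str]:
    """Return the characters whose confidence falls below the threshold."""
    return [c["text"] for c in chars if c["confidence"] < threshold]


# Hypothetical word block; the real key names and value ranges may differ.
word = {
    "text": "Tokyo",
    "bbox": [120, 40, 90, 24],
    "chars": [
        {"text": "T", "confidence": 0.99},
        {"text": "o", "confidence": 0.97},
        {"text": "k", "confidence": 0.62},
        {"text": "y", "confidence": 0.95},
        {"text": "o", "confidence": 0.98},
    ],
}

print(bbox_to_corners(word["bbox"]))             # (120, 40, 210, 64)
print(low_confidence_chars(word["chars"], 0.8))  # ['k']
```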

2. outputs/samplefilename.jpg: the visualization of the extracted text and its approximate location on the input document/image.

The images below show the JPG output generated by DRFF when the --vis parameter is used. In both cases, an image representing the text extracted by DRFF is displayed alongside the original image:

Sample English DEEPREAD Free Form output (outputs/samplefilename.jpg)
Sample Japanese DEEPREAD Free Form output (outputs/samplefilename.jpg)

Now wasn’t that easy peasy? For additional support, contact us at support@deepread.ai and a member of our team will get back to you shortly!
