OCR Text Recognition

OCR (Optical Character Recognition) detects text regions in an image and converts the text they contain into digital text data. It runs in two stages, text detection followed by text recognition, and is used in a broad range of applications including license plate reading, form processing, and sign recognition.

Algorithm Overview

OCR processing consists of the following two stages:

  1. Text Detection (CTPN, Connectionist Text Proposal Network): Locates text regions in the image and reports the position and extent of each one. Handles complex backgrounds and tilted text strings.
  2. Text Recognition (CRNN, Convolutional Recurrent Neural Network): Converts each detected text region into text. No character-level segmentation is required; it recognizes variable-length text sequences end-to-end.
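
The two stages above can be sketched as a small driver loop. This is a minimal illustration, not the board's actual API: `run_ocr`, `crop`, `OcrResult`, and the `(x0, y0, x1, y1)` box layout are hypothetical names chosen for this example, and the detector and recognizer are passed in as plain callables standing in for the CTPN and CRNN models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical types for illustration only.
Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), end-exclusive pixel coordinates
Image = List[List[int]]           # grayscale image as rows of pixel values


@dataclass
class OcrResult:
    box: Box    # where the text was found
    text: str   # what the recognizer read there


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def run_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[OcrResult]:
    """Stage 1: detect text regions. Stage 2: recognize each cropped region."""
    results = []
    for box in detect(image):
        patch = crop(image, box)
        results.append(OcrResult(box=box, text=recognize(patch)))
    return results


# Stand-ins for the real models, so the sketch runs without any hardware.
def fake_detect(image: Image) -> List[Box]:
    return [(0, 0, 4, 2), (1, 2, 3, 3)]


def fake_recognize(patch: Image) -> str:
    return f"{len(patch[0])}x{len(patch)}"  # reports patch width x height


if __name__ == "__main__":
    blank = [[0] * 5 for _ in range(4)]
    for result in run_ocr(blank, fake_detect, fake_recognize):
        print(result.box, result.text)
```

Passing the models in as callables keeps the pipeline logic testable on a development machine; on the board, the two callables would wrap whatever inference runtime is actually used.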

Edge AI Board (RV1126B) Execution Efficiency

Algorithm                  Model Size    Processing Time
Text Detection (CTPN)      3.31MB        52ms
Text Recognition (CRNN)    6.19MB        3ms
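
To see what these figures could mean for a whole frame, the sketch below adds them up. It rests on assumptions the page does not state: detection runs once per frame, recognition runs once per detected region, the stages run sequentially, and camera capture and pre/post-processing cost nothing. The function names are hypothetical.

```python
DETECT_MS = 52.0     # text detection (CTPN), from the figures above
RECOGNIZE_MS = 3.0   # text recognition (CRNN), from the figures above


def frame_latency_ms(num_regions: int) -> float:
    """Estimated model time for one frame containing `num_regions` text regions.

    Assumes one detection pass plus one recognition pass per region,
    executed back to back (an assumption, not a measured result).
    """
    if num_regions < 0:
        raise ValueError("num_regions must be non-negative")
    return DETECT_MS + RECOGNIZE_MS * num_regions


def max_fps(num_regions: int) -> float:
    """Upper bound on frames per second implied by frame_latency_ms."""
    return 1000.0 / frame_latency_ms(num_regions)


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"{n} region(s): {frame_latency_ms(n):.0f} ms, "
              f"at most {max_fps(n):.1f} fps")
```

Under these assumptions a frame with one text region costs 55 ms of model time and a frame with five regions costs 67 ms, so detection dominates until the region count grows large.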

Key Features

  • Two-stage pipeline: Detection followed by recognition in a single processing flow
  • Variable-length text support: End-to-end recognition of text sequences with variable character counts
  • Lightweight models: Compact model sizes of 3.31MB (detection) and 6.19MB (recognition)
  • High-speed recognition: Recognition processing in approximately 3ms
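
Variable-length output without character segmentation is commonly obtained in CRNN-style models through CTC (Connectionist Temporal Classification) decoding. The page does not say how this particular model decodes its output, so treat CTC as an assumption here. The sketch below shows greedy CTC decoding: take the best class at each time step, merge consecutive repeats, and drop the blank symbol. The function name, the blank index of 0, and the alphabet layout are all hypothetical.

```python
from typing import Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding.

    scores:   one row of class scores per time step; column 0 is the blank,
              column i + 1 corresponds to alphabet[i] (assumed layout).
    alphabet: the characters the model can emit.
    """
    # Best class index at every time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in scores]

    chars = []
    previous = BLANK
    for index in best_path:
        # Emit a character only when it is not blank and not a repeat
        # of the immediately preceding time step.
        if index != BLANK and index != previous:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)


def one_hot_rows(path: Sequence[int], num_classes: int):
    """Helper for the demo: turn a class-index path into score rows."""
    return [[1.0 if c == i else 0.0 for c in range(num_classes)] for i in path]


if __name__ == "__main__":
    # Path a, a, blank, a, b, b over the alphabet "ab" decodes to "aab":
    # the blank separates the two genuine "a" characters.
    print(ctc_greedy_decode(one_hot_rows([1, 1, 0, 1, 2, 2], 3), "ab"))
```

Because the number of emitted characters depends only on the decoded path, the same fixed-size network output can yield strings of different lengths, which is what allows license plates, serial numbers, and meter readings to share one model.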

Use Cases

  • Automatic license plate reading
  • Automatic data entry for forms and vouchers
  • Text information extraction from signs and billboards
  • Serial number reading on production lines
  • Business card digitization
  • Meter and gauge value reading

Edge AI Board Implementation

Running on the RV1126B NPU, text detection takes 52ms and text recognition takes 3ms. The entire pipeline, from camera input to text output, can be completed at the edge.