OCR Text Recognition
OCR (Optical Character Recognition) detects text regions in an image and converts their contents into machine-readable text. It runs in two stages, text detection followed by text recognition, and is used in a broad range of applications including license plate reading, form processing, and sign recognition.
Algorithm Overview
OCR processing consists of the following two stages:
- Text Detection (CTPN, Connectionist Text Proposal Network): Locates text regions in the image and reports their position and extent. Handles complex backgrounds and tilted text lines
- Text Recognition (CRNN, Convolutional Recurrent Neural Network): Converts each detected region into a character string. No character-level segmentation is required; variable-length text sequences are recognized end-to-end
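The two stages above can be sketched as a small pipeline. This is only an illustration under stated assumptions: `run_ocr`, `crop`, `OcrResult`, and the stand-in `fake_detect` / `fake_recognize` functions are hypothetical names, not part of any RV1126B SDK, and the real stages would wrap the CTPN and CRNN models. The top-to-bottom, left-to-right ordering is also an assumed convention.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# A box is (x, y, width, height) in pixels; an image is a row-major pixel grid.
Box = Tuple[int, int, int, int]
Image = Sequence[Sequence[int]]


@dataclass
class OcrResult:
    box: Box
    text: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the rectangle described by box out of the image."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


def run_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[OcrResult]:
    """Stage 1 finds text boxes; stage 2 reads each cropped box."""
    boxes = detect(image)
    # Sort top-to-bottom, then left-to-right, as a simple reading order.
    boxes = sorted(boxes, key=lambda b: (b[1], b[0]))
    return [OcrResult(box=b, text=recognize(crop(image, b))) for b in boxes]


# Stand-in stages for demonstration only (hypothetical, no model involved).
def fake_detect(image: Image) -> List[Box]:
    return [(2, 1, 2, 1), (0, 0, 2, 1)]


def fake_recognize(region: Image) -> str:
    # Treat each pixel value as a character code.
    return "".join(chr(v) for row in region for v in row)


image = [[ord(c) for c in "AB.."], [ord(c) for c in "..CD"]]
results = run_ocr(image, fake_detect, fake_recognize)
print([r.text for r in results])  # prints ['AB', 'CD']
```

Passing the stages in as callables keeps the pipeline logic independent of how each model is loaded or executed, so the same skeleton works with a CPU reference implementation or an NPU-backed one.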
Edge AI Board (RV1126B) Execution Efficiency
| Algorithm | Model Size | Processing Time |
|---|---|---|
| Text Detection (CTPN) | 3.31MB | 52ms |
| Text Recognition (CRNN) | 6.19MB | 3ms |
Key Features
- Two-stage pipeline: Detection output feeds directly into recognition for an efficient processing flow
- Variable-length text support: End-to-end recognition of text sequences with variable character counts
- Lightweight models: Compact model sizes of 3.31MB (detection) and 6.19MB (recognition)
- High-speed recognition: Recognition inference takes approximately 3ms
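One common way CRNN-style recognizers handle variable-length text without character segmentation is CTC (Connectionist Temporal Classification) decoding. The page does not state which decoder this model uses, so the following is a minimal sketch of greedy CTC decoding under assumptions: class index 0 is the blank symbol, class k maps to `alphabet[k - 1]`, and the `one_hot` helper exists only to build demo input.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Decode per-time-step class scores into a string.

    scores[t][k] is the score of class k at time step t.
    Class 0 is blank; class k >= 1 maps to alphabet[k - 1].
    """
    # Pick the best-scoring class at every time step.
    best = [max(range(len(step)), key=step.__getitem__) for step in scores]
    chars: List[str] = []
    prev = BLANK
    for k in best:
        # Emit a character only when it is not blank and not a repeat
        # of the previous time step; a blank in between allows doubles.
        if k != BLANK and k != prev:
            chars.append(alphabet[k - 1])
        prev = k
    return "".join(chars)


def one_hot(index: int, size: int) -> List[float]:
    """Demo helper: a score vector that is 1.0 at index and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


alphabet = "AB1"
path = [1, 1, 0, 2, 2, 0, 2, 3]  # A A _ B B _ B 1
scores = [one_hot(k, len(alphabet) + 1) for k in path]
print(ctc_greedy_decode(scores, alphabet))  # prints ABB1
```

Eight time steps collapse to four characters here, which is why the output length can vary freely while the network input width stays fixed.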
Use Cases
- Automatic license plate reading
- Automatic data entry for forms and vouchers
- Text information extraction from signs and billboards
- Serial number reading on production lines
- Business card digitization
- Meter and gauge value reading
Edge AI Board Implementation
On the RV1126B NPU, text detection runs in 52ms and text recognition in 3ms. The entire pipeline, from camera input to text output, runs on the edge device.
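A rough per-frame latency budget can be derived from the two figures in the table. The sketch below assumes that detection runs once per frame and that recognition runs once per detected region, sequentially; the page does not say what unit the 3ms figure covers, and camera capture, pre-processing, and post-processing are not included. The function names are illustrative.

```python
DETECT_MS = 52.0     # CTPN processing time from the table
RECOGNIZE_MS = 3.0   # CRNN processing time from the table


def frame_latency_ms(num_regions: int) -> float:
    """Estimated model time for one frame with num_regions text regions."""
    if num_regions < 0:
        raise ValueError("num_regions must be non-negative")
    # One detection pass plus one recognition pass per region (assumed).
    return DETECT_MS + RECOGNIZE_MS * num_regions


def max_fps(num_regions: int) -> float:
    """Upper bound on frames per second implied by the model time alone."""
    return 1000.0 / frame_latency_ms(num_regions)


for n in (1, 4, 10):
    print(f"{n} region(s): {frame_latency_ms(n):.0f} ms, up to {max_fps(n):.1f} fps")
```

Under these assumptions detection dominates the budget, so a frame with a single text region costs about 55ms of model time and a frame with four regions about 64ms.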