
Starting with text detection

My deliverables for the first week of working with Red Hen were to get the initial text detection stage up and running. I am currently using the pre-trained deep learning model EAST: An Efficient and Accurate Scene Text Detector. The EAST pipeline can predict words and lines of text at arbitrary orientations on 720p images, and it can run at 13 FPS. More details on EAST can be found here.

Deliverables for week 1

  1. Detect all regions of text in a news video
  2. Create a pipeline that automatically reads video files stored on the server and extracts the regions of text
  3. Basic OCR using Tesseract 4.0

In my implementation, I first tried running OCR on each region of text as soon as it was detected. This produced jumbled output, because the regions were detected in no particular order. I solved the problem by ordering the detected regions of text by their coordinates on the screen.
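As a rough illustration of coordinate-based ordering, here is a minimal sketch. It assumes each region is an `(x, y, width, height)` box in pixels; the function name and the `row_tolerance` parameter are hypothetical and not taken from my actual script. Boxes whose top edges lie within the tolerance are treated as one line of text and read left to right.

```python
from typing import List, Tuple

# Assumed box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def sort_reading_order(boxes: List[Box], row_tolerance: int = 10) -> List[Box]:
    """Order boxes top-to-bottom, then left-to-right within each row."""
    rows: List[List[Box]] = []
    # Walk the boxes from the top of the frame downwards.
    for box in sorted(boxes, key=lambda b: b[1]):
        # Same row if the top edge is close to the row's first box.
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    ordered: List[Box] = []
    for row in rows:
        # Within a row, read from left to right.
        ordered.extend(sorted(row, key=lambda b: b[0]))
    return ordered


detected = [(200, 52, 80, 20), (10, 50, 80, 20), (10, 100, 80, 20), (120, 98, 50, 20)]
print(sort_reading_order(detected))
```

The tolerance matters because boxes on the same line of text rarely share exactly the same y value, so a plain sort on `(y, x)` can interleave words from one line.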

Steps to follow:

  1. Detect regions of text from each frame
  2. Extract and save all the detected regions of interest in a separate folder. The saved images are named according to their x and y coordinate positions in the frame (e.g. x.y.jpg)
  3. The OCR script sorts the files in ascending order, reads them one at a time, and writes the OCR output to an output file.
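The naming and ordering steps above can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the sort key (y first, then x) is my guess at a sensible "ascending order", and the OCR step is passed in as a callable so the sketch does not depend on any particular OCR library. Coordinates are parsed as integers because a plain string sort would put `100.5.jpg` before `20.5.jpg`.

```python
import os
import re
from typing import Callable, List

# Matches the x.y.jpg naming scheme described above.
NAME_PATTERN = re.compile(r"^(\d+)\.(\d+)\.jpg$")


def crop_filename(x: int, y: int) -> str:
    """Build the file name for a region whose top-left corner is (x, y)."""
    return f"{x}.{y}.jpg"


def ordered_crops(folder: str) -> List[str]:
    """Return crop file names sorted numerically by (y, x); assumed order."""
    keyed = []
    for name in os.listdir(folder):
        match = NAME_PATTERN.match(name)
        if match:  # skip anything that is not an x.y.jpg crop
            x, y = int(match.group(1)), int(match.group(2))
            keyed.append(((y, x), name))
    return [name for _, name in sorted(keyed)]


def ocr_folder(folder: str, ocr: Callable[[str], str], out_path: str) -> None:
    """Run the supplied OCR callable on each crop in order, one line per crop."""
    with open(out_path, "w", encoding="utf-8") as out:
        for name in ordered_crops(folder):
            text = ocr(os.path.join(folder, name)).strip()
            out.write(text + "\n")
```

With Tesseract, the `ocr` argument could be something like `pytesseract.image_to_string`, assuming the `pytesseract` wrapper is installed; any function that takes an image path and returns a string will work.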

Poulami Sarkar



Red Hen Lab bengali and Hindi OCR
