Video AI: Motion OCR Recovery Of Damaged Text

Google's Cloud Video API is extremely capable at recovering even highly degraded onscreen text from television news broadcasts. Historically, OCR of onscreen television text was performed by converting video into a sequence of still images, often one per second, and running OCR on each image independently. The problem with such approaches is that if a still image is captured during a scene transition, the underlying text can be unrecoverable, yielding error-filled results that degrade the overall analytic environment.
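To make the sampling problem concrete, here is a minimal sketch of which frames a one-per-second still-image pipeline would grab, and which of those land inside a transition. The clip length, the transition window and the function names are hypothetical, chosen only for illustration.

```python
def sample_times(duration_s, interval_s=1.0):
    """Timestamps (in seconds) at which a still-frame pipeline would grab images."""
    count = int(duration_s // interval_s) + 1
    return [i * interval_s for i in range(count)]


def frames_in_transition(times, transitions):
    """Return the sampled timestamps that fall inside any (start, end) transition window."""
    return [t for t in times if any(start <= t <= end for start, end in transitions)]


# Hypothetical 10-second clip with an animated chyron transition from 3.6s to 5.2s.
times = sample_times(10.0)
bad = frames_in_transition(times, [(3.6, 5.2)])
print(bad)  # [4.0, 5.0]
```

In this made-up clip, two of the eleven sampled frames would show the chyron mid-animation, and a still-image OCR pass has no way of knowing that those two readings should be distrusted.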

In contrast, video understanding systems are able to leverage the motion of video to see onscreen text not as an independent instant in time, but rather as part of a sequence deriving from the preceding footage and leading to the succeeding footage. In this way, the Cloud Video API was able to properly OCR the chyron at the top of this page as "NO MORE CARS", despite the station using a lengthy animated transition into the word "CARS" that yields gibberish under traditional still image OCR.
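For readers who want to try this, the following is a rough sketch of requesting onscreen text for a whole video rather than for individual frames. It assumes the "Cloud Video API" here refers to Google's Cloud Video Intelligence API and its Python client library `google-cloud-videointelligence` (v2 or later); the helper names `detect_text` and `summarize` are hypothetical, and the stand-in annotation at the bottom uses invented timestamps and confidence purely to show the shape of the output.

```python
from datetime import timedelta
from types import SimpleNamespace


def detect_text(input_uri, timeout_s=600):
    """Request TEXT_DETECTION for a video in Cloud Storage and return its text annotations."""
    # Imported lazily so summarize() below can be tried without the client library installed.
    from google.cloud import videointelligence

    client = videointelligence.VideoIntelligenceServiceClient()
    operation = client.annotate_video(
        request={
            "features": [videointelligence.Feature.TEXT_DETECTION],
            "input_uri": input_uri,
        }
    )
    result = operation.result(timeout=timeout_s)
    return result.annotation_results[0].text_annotations


def summarize(text_annotations):
    """Flatten annotations into (text, start_seconds, end_seconds, confidence) rows."""
    rows = []
    for annotation in text_annotations:
        for seg in annotation.segments:
            rows.append((
                annotation.text,
                seg.segment.start_time_offset.total_seconds(),
                seg.segment.end_time_offset.total_seconds(),
                seg.confidence,
            ))
    return rows


# Hypothetical stand-in for one API annotation; the times and confidence are invented.
fake = [SimpleNamespace(
    text="NO MORE CARS",
    segments=[SimpleNamespace(
        confidence=0.97,
        segment=SimpleNamespace(
            start_time_offset=timedelta(seconds=12.5),
            end_time_offset=timedelta(seconds=18.0),
        ),
    )],
)]
rows = summarize(fake)
```

Against a real video, one would call `summarize(detect_text("gs://..."))` with valid credentials. Each returned row describes a span of time during which the text was onscreen, rather than a single frame.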