The GDELT Project

Using Thumbnail Montages To Optimize AI-Based OCR Speed & Costs: Part 5 – The Impact Of Grid Layout

Continuing our OCR experiments, how much does the layout of the montage grid affect OCR performance? From a downstream processing standpoint, a strictly vertical montage, with images arranged in a single long column, is the ideal, but to what degree does it introduce additional OCR error? Comparing vertical, horizontal and grid montages, there are slight differences between them, but overall the results are highly similar, which is reassuring, though anecdotally the grid montage is a bit less accurate. Remarkably, Cloud Vision API is even able to recognize the titles of two books in a bookcase behind a guest speaker, one of which is arranged vertically.

Let's take the same CNN broadcast we've been using for our experiments thus far, sample it at 1fps and arrange the first 9 frames in three layouts: a single horizontal row of 9 frames, a single vertical column of 9 frames and an even 3×3 grid, to assess the impact (if any) of each. In theory, Cloud Vision should perform identically on all three:

time montage ./1FPSFRAMES/OUT-%06d.jpg[1-9] -tile 1x -geometry +0+10 -background black ./CNNW_20240903_230000_Erin_Burnett_OutFront.fullres1fps-1x9tile.montage.jpg
time montage ./1FPSFRAMES/OUT-%06d.jpg[1-9] -tile 3x3 -geometry +0+10 -background black ./CNNW_20240903_230000_Erin_Burnett_OutFront.fullres1fps-3x3tile.montage.jpg
time montage ./1FPSFRAMES/OUT-%06d.jpg[1-9] -tile 9x -geometry +0+10 -background black ./CNNW_20240903_230000_Erin_Burnett_OutFront.fullres1fps-9x1tile.montage.jpg
identify ./CNNW_20240903_230000_Erin_Burnett_OutFront.fullres1fps-*tile.montage.jpg
./CNNW_20240903_230000_Erin_Burnett_OutFront.fullres1fps-1x9tile.montage.jpg JPEG 1280x6660 1280x6660+0+0 8-bit sRGB 1.95103MiB 0.000u 0:00.000
./CNNW_20240903_230000_Erin_Burnett_OutFront.fullres1fps-3x3tile.montage.jpg JPEG 3840x2220 3840x2220+0+0 8-bit sRGB 1.95157MiB 0.000u 0:00.000
./CNNW_20240903_230000_Erin_Burnett_OutFront.fullres1fps-9x1tile.montage.jpg JPEG 11520x740 11520x740+0+0 8-bit sRGB 2050730B 0.000u 0:00.000
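
The dimensions reported by identify can be sanity-checked with a little arithmetic. The sketch below assumes each source frame is 1280x720 (inferred from the montage sizes, not stated above) and that "-geometry +0+10" pads every tile by 10 pixels above and below. The montage_size helper is hypothetical and is not part of ImageMagick:

```shell
#!/bin/sh
# montage_size FRAME_W FRAME_H COLS ROWS PAD_X PAD_Y
# Hypothetical helper: predicts the pixel size of a montage, assuming every
# tile is padded by PAD_X on the left and right and PAD_Y on the top and bottom.
montage_size() {
  frame_w=$1; frame_h=$2; cols=$3; rows=$4; pad_x=$5; pad_y=$6
  echo "$(( cols * (frame_w + 2 * pad_x) ))x$(( rows * (frame_h + 2 * pad_y) ))"
}

# Assumed 1280x720 frames with the +0+10 geometry used above.
montage_size 1280 720 1 9 0 10   # vertical column  -> 1280x6660
montage_size 1280 720 3 3 0 10   # 3x3 grid         -> 3840x2220
montage_size 1280 720 9 1 0 10   # horizontal row   -> 11520x740
```

Under those assumptions the three predicted sizes match the identify output, so all three montages carry the same nine frames at full resolution and differ only in arrangement.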

You can see the resulting images below:

 

Let's compare their OCR results from Cloud Vision. Obviously the ordering of the text will vary from montage to montage given their different layouts, but does it overall match?

The 1×9 vertical layout:

FRENCH PARTS\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCNN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nFRENCH PAISTIS\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nFRENCH PAINTIN\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCNN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nFRENCHL PAINTIN\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCNN\n3:59 PM PT\n\u003eDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE SITUATION ROOM\na\nFRENCH PAINTIN\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\nAFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE SITUATION ROOM\nTAST T\nPAINT\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\nPRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE T SITUATION ROOM\nPRESCIL PAINT\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\n[ SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. 
WHE SITUATION ROOM\nPRESOR PAINTU\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\nDOW 626.15\nAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. WHEN HE WAS SITUATION ROOM\nPRESOR PAR\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCNN\nDOW 626.15\nOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. WHEN HE WAS ASKED WH SITUATION ROOM

The 3×3 grid layout:

Via Skype\nGeneva, Switzerland\n12:59 AM\nFRENCH PAISTIS\nVia Skype\nGeneva, Switzerland\n12:59 AM\nFRENCH PAINTIN\nVia Skype\nGeneva, Switzerland\n12:59 AM\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCNN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nFRENCH PARTS\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nHENET WAS\nPRENCH PAINTIN\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nca\nPAINT\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN ISAHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\n\u003eDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE SITUATION ROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCNN\n3:59 PM PT\nAFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE SITUATION ROOM\nNEW TONIGHT\nLIVE\nCNN\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nPRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE T SITUATION ROOM\n3:59 PM PT\nFRENCHL PAINTIN\nVia Skype\nGeneva, Switzerland\n12:59 AM\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\n[ SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. 
WHE SITUATION ROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\nDOW 626.15\nAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. WHEN HE WAS SITUATION ROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCNN\nDOW 626.15\nOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. WHEN HE WAS ASKED WH SITUATION ROOM\nPRESCIL PAINT

The 9×1 horizontal layout:

Via Skype\nGeneva, Switzerland\n12:59 AM\nVia Skype\nGeneva, Switzerland\n12:59 AM\nFRENCH PAINTIN\nVia Skype\nGeneva, Switzerland\n12:59 AM\nTIENPEROU\nFRENCH PAINTIN\nVia Skype\nGeneva, Switzerland\n12:59 AM\nHENET PRAY\nFRENCH PAINTING\nVia Skype\nGeneva, Switzerland\nea\n12:59 AM\nH PAINT\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM\nPRESCIL PAINT\nVia Skype\nGeneva, Switzerland\n12:59 AM\nPRESORY PAINT\nVia Skype\nGeneva, Switzerland\n12:59 AM\nVia Skype\nGeneva, Switzerland\n12:59 AM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nNEW TONIGHT\nCAN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nNEW TONIGHT\nCAN\n3:59 PM PT\nAHU AT ODDS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN C SITUATION ROOM\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nNEW TONIGHT\nCAN\n3:59 PM PT\nODS AGAIN AFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE SITUATION ROOM\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\nAFTER US PRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE SITUATION ROOM\nNEW TONIGHT\nLIVE\nCNN\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nPRESIDENT SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE T SITUATION ROOM\n3:59 PM PT\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\n3:59 PM PT\n[ SAYS ISRAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. 
WHE SITUATIONROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\nDOW 626.15\nAELI PM NOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. WHEN HE WAS SITUATION ROOM\nNEW TONIGHT\nWORLD HEALTH ORGANIZATION SAYS GAZA POLIO\nVACCINATION CAMPAIGN IS AHEAD OF TARGETS\nLIVE\nCAN\nDOW 626.15\nOT DOING ENOUGH IN CEASEFIRE-HOSTAGE RELEASE TALKS. WHEN HE WAS ASKED WH SITUATION ROOM\nFRENCH PARTS\nFRENCH PAISTIS

How might we compare these three blocks of text, given that their ordering necessarily differs with the layout of the images? One way is to count the appearances of "French", which appears five times in both the horizontal and vertical layouts (in the vertical layout: "FRENCH PARTS, FRENCH PAISTIS, FRENCH PAINTIN, FRENCHL PAINTIN, FRENCH PAINTIN"), but just four times in the 3×3 grid, where one instance appears to have been misread as "PRENCH PAINTIN". Where does the word "French" even appear in the images? This took a while to spot, but upon closer inspection, look at the bookshelf at the bottom left of the guest speaker, third book from the top of the shelf, titled French Painting. The ability of Cloud Vision to pick out a book title on a bookshelf behind a guest speaker is truly remarkable.
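
One rough way to automate this kind of order-independent comparison is to split each block on its literal "\n" escapes and then count matches or tally lines. The following is a minimal sketch using only awk, sort and uniq. The function names are hypothetical, and the sample string is a handful of lines lifted from the vertical-layout output purely as a demo:

```shell
#!/bin/sh
# Turn the literal "\n" escapes in an OCR text string into real newlines,
# so that each recognized line sits on its own line. Reads stdin.
ocr_lines() {
  awk '{ gsub(/\\n/, "\n"); print }'
}

# Count how many times a pattern occurs anywhere in the OCR text on stdin.
count_matches() {
  ocr_lines | awk -v pat="$1" '{ n += gsub(pat, "&") } END { print n + 0 }'
}

# Tally each distinct OCR line, ignoring the order the lines appeared in.
line_tally() {
  ocr_lines | sort | uniq -c | sort -rn
}

# Demo input: a few lines taken from the vertical-layout output (not the full text).
sample='FRENCH PARTS\nVia Skype\nGeneva, Switzerland\n12:59 AM\nFRENCH PAISTIS\nSYDNEY\nVia Skype\nGeneva, Switzerland\n12:59 AM'

printf '%s\n' "$sample" | count_matches FRENCH        # prints 2
printf '%s\n' "$sample" | count_matches 'Via Skype'   # prints 2
printf '%s\n' "$sample" | line_tally
```

Saving the line_tally output for two layouts and running diff on the two files would list only the lines whose counts differ, which sidesteps the ordering differences between layouts.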

All three layouts have subtle differences. In the fourth frame, the word "odds" in the crawl is scrolling off the screen and thus the letter "o" is clipped. The vertical and grid layouts render it as "\u003eDS AGAIN" (\u003e is the Unicode escape for a greater-than sign, so the text is ">DS AGAIN"), while the horizontal layout renders it as "ODS AGAIN". In contrast, "AHU AT ODDS" appears the same in all three, as does "\nAELI PM" (again, all of these represent frames where text from the bottom crawl is cut off at the left side as it scrolls off the screen).
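
When comparing the raw strings by eye or by script, it can help to decode escapes such as \u003e first. The sketch below is a hypothetical sed-based helper that handles only the lowercase escapes for ">", "<" and "&", which is an assumption about what appears in the output. A full JSON parser would be the more robust choice:

```shell
#!/bin/sh
# Hypothetical helper: decode the \u003e, \u003c and \u0026 escapes
# (lowercase hex only) into ">", "<" and "&". Reads stdin.
decode_html_escapes() {
  sed -e 's/\\u003e/>/g' -e 's/\\u003c/</g' -e 's/\\u0026/\&/g'
}

# The clipped crawl text as it appears in the vertical and grid layouts.
printf '%s\n' '\u003eDS AGAIN AFTER US PRESIDENT' | decode_html_escapes
# prints: >DS AGAIN AFTER US PRESIDENT
```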