A common methodology for detecting commercials in television news is to scan for the tell-tale "fade to black" used to fade into and out of commercial breaks on most channels. It turns out that the venerable open source "ffmpeg" utility has a built-in filter for detecting such intervals called "blackdetect" (note that Debian users might need to compile a modern ffmpeg from source or use a precompiled binary). Applied to television news, it scans the entire broadcast and identifies all runs of consecutive frames in which each frame is darker than the specified black threshold and whose total duration is at least the specified length of time.
Scanning a video for such transitions is as simple as:
ffmpeg -i ./VIDEO.mp4 -vf "blackdetect=d=0.05:pix_th=0.10" -an -f null - 2>&1 | grep blackdetect
In this example we scan for intervals at least 0.05 seconds long in which pixels are at or below a luminance of 0.10, the default threshold that defines "black". On a 64 core system it takes around 31 seconds to process an hour-long CNN broadcast.
For example, here are the results for the June 11, 2013 1-2PM PDT broadcast of The Lead With Jake Tapper that we previously scanned for commercials using the captioning mode field of its closed captioning:
black_start:660.927 black_end:661.194 black_duration:0.266933
black_start:663.663 black_end:665.398 black_duration:1.73507
black_start:691.024 black_end:691.09 black_duration:0.0667333
black_start:721.02 black_end:721.12 black_duration:0.1001
black_start:781.014 black_end:781.08 black_duration:0.0667333
black_start:796.029 black_end:796.095 black_duration:0.0667333
black_start:811.044 black_end:811.144 black_duration:0.1001
black_start:826.059 black_end:826.125 black_duration:0.0667333
black_start:841.04 black_end:842.942 black_duration:1.9019
black_start:1372.34 black_end:1373.34 black_duration:1.001
black_start:1382.05 black_end:1382.18 black_duration:0.133467
black_start:1403.2 black_end:1403.3 black_duration:0.1001
black_start:1418.22 black_end:1418.28 black_duration:0.0667333
black_start:1433.23 black_end:1433.37 black_duration:0.133467
black_start:1493.26 black_end:1493.36 black_duration:0.1001
black_start:1508.07 black_end:1508.54 black_duration:0.467133
black_start:1568.47 black_end:1569.8 black_duration:1.33467
black_start:1574.31 black_end:1575.27 black_duration:0.967633
black_start:2116.58 black_end:2116.75 black_duration:0.166833
black_start:2131.7 black_end:2131.83 black_duration:0.133467
black_start:2176.71 black_end:2176.84 black_duration:0.133467
black_start:2191.72 black_end:2191.82 black_duration:0.1001
black_start:2236.73 black_end:2236.87 black_duration:0.133467
black_start:2296.76 black_end:2298.56 black_duration:1.8018
black_start:2656.05 black_end:2656.42 black_duration:0.367033
black_start:2659.09 black_end:2659.22 black_duration:0.133467
black_start:2662.39 black_end:2662.46 black_duration:0.0667333
black_start:2664.9 black_end:2667.4 black_duration:2.5025
black_start:2669.1 black_end:2669.53 black_duration:0.433767
black_start:2674.34 black_end:2676.34 black_duration:2.002
black_start:2716.15 black_end:2716.25 black_duration:0.1001
black_start:2746.14 black_end:2746.24 black_duration:0.1001
black_start:2776.14 black_end:2776.27 black_duration:0.133467
black_start:2791.05 black_end:2791.52 black_duration:0.467133
black_start:2810.64 black_end:2810.84 black_duration:0.2002
black_start:2821.45 black_end:2821.59 black_duration:0.133467
black_start:2832.1 black_end:2832.23 black_duration:0.133467
black_start:2835.67 black_end:2836 black_duration:0.333667
black_start:2839.84 black_end:2839.9 black_duration:0.0667333
black_start:2845.31 black_end:2845.64 black_duration:0.333667
black_start:2851.52 black_end:2852.65 black_duration:1.13447
black_start:2857.19 black_end:2858.69 black_duration:1.5015
black_start:3061.59 black_end:3061.76 black_duration:0.166833
black_start:3091.66 black_end:3091.82 black_duration:0.166833
black_start:3121.65 black_end:3121.79 black_duration:0.133467
black_start:3151.72 black_end:3151.85 black_duration:0.133467
black_start:3196.66 black_end:3197.06 black_duration:0.4004
black_start:3226.96 black_end:3227.09 black_duration:0.133467
black_start:3256.99 black_end:3258.12 black_duration:1.13447
black_start:3262.69 black_end:3264.33 black_duration:1.63497
black_start:3449.08 black_end:3450.88 black_duration:1.8018
black_start:3480.78 black_end:3481.04 black_duration:0.266933
black_start:3510.77 black_end:3510.84 black_duration:0.0667333
black_start:3525.79 black_end:3525.92 black_duration:0.133467
black_start:3540.8 black_end:3540.87 black_duration:0.0667333
black_start:3600.83 black_end:3601 black_duration:0.166833
black_start:3605.8 black_end:3605.9 black_duration:0.1001
black_start:3615.85 black_end:3616.68 black_duration:0.834167
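To work with these results programmatically, the three fields of each record can be pulled into numeric form. The following is a minimal sketch, assuming the output has been captured as text in the format shown above; the names BlackInterval and parse_blackdetect are hypothetical helpers introduced here for illustration, not part of ffmpeg.

```python
import re
from typing import NamedTuple


class BlackInterval(NamedTuple):
    """One interval reported by blackdetect, in seconds."""
    start: float
    end: float
    duration: float


# Matches the three fields blackdetect prints for each interval.
# finditer() is used so it works whether records are one per line or run together.
_RECORD = re.compile(
    r"black_start:(?P<start>[\d.]+)\s+"
    r"black_end:(?P<end>[\d.]+)\s+"
    r"black_duration:(?P<duration>[\d.]+)"
)


def parse_blackdetect(text: str) -> list[BlackInterval]:
    """Return every black interval found in captured blackdetect output."""
    return [
        BlackInterval(float(m["start"]), float(m["end"]), float(m["duration"]))
        for m in _RECORD.finditer(text)
    ]


# The first two records from the output above.
sample = (
    "black_start:660.927 black_end:661.194 black_duration:0.266933\n"
    "black_start:663.663 black_end:665.398 black_duration:1.73507\n"
)
intervals = parse_blackdetect(sample)
```

Because the pattern ignores anything before "black_start", it should also tolerate any prefix ffmpeg places in front of each record.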
At first glance there are a lot of transitions identified in this list, but look specifically at the "black_duration" field, which gives the length of each transition. Fades to and from commercial breaks are likely to be longer, typically approaching a second or more. Selecting those intervals yields the following candidate transition points:
black_start:663.663 black_end:665.398 black_duration:1.73507
black_start:841.04 black_end:842.942 black_duration:1.9019
black_start:1372.34 black_end:1373.34 black_duration:1.001
black_start:1568.47 black_end:1569.8 black_duration:1.33467
black_start:1574.31 black_end:1575.27 black_duration:0.967633
black_start:2296.76 black_end:2298.56 black_duration:1.8018
black_start:2664.9 black_end:2667.4 black_duration:2.5025
black_start:2851.52 black_end:2852.65 black_duration:1.13447
black_start:2857.19 black_end:2858.69 black_duration:1.5015
black_start:3256.99 black_end:3258.12 black_duration:1.13447
black_start:3262.69 black_end:3264.33 black_duration:1.63497
black_start:3449.08 black_end:3450.88 black_duration:1.8018
black_start:3615.85 black_end:3616.68 black_duration:0.834167
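The duration filter can be sketched in a few lines. This is only an illustration: the 0.8 second cutoff is an assumption chosen because the shortest interval kept in the list above lasts 0.834167 seconds, and the input is a subset of the full output rather than the whole broadcast.

```python
# (black_start, black_end, black_duration) tuples copied from a subset
# of the blackdetect output for this broadcast.
intervals = [
    (660.927, 661.194, 0.266933),
    (663.663, 665.398, 1.73507),
    (691.024, 691.09, 0.0667333),
    (721.02, 721.12, 0.1001),
    (841.04, 842.942, 1.9019),
    (1372.34, 1373.34, 1.001),
    (1508.07, 1508.54, 0.467133),
    (1568.47, 1569.8, 1.33467),
    (1574.31, 1575.27, 0.967633),
]

# Assumed cutoff in seconds; not a value stated in the article.
MIN_DURATION = 0.8


def long_fades(intervals, min_duration=MIN_DURATION):
    """Keep only intervals whose black_duration is at least min_duration."""
    return [iv for iv in intervals if iv[2] >= min_duration]


candidates = long_fades(intervals)
for start, end, duration in candidates:
    print(f"black_start:{start} black_end:{end} black_duration:{duration}")
```

Applied to the full output, the same cutoff would also keep the 2.002 second interval at 2674.34, which falls inside a commercial break according to the captioning and does not appear in the list above. The cut could also be made at detection time by raising blackdetect's "d" option, at the cost of discarding the brief dividers entirely.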
Recall that using the closed captioning mode field, we previously derived the following commercial/news chronology for that broadcast:
Commercials Start: 667
News Resumes: 851
Commercials Start: 1375
News Resumes: 1582
Commercials Start: 2119
News Resumes: 2306
Commercials Start: 2659
News Resumes: 2864
Commercials Start: 3064
News Resumes: 3273
Commercials Start: 3455
News Resumes: 3629
Remember that closed captioning can be a few seconds delayed from the actual video since it is typed by human captioners transcribing the broadcast as they watch it, so the two sets of timecodes will be slightly off. Interweaving the two yields:
Commercials Start: 667
black_start:663.663 black_end:665.398 black_duration:1.73507
News Resumes: 851
black_start:841.04 black_end:842.942 black_duration:1.9019
Commercials Start: 1375
black_start:1372.34 black_end:1373.34 black_duration:1.001
News Resumes: 1582
black_start:1568.47 black_end:1569.8 black_duration:1.33467
black_start:1574.31 black_end:1575.27 black_duration:0.967633
Commercials Start: 2119
News Resumes: 2306
black_start:2296.76 black_end:2298.56 black_duration:1.8018
Commercials Start: 2659
black_start:2664.9 black_end:2667.4 black_duration:2.5025
News Resumes: 2864
black_start:2851.52 black_end:2852.65 black_duration:1.13447
black_start:2857.19 black_end:2858.69 black_duration:1.5015
Commercials Start: 3064
News Resumes: 3273
black_start:3256.99 black_end:3258.12 black_duration:1.13447
black_start:3262.69 black_end:3264.33 black_duration:1.63497
Commercials Start: 3455
black_start:3449.08 black_end:3450.88 black_duration:1.8018
News Resumes: 3629
black_start:3615.85 black_end:3616.68 black_duration:0.834167
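The interweaving above can be approximated by pairing each caption transition with the long black intervals that start near it. The sketch below is one way to do that, not necessarily how the listing was produced: the 20 second window is an assumption picked to cover the offsets seen in this broadcast, and the function name interleave is hypothetical.

```python
# Caption-derived transitions (label, seconds) from the chronology above.
caption_events = [
    ("Commercials Start", 667), ("News Resumes", 851),
    ("Commercials Start", 1375), ("News Resumes", 1582),
    ("Commercials Start", 2119), ("News Resumes", 2306),
    ("Commercials Start", 2659), ("News Resumes", 2864),
    ("Commercials Start", 3064), ("News Resumes", 3273),
    ("Commercials Start", 3455), ("News Resumes", 3629),
]

# Long black intervals (black_start, black_end) from the filtered list.
long_black = [
    (663.663, 665.398), (841.04, 842.942), (1372.34, 1373.34),
    (1568.47, 1569.8), (1574.31, 1575.27), (2296.76, 2298.56),
    (2664.9, 2667.4), (2851.52, 2852.65), (2857.19, 2858.69),
    (3256.99, 3258.12), (3262.69, 3264.33), (3449.08, 3450.88),
    (3615.85, 3616.68),
]

# Assumed tolerance in seconds between caption time and black_start.
WINDOW = 20.0


def interleave(events, intervals, window=WINDOW):
    """Pair each caption event with black intervals starting within `window` seconds."""
    paired = []
    for label, t in events:
        nearby = [iv for iv in intervals if abs(iv[0] - t) <= window]
        paired.append((label, t, nearby))
    return paired


paired = interleave(caption_events, long_black)

# Caption transitions with no long black interval nearby.
unmatched = [t for _, t, nearby in paired if not nearby]
# Caption transitions with two long black intervals nearby.
doubles = [t for _, t, nearby in paired if len(nearby) == 2]
```

Listing the caption transitions with zero or two nearby intervals is a quick way to surface the places where the two datasets disagree.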
It is immediately clear that the two datasets are not perfectly aligned. In some places they align closely (after adjusting for the captioning's delay of up to ~10s relative to the video), but in others they diverge entirely. Here are some of the key differences:
- At timecode 1582 captioning shows a transition back to news, which the video analysis shows as well, but then video shows a second transition around 5 seconds later. It turns out this is a brief stock clip of Wolf Blitzer saying "This is CNN".
- At timecode 2119, captioning shows a transition to commercial break, which the video analysis does not show. The video analysis shows only a very brief transition of 0.167 seconds at this time ("black_start:2116.58 black_end:2116.75 black_duration:0.166833"). A look at the clip itself confirms that this is a commercial break, but that instead of a long pause, the video transitions to the commercial almost instantly. Most importantly, the sound from the news programming continues to play during the black fade, making it even more difficult to distinguish as a commercial break.
- At timecode 2864, there are once again two long transition periods in the video analysis. This time it is a stock clip of Erin Burnett saying "This is CNN."
- At timecode 3064, there is again a near-instant transition from programming to commercial break, which captioning catches but which is too brief in the video analysis to distinguish from an intra-clip fast cut.
- At timecode 3273, there are once again two long transition periods in the video analysis. This time it is a stock clip of Anderson Cooper saying "This is CNN."
Of the 12 transitions above, 3 feature the "This is CNN" double-break and 2 are commercial break starts that are missed entirely, meaning there are issues with 5 of the 12 divisions, or 42%. Of the 6 commercial breaks, this approach misses the start of a full third. Thus, relying exclusively on blackframe detection without secondary signals is problematic. In particular, the two missed commercial breaks show that ALL blackframe divisions, including extremely brief ones with audio from the previous block continuing to play through the divider, must be considered.
The consistency of the "This is CNN" breaks makes them readily identifiable in this particular broadcast, but it is unclear whether this remains consistent over the years or whether a varying set of such special case filters would be needed to cover the entire 2009-2020 period using blackframe detection.
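As an illustration of what one such special case filter might look like, the sketch below collapses long black intervals that follow each other closely into a single divider. This is a hypothetical heuristic rather than a tested method: the 6 second gap is an assumption based on the roughly 4.5 second gaps around the three "This is CNN" clips in this one broadcast, and it may not hold for other shows or years.

```python
# A subset of the long black intervals (black_start, black_end) from this broadcast.
long_black = [
    (1372.34, 1373.34),
    (1568.47, 1569.8),
    (1574.31, 1575.27),
    (2296.76, 2298.56),
    (2851.52, 2852.65),
    (2857.19, 2858.69),
]


def merge_close(intervals, max_gap=6.0):
    """Merge consecutive intervals separated by at most max_gap seconds.

    max_gap is an assumed value; the gap is measured from the end of one
    interval to the start of the next.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous divider to cover this interval too.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


dividers = merge_close(long_black)
```

With this input the two double-breaks each become a single divider spanning both fades, while the isolated intervals pass through unchanged.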
Most commercial breaks are bookended by second-long black screens, but as the broadcast above shows, not all commercial breaks are marked by such lengthy divisions and in at least one case the audio from the news programming extended over part of the transition period, making it even harder to distinguish as a commercial break.
Perhaps most problematic, however, is that there is no systematic difference between the blackframe period that begins a commercial break and the one that ends it. In essence, blackframes are a universal "divider" without any indication of whether the divider marks the beginning or the end of the commercial break. Most commercial breaks appear to have rapid-fire scene changes and further blackframe separation, but overall it is clear that relying on visual cues alone will yield a much lower accuracy rate than the captioning.
In the end, blackframe analysis by itself appears to struggle with reliably identifying commercial breaks, whereas caption mode information offers a flawless subdivision of the broadcast. Combined with the captioning mode data, the blackframe timecodes could be used to time-shift the captioning to reduce the captioning delay and more precisely align it with the video, but it is clear that caption mode is a far more reliable tool for commercial detection.
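The time-shifting idea could be sketched as snapping each caption timecode to the nearest black interval, leaving it untouched when nothing is close enough. The function name snap_to_black and the 20 second window are assumptions made for illustration, and the input deliberately includes the brief 0.167 second divider at 2116.58, since brief dividers also have to be considered.

```python
# (black_start, black_end) pairs taken from the blackdetect output,
# including one very brief divider at 2116.58.
black_intervals = [
    (663.663, 665.398),
    (1568.47, 1569.8),
    (1574.31, 1575.27),
    (2116.58, 2116.75),
]


def snap_to_black(caption_time, intervals, window=20.0):
    """Return the black_start nearest to caption_time, if within `window` seconds.

    When no interval is close enough, the caption time is returned unchanged.
    The window size is an assumed tolerance, not a measured value.
    """
    best = min(intervals, key=lambda iv: abs(iv[0] - caption_time), default=None)
    if best is None or abs(best[0] - caption_time) > window:
        return caption_time
    return best[0]


for t in (667, 1582, 2119, 3064):
    print(t, "->", snap_to_black(t, black_intervals))
```

Here the caption timecode decides which kind of transition occurred, and the blackframe timecode only refines when it happened, which sidesteps the problem that a black divider alone does not say whether a break is starting or ending.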