Behind The Scenes: Identifying Failed Recordings: Why Traditional Video Verification Tools Fail

This past December we explored a range of approaches for detecting failed and corrupted television news recordings, from curation metadata to duration verification to Large Multimodal Models, with none of them robustly detecting failed recordings at scale. What of more traditional video analysis tools designed to flag technical errors in video files?

Unfortunately, the majority of these "corrupted" broadcasts actually contain technically valid audio and video streams: the MPG/MP4 files themselves are not corrupt, so standard error detection tools report no errors. Indeed, all of the metadata fields reported by tools like ffprobe check out (a minimal sketch of this kind of container-level check appears below). Rather than valid content in a corrupted file container, the problem is corrupted content in a valid container, a failure mode that container-level validation cannot detect under most circumstances.

As our LMM detection efforts demonstrated, even state-of-the-art AI models like ChatGPT and Gemini are unable to robustly detect these videos. Even harder to detect are truly corrupt files like this 1TV broadcast, for which ffprobe outputs all of the standard metadata fields with just a few subtle changes. In fact, we find that even for videos with genuine format errors or outright container corruption, the errors often manifest in ways that traditional validation tools miss: every tool we have tested to date fails to detect the majority of the corrupted files.

Over time we will continue to return to this challenge. We already have several approaches utilizing perceptual hashes, color and texture density assessments, embeddings, and other classical pre-AI techniques that are yielding promising preliminary results; a sketch of one such perceptual check appears below.
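To make concrete just how little container-level validation reveals here, the minimal sketch below (an illustration, not our production pipeline) shells out to ffprobe for stream metadata and then to ffmpeg's null decoder to surface any decode errors. On the failed recordings described above, both checks typically come back clean: the metadata is ordinary and the file decodes without a single reported error. The file path and the specific fields printed are illustrative.

```python
import json
import subprocess
import sys

def probe(path: str) -> dict:
    """Run ffprobe and return container and stream metadata as a dict.

    This is the standard container-level check; on the failed
    recordings discussed above it reports perfectly ordinary metadata.
    """
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def decode_errors(path: str) -> str:
    """Fully decode the file to a null sink and capture any decoder
    errors ffmpeg reports on stderr. Valid-but-garbled recordings
    usually decode cleanly, so this too comes back empty for them."""
    result = subprocess.run(
        ["ffmpeg", "-v", "error", "-i", path, "-f", "null", "-"],
        capture_output=True, text=True,
    )
    return result.stderr.strip()

if __name__ == "__main__":
    path = sys.argv[1]
    meta = probe(path)
    print("streams:", [s["codec_name"] for s in meta["streams"]])
    print("duration:", meta["format"].get("duration"), "seconds")
    errs = decode_errors(path)
    print(errs if errs else "no decoder errors reported")
```

The full null-decode pass is the strictest of the standard checks, since it forces every frame through the decoder rather than just inspecting headers, yet even it passes these files.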
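As one illustration of the perceptual direction, the sketch below samples frames across a broadcast, perceptually hashes each one, and flags recordings whose frames barely change, the signature of a feed frozen on a single frame or reduced to a solid color. It assumes the OpenCV, Pillow, and imagehash libraries; the sample count and flagging threshold are entirely illustrative and are not drawn from our actual experiments.

```python
import sys

import cv2
import imagehash
from PIL import Image

def frame_hash_diversity(path: str, samples: int = 30) -> float:
    """Sample frames evenly across a video, perceptually hash each one,
    and return the mean Hamming distance between consecutive samples.
    A feed frozen on one frame or reduced to a solid color scores near
    zero; normal news programming changes substantially frame to frame."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(1, total // samples)
    hashes = []
    for i in range(samples):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i * step)
        ok, frame = cap.read()
        if not ok:
            break
        # imagehash expects a PIL image; OpenCV decodes frames as BGR.
        img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        hashes.append(imagehash.phash(img))
    cap.release()
    if len(hashes) < 2:
        return 0.0
    distances = [hashes[i] - hashes[i - 1] for i in range(1, len(hashes))]
    return sum(distances) / len(distances)

if __name__ == "__main__":
    score = frame_hash_diversity(sys.argv[1])
    print(f"mean consecutive-frame hash distance: {score:.1f}")
    if score < 2.0:  # illustrative threshold; tune on known-good recordings
        print("WARNING: low visual diversity -- possible failed recording")
```

The hash size, sample count, and distance threshold would all need tuning against a labeled set of known-good and known-failed recordings before a check like this could be trusted at scale.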