With our new BigQuery + Bigtable digital twin over our GCS archive, we can trivially compile ongoing inventories of our archive over time and perform whole-of-archive sweeps to identify and retry failed ingests. For example, we can use the following SQL to compile a breakdown by status code:
select status, count(1) cnt from ( select JSON_EXTRACT_SCALAR(DOWN, '$.status') status, DOWN FROM ( SELECT rowkey, ( select array(select value from unnest(cell))[OFFSET(0)] from unnest(cf.column) where name in ('DOWN') ) DOWN FROM `[PROJECTID].bigtableconnections.digtwin` where SAFE_CAST(substr(rowkey, 0, 8) as NUMERIC) > 20000000 ) ) group by status order by cnt desc
Yielding:
status | cnt |
SUCCESS | 7511136 |
FAILPERM_CORRUPT | 38507 |
FAILTMP_SRTDOWNFAIL | 5998 |
FAILPERM_AVCHECKCORRUPT | 3442 |
FAILPERM_NOEXISTS | 1029 |
FAILTMP_SRTNOTREADY | 21 |
FAILTMP_VIDDOWN | 14 |
FAILTMP_NOVALIDVIDEOFORMAT | 13 |
READY | 11 |
FAILTMP_VIDMD5MISMATCH | 11 |
FAILTMP_VIDNOEXIST | 4 |
FAILTMP_GCSRENAME | 2 |
Similarly, we can compile a by-channel breakdown:
select chan, count(1) cnt from ( select JSON_EXTRACT_SCALAR(DOWN, '$.chan') chan, DOWN FROM ( SELECT rowkey, ( select array(select value from unnest(cell))[OFFSET(0)] from unnest(cf.column) where name in ('DOWN') ) DOWN FROM `[PROJECTID].bigtableconnections.digtwin` where SAFE_CAST(substr(rowkey, 0, 8) as NUMERIC) > 20000000 ) ) group by chan order by cnt desc
This returns 327 total channels. However, at least a portion of these channels were brief monitoring experiments consisting of only a handful of shows that encountered technical issues. Thus, to arrive at the total number of channels containing at least one successfully monitored broadcast we can simply add a status filter:
select chan, count(1) cnt from ( select JSON_EXTRACT_SCALAR(DOWN, '$.chan') chan, DOWN FROM ( SELECT rowkey, ( select array(select value from unnest(cell))[OFFSET(0)] from unnest(cf.column) where name in ('DOWN') ) DOWN FROM `[PROJECTID].bigtableconnections.digtwin` where SAFE_CAST(substr(rowkey, 0, 8) as NUMERIC) > 20000000 ) where JSON_EXTRACT_SCALAR(DOWN, '$.status') = 'SUCCESS' ) group by chan order by cnt desc
This yields the ultimate total: 260 channels!