Gemini 1.5 Pro's 1 Million Token Model: Summarizing In A Single Prompt A Full Week Of A Russian Television News Channel

Earlier today we leveraged the million-token context window of Google's new Gemini 1.5 Pro foundation model to summarize an entire…

Gemini 1.5 Pro's 1 Million Token Model: Summarizing One Full Day Of A Russian Television News Channel

The release of Google's new Gemini 1.5 Pro marks the first production foundation model LLM to achieve a context window…

Gemini 1.5 Pro's 1 Million Token Model: Testing Its "Needle In A Haystack" Performance Tracking Biden Mentions On Russian TV

One of the most exciting elements of Google's new Gemini 1.5 Pro foundation model is its million-token context window coupled…

Gemini 1.5 Pro's 1 Million Token Model: Summarizing Half A Day Of A Russian Television News Channel

With the release of Google's new Gemini 1.5 Pro model, for the first time we have a publicly accessible production-grade…

Gemini 1.5 Pro's 1 Million Token Model: Summarizing An Evening News Broadcast

Last July we explored the ability of Anthropic's Claude 2 model and its then-novel 100,000 token limit to summarize an…

Gender & Race In LMMs: How GPT-4 & Gemini 1.5 Pro Describe Doctors & CEOs

This past October we did a deep dive into multimodal image embedding models and the racial and gender biases they…

Generative AI Experiments: Comparing GPT-4 & Gemini 1.5 Pro For Visually Describing Television News Broadcast Frames

How do two of the largest LMMs, GPT-4 and Gemini 1.5 Pro, compare in describing a set of still…

The Brittleness Of LMM Computer Vision Models: Gemini 1.5 Pro's Hallucinated Descriptions Of Imagen 2 Images

Despite their seemingly human-like ability to understand and textually describe images, Large Multimodal Models (LMMs) like GPT-4 and Gemini 1.5…

Generative AI Experiments: Using GPT-4 And Gemini 1.5 Pro To Analyze Another DALL-E Image

Continuing our experiments on LMM textual descriptions of generative AI imagery, let's test how GPT-4 and Gemini 1.5 Pro describe…

The Dangers Of Image GenAI: Both GPT-4 & Gemini 1.5 Pro Thought An Imagen 2 Image Was A Real Photograph

Continuing our experiments from yesterday on LMM textual descriptions of generative AI imagery, let's test how GPT-4 and Gemini 1.5…

Generative AI Experiments: Using GPT-4 And Gemini 1.5 Pro To Analyze A DALL-E Image

Last week we used DALL-E to visualize a "television news archive". Let's explore how two leading LMMs, GPT-4 and Gemini…

WashPost: Which Came First, The Biden Age Concerns Or The Coverage Of Them?

The Washington Post's Philip Bump examines media coverage of age concerns about President Biden. Read The Full Article.

Generative AI Experiments: "Where Is This Image" & The Critical Importance Of Tuning Models Against Over-Confidence

From their text-only roots as LLMs (Large Language Models), most major GenAI vendors now offer LMM (Large Multimodal Model) APIs…

Generative AI Experiments: Using GPT-4 And Gemini 1.5 Pro To Analyze Imagen 2 Images

With the public availability of Gemini 1.5 Pro, let's compare how two major LMMs (GPT-4 and Gemini 1.5 Pro) describe…

How Has Valentine's Day Been Covered On Television News Over The Past Decade?

How has Valentine's Day been covered on television news over the past decade? As the timeline below captures, mentions have…

Visual Explorer: Another 30 Million Minutes Transcribed Through Google's Chirp Transcription Model

Our massive collaborative initiative to transcribe the entire Internet Archive Television News Archive added another 30 million minutes of transcribed broadcasts…

Generative AI Experiments: More Experiments In Image Describing Coming Soon

Given the rapid advances in Large Multimodal Models (LMMs), stay tuned for a forthcoming series revisiting how far these…

Generative AI Experiments: Debugging A Networking Issue With GenAI Copilots Vs Stack Overflow & GitHub

Recently, we were forced to diagnose and address an extremely specialized edge case networking issue with a third party utility…

Generative AI Experiments: GenAI Coding Copilots: Asking GPT-4 & Gemini Ultra To Help Brainstorm A Trivial Video Quality Filter

The wonderful world of digital video is a vast, complex, nuanced and often arcane landscape of containers, formats, codecs, bitrate,…

Generative AI Experiments: GenAI Coding Copilots: More Networking Code Troubles

As we continue to evaluate the capabilities of advanced Generative AI coding copilots, we find that they offer reasonable performance…

AI In Production: A Deep Dive Into The Costs Of Multimodal Embedding Search Over 3 Billion Images

As we continue our behind-the-scenes series looking at AI technologies in real world production use cases, we've been estimating the…

Behind The Scenes: Building Resilient Infrastructure Through Hard Timeouts

One of the quickest lessons developers learn building true global-scale production applications is how quickly systems break down under extreme…

Generative Image AI: Asking DALL-E To Visualize A "Television News Archive"

Continuing our generative image AI series, we decided to ask DALL-E just what a "television news archive" looks like. Here…

Experiments With Speech Transcription: Classical Versus LSM Speech Transcription – An English Accent Example

Here is a fascinating brief example of just how much of an improvement Large Speech Models (LSMs) offer over classical…