WashPost: Republicans Keep Spilling Cold Water On Their Biden Bribery Allegations

The Washington Post's Philip Bump examines television news coverage of the Biden bribery allegations. Read The Full Article.

Generative AI: Using LLMs To Produce Culturally Recent Translations Vs Classical NMT – "Dropping" A Song

Last month we explored the ability of Large Language Models (LLMs) to produce higher-quality translations than traditional Neural Machine Translation…

WashPost: DeSantis's Campaign Launch Fizzled

The Washington Post's Philip Bump explores media coverage of DeSantis' campaign launch. Read The Full Article.

Fox News & Last Night's "Wannabe Dictator" Chyron

Last night, Fox News briefly ran the chyron "WANNABE DICTATOR SPEAKS AT THE WHITE HOUSE AFTER HAVING HIS POLITICAL…

Generative Search: The Curious Case Of Comet Cleaner's Active Ingredient

The use of LLMs to interpret and summarize search results (so-called "generative search") is widely touted as the future of…

Embedding Models: Multilingual Embedding Versus Machine Translation + English Embedding

There are at least 7,000 languages actively spoken today across the world, yet much of the focus of embedding models…

Embedding Models: Mitigating Knowledge Cutoffs Through Replacement Terms

Embedding models represent a snapshot in time of world knowledge. Like knowledge graphs, LLMs and all other forms of machine…

Embedding Models Vs Classical Sentiment Analysis For Tone & Framing Search

Yesterday we examined the significant limitations of using embeddings to search by tone and framing, demonstrating that embeddings are not…

Weaponized: Russian Propaganda Outlets Promote Presidential Candidate Robert F. Kennedy Jr.

This fascinating analysis by Caroline Orr Bueno examines how Russian outlets cover Robert F. Kennedy Jr. Read The…

Embedding Models: The Impact Of Tone And Framing On Embedding Search

Continuing our embedding series, how strongly are tone and framing captured in embedding representations and are they sufficient to bias…

Embedding Models: The Unique Challenges Of News & The Impact Of Buried Facts On Embeddings As External LLM Memory

Embedding models form the basis of most current approaches to overcoming the input length and knowledge aging of large language…

Embedding Models: Using LLMs To Create Synthetic Comparison Data & The Impact Of Textual Length Part 4

Repeating our analysis from earlier today, this time we'll use an LLM to generate passages in three lengths (tweet, long-form…

Embedding Models: Using LLMs To Create Synthetic Comparison Data & The Impact Of Textual Length Part 3

As we continue our embedding series, we've demonstrated that the length of the input text can have an impact on…

Embedding Models: The Impact Of Textual Length On Embedding Similarity Part 2

As embeddings play an increasingly central role in semantic search and as LLM external memory, one challenge is that despite…

Embedding Models: The Impact Of Length On Embedding Similarity

Embeddings are highly sensitive to input length, where highly similar texts can yield very different embeddings depending on their size….

Embedding Models: Clustering COVID-19 Versus "Poxes"

Following on our "mpox" versus "monkeypox" experiment, let's use that same embedding visualization template to cluster a broader set of…

Embedding Models: Revisiting Multilingual Embedding Through Visualization

Using our new embedding visualization template, let's revisit our multilingual embedding experiment and visualize how each embedding model clusters our…

Embedding Models: Capitalization & Knowledge Cutoffs Part 2

Yesterday we introduced a Colab notebook template for visualizing embedding models and explored the impact of capitalization, word spacing and…

A Template For Visually Comparing Embedding Models + Exploring Capitalization, Spacing & Knowledge Cutoffs

Embeddings are designed to look beyond the words on a page to the semantic concepts they represent, allowing a search…

Visual Explorer: New Offset Referencing For Archive URL Alignment

Earlier this month we introduced direct ImageID referencing to the Visual Explorer, allowing you to specify a particular ImageID for…

Multilingual Embedding For LLM External Memory & Semantic Search: Universal Sentence Encoder Family, LaBSE & Vertex AI Embeddings for Text

Embeddings have emerged in recent years as the go-to approach for semantic search. With the rise of Large Language Models…

Generative AI: Translation APIs Versus LLM For Social Media Translation

Large Language Models can perform a wide array of tasks, including textual translation. Given the widespread availability of existing dedicated…

The Predictability Of Global Events: How An LLM Predicted The Outcome Of Türkiye's Presidential Election

With the Turkish presidential election this past weekend, we wanted to test how well some of the major commercial Large…

The Language Bias Of Large Language Models: Why Myanmar Is 1300% More Costly Than English In GPT-3

Large Language Models (LLMs) like OpenAI's ChatGPT and Google's Bard interpret language not as discrete words, but as word-parts known…