Embedding Models: The Impact Of Tone And Framing On Embedding Search
Continuing our embedding series, how strongly are tone and framing captured in embedding representations and are they sufficient to bias…
Embedding Models: The Unique Challenges Of News & The Impact Of Buried Facts On Embeddings As External LLM Memory
Embedding models form the basis of most current approaches to overcoming the input length and knowledge aging of large language…
Embedding Models: Using LLMs To Create Synthetic Comparison Data & The Impact Of Textual Length Part 4
Repeating our analysis from earlier today, this time we'll use an LLM to generate passages in three lengths (tweet, long-form…
Embedding Models: Using LLMs To Create Synthetic Comparison Data & The Impact Of Textual Length Part 3
As we continue our embedding series, we've demonstrated that the length of the input text can have an impact on…
Embedding Models: The Impact Of Textual Length On Embedding Similarity Part 2
As embeddings play an increasingly central role in semantic search and as LLM external memory, one challenge is that despite…
Embedding Models: The Impact Of Length On Embedding Similarity
Embeddings are highly sensitive to input length, where highly similar texts can yield very different embeddings depending on their size…
Embedding Models: Clustering COVID-19 Versus "Poxes"
Following on our "mpox" versus "monkeypox" experiment, let's use that same embedding visualization template to cluster a broader set of…
Embedding Models: Revisiting Multilingual Embedding Through Visualization
Using our new embedding visualization template, let's revisit our multilingual embedding experiment and visualize how each embedding model clusters our…
Embedding Models: Capitalization & Knowledge Cutoffs Part 2
Yesterday we introduced a Colab notebook template for visualizing embedding models and explored the impact of capitalization, word spacing and…
A Template For Visually Comparing Embedding Models + Exploring Capitalization, Spacing & Knowledge Cutoffs
Embeddings are designed to look beyond the words on a page to the semantic concepts they represent, allowing a search…
Visual Explorer: New Offset Referencing For Archive URL Alignment
Earlier this month we introduced direct ImageID referencing to the Visual Explorer, allowing you to specify a specific ImageID for…
Multilingual Embedding For LLM External Memory & Semantic Search: Universal Sentence Encoder Family, LaBSE & Vertex AI Embeddings for Text
Embeddings have emerged in recent years as the go-to approach for semantic search. With the rise of Large Language Models…
Generative AI: Translation APIs Versus LLM For Social Media Translation
Large Language Models can perform a wide array of tasks, including textual translation. Given the widespread availability of existing dedicated…
The Predictability Of Global Events: How An LLM Predicted The Outcome Of Türkiye's Presidential Election
With the Turkish presidential election this past weekend, we wanted to test how well some of the major commercial Large…
The Language Bias Of Large Language Models: Why Myanmar Is 1300% More Costly Than English In GPT-3
Large Language Models (LLMs) like OpenAI's ChatGPT and Google's Bard interpret language not as discrete words, but as word-parts known…
Connecting The TV Explorer & Visual Explorer For Seamless CSPAN Search & Legislative Deep Linking
Earlier this week we unveiled a powerful new interface to CSPAN as part of our Visual Explorer Lenses initiative. When…
Semantic Narrowing: Towards A Calculable Descriptive Statistic Associated With Press Freedom And Authoritarianism
This paper proposes a novel measure for operationalizing authoritarianism: the narrowing of semantic dispersion. This paper defines semantic dispersion as…
Tomorrow: Connecting The TV Explorer & Visual Explorer
Stay tuned for a major new announcement about our work creating new interface metaphors connecting the TV Explorer and Visual…
Language Models Can Improve Event Prediction By Few-Shot Abductive Reasoning
Large language models have shown astonishing performance on a wide range of reasoning tasks. In this paper, we investigate whether…
CJR: How The Media Is Covering ChatGPT
The Tow Center has an article in today's Columbia Journalism Review (CJR) exploring how the media has covered ChatGPT and…
Tracking a Year of Tucker Carlson on Russian TV
Until his departure from Fox News last month, Tucker Carlson was a regular fixture on Russian television news, with clips…
Today: Google IO Connect In Miami
Kalev will be at Google's I/O Connect event in Miami today – reach out if you're in town!
FiveThirtyEight: The Rise, Fall And Potential Resurrection Of Ron DeSantis
FiveThirtyEight explores DeSantis' campaign trajectory, including an analysis of media coverage.
Visual Explorer: New Broadcast Time Offset Referencing
Last week we introduced a new parameter to the Visual Explorer that allows referencing a specific thumbnail frame in a…