More Experiments With DOC 2.0 API + LLMs = Summarizing Headlines: Turkish Investment, Inflation & The Niger Coup

Last month we explored using ChatGPT (GPT-3.5) to summarize search results from the DOC 2.0 API for at-scale landscape summarization…

More Experiments With LLM Translation: GCP's PaLM On Government Translation

With the General Availability of Chinese language support in Google's PaLM LLM, let's repeat our earlier tests of LLM-based translation…

Experiments With Meta's SeamlessM4T Open Machine Translation Model: Social Media Posts

Continuing our series evaluating Meta's new SeamlessM4T multimodal translation model, let's try translating some Weibo social media posts that…

DOC 2.0 API + LLMs = Summarizing Headlines: Turkish Investment, Inflation & The Niger Coup

The DOC 2.0 API allows rich keyword search over English machine translations of GDELT's online news monitoring in 65 languages…

Understanding Hallucination In LLMs: A Brief Introduction

There is a lot of confusion and misunderstanding about what "hallucination" is in large language models (LLMs), how it can…

Experiments In Meme Tracking: Cataloging Stories According To UN SDGs

Continuing our meme tracking series, let's look at how LLMs can be used to catalog stories according to their relevance…

Experiments In Meme Tracking: Western Values In A Globalized World, Context & The Problem Of Naive Guardrails

Continuing in our meme tracking series, let's explore using LLMs to catalog stories according to their expression of common forms…

Experiments In Meme Tracking: Cataloging & Classifying Memes By Conflict Enhancing Bias

Long before conflict becomes kinetic, narratives drive divisions within and between societies. Some narratives, especially those revolving around gender, racial,…

Experiments In Meme Tracking: Summarization Stability & Plagiarization

Continuing our meme tracking series, let's take a closer look at the general task of summarization/distillation for English language online…

Debiasing Semantic & Generative Search Results: New Risks For Companies

For more than six decades digital search has been based on the humble keyword. A search of a document database…

Just Who Is A CEO & How Do We Define Gender & Racial Bias In LLM Embedding Models?

Last week we explored just how deeply ingrained gender and racial bias is in both LLM generative models and LLM-based…

LLM Infinite Loops & Failure Modes: The Current State Of LLM Entity Extraction

Yesterday we demonstrated how, when using LLMs for entity extraction, the addition of a single apostrophe to a source text…

Authoritative Human Vs NMT/LLM Translation & Embedding-Based Quality Rankings: NMT Skew

One of the more intriguing findings from our NMT vs LLM translation experiments has been the degree to which NMT…

LLM Translation Instability & Embedding-Based Ranking Of LLM & NMT Machine Translation: Part 2

Continuing our series of LLM translation experiments and embedding-based quality rankings, let's look at another example of a Chinese…

Using Embedding Models To Rank LLM & NMT Machine Translations Of Chinese News & Social Posts By Quality

Last week we continued our explorations of LLMs as replacements for classical NMT for machine translation. A key challenge of…

What Universal Automated Adversarial LLM Attacks Tell Us About The Nature Of LLMs & Their Development

A fascinating new paper garnered attention last week by showing how fully automated approaches can be used to harvest adversarial…

No, Vector Databases & Embeddings Don't Mitigate LLM Hallucination

As the enterprise world has begun to aggressively adopt LLMs into real-world workflows, the once-obscure and quickly dismissed challenge of…

Large-Token LLMs: Leveraging Anthropic's Claude 2's 100K Token Limit To Summarize An Entire Episode Of Russia's 60 Minutes

Earlier today we demonstrated the extraordinary power of Anthropic's Claude 2's 100,000 token limit to summarize and topically annotate into…

Large-Token LLMs: Leveraging Anthropic's Claude 2's 100K Token Limit To Summarize An Entire ABC Evening News TV News Broadcast In A Single Prompt

One of the greatest limitations of current LLMs, besides hallucination and instability, is their severe input limits: most common production…

Using Embeddings To Rank Clinical, "Creative" & "Inspired Fiction" Summaries Of An Evening News Broadcast

Earlier this month we demonstrated the use of embeddings to combat hallucination in LLM summarization and as a form of…

Complex & Ambiguous Phrasing, Re-Reading & LLMs' Inability To Conceptually Reason

Conceptually, LLM reasoning is a linear process, "confined to token-level, left-to-right decision-making processes during inference." This means that when confronted…

When Light Is Dark, Heavy Is Light & Expensive Is Cheap: The Challenges Of Generative Search & LLM Reasoning

As we continue to assess the state of generative search, here are a few interesting examples from a major commercial…

Using Embedding Ranking To Sort Generative Summaries By "Quality"

Earlier today we demonstrated how scoring LLM-based generative summaries by embedding-based similarity ranking to the original source material can filter…

Do LLMs Truly "Create" Or Merely "Arrange": Just How Much Of An LLM's Writing Is Original?

Continuing our exploration of generative plagiarism, just how much of the text created by a generative LLM is truly novel?…