Continue Reading

Generative AI Experiments: GenAI Coding Copilots: More Networking Code Troubles

As we continue to evaluate the capabilities of advanced Generative AI coding copilots, we find that they offer reasonable performance…

Continue Reading

Larger Generative AI Models Don't Equal Better Results: Disease Outbreak Codification With Bison, Unicorn, Gemini, GPT 3.5 & GPT 4.0

This past February we explored the use of LLMs for extracting and codifying disease outbreaks, finding potential but significant limitations…

Continue Reading

More Experiments In Automated Diplomacy: Countering Bad News About Ukraine Part 2

Yesterday we explored the concept of fully automated counter-messaging powered by LLMs, using ChatGPT to write an automated rebuttal to…

Continue Reading

More Experiments In Automated Diplomacy: Countering Bad News About Ukraine

Earlier this year we explored the concept of fully autonomous diplomacy: using LLMs to autonomously monitor media coverage of a…

Continue Reading

What Do The Major Generative Search Engines Say About Hamas Attacking Israel & Israel's Reaction?

Given that the general public is increasingly turning to LLMs and generative search engines like ChatGPT, Bing and Bard for…

Continue Reading

How LLM Guardrails Defeat RAG-Based Generative Search Applications For Global Events

Retrieval Augmented Generation (RAG) has become perhaps the dominate mechanism through which to expand and update the memories of LLMs….

Continue Reading

Summarizing Global Reaction To Gaza With DOC 2.0 API + LLM: Major News Story Headline Summarization

Earlier this year we explored how the DOC 2.0 API could be used to search for coverage of a specific…

Continue Reading

Gender & Race Bias In Multimodal Image Embedding Models: A Case Study Of OpenAI's CLIP & How White Men Are Doctors & CEOs Not Black Women

Over the past few months we've documented innate existential racial and gender bias in many of the textual embedding models…

Continue Reading

The Perils Of LLMs For Translation Tasks On Lower-Resource Languages: Estonian Noun Declension

One of the great promises of large language models (LLMs) is their ability to revolutionize translation and linguistic tasks. A…

Continue Reading

More Experiments With DOC 2.0 API + LLMs = Summarizing Headlines: Turkish Investment, Inflation & The Niger Coup

Last month we explored using ChatGPT (GPT-3.5) to summarize search results from the DOC 2.0 API for at-scale landscape summarization…

Continue Reading

More Experiments With LLM Translation: GCP's PaLM On Government Translation

With the General Availability of Chinese language support in Google's PaLM LLM, let's repeat our earlier tests of LLM-based translation…

Continue Reading

Experiments With Meta's SeamlessM4T Open Machine Translation Model: Social Media Posts

Continuing our series of evaluating Meta's new SeamlessM4T multimodal translation model. Let's try translating some Weibo social media posts that…

Continue Reading

DOC 2.0 API + LLMs = Summarizing Headlines: Turkish Investment, Inflation & The Niger Coup

The DOC 2.0 API allows rich keyword search over English machine translations of GDELT's online news monitoring in 65 languages….

Continue Reading

Understanding Hallucination In LLMs: A Brief Introduction

There is a lot of confusion and misunderstanding about what "hallucination" is in large language models (LLMs), how it can…

Continue Reading

Experiments In Meme Tracking: Cataloging Stories According To UN SDGs

Continuing our meme tracking series, let's look at how LLMs can be used to catalog stories according to their relevance…

Continue Reading

Experiments In Meme Tracking: Western Values In A Globalized World, Context & The Problem Of Naive Guardrails

Continuing in our meme tracking series, let's explore using LLMs to catalog stories according to their expression of common forms…

Continue Reading

Experiments In Meme Tracking: Cataloging & Classifying Memes By Conflict Enhancing Bias

Long before conflict becomes kinetic, narratives drive divisions within and between societies. Some narratives, especially those revolving around gender, racial,…

Continue Reading

Experiments In Meme Tracking: Summarization Stability & Plagiarization

Continuing our meme tracking series, let's take a closer look at the general task of summarization/distillation for English language online…

Continue Reading

Debiasing Semantic & Generative Search Results: New Risks For Companies

For more than six decades digital search has been based on the humble keyword. A search of a document database…

Continue Reading

Just Who Is A CEO & How Do We Define Gender & Racial Bias In LLM Embedding Models?

Last week we explored just how devastatingly innate gender and racial bias is in both LLM generative models and LLM-based…

Continue Reading

LLM Infinite Loops & Failure Modes: The Current State Of LLM Entity Extraction

Yesterday we demonstrated how when using LLMs for entity extraction, the addition of a single apostrophe to a source text…

Continue Reading

Authoritative Human Vs NMT/LLM Translation & Embedding-Based Quality Rankings: NMT Skew

One of the more intriguing findings from our NMT vs LLM translation experiments has been the degree to which NMT…

Continue Reading

LLM Translation Instability & Embedding-Based Ranking Of LLM & NMT Machine Translation: Part 2

Following in our series of LLM translation experiments and embedding-based quality rankings, let's look at another example of a Chinese…

Continue Reading

Using Embedding Models To Rank LLM & NMT Machine Translations Of Chinese News & Social Posts By Quality

Last week we continued our explorations of LLMs as replacements for classical NMT for machine translation. A key challenge of…