What Universal Automated Adversarial LLM Attacks Tell Us About The Nature Of LLMs & Their Development

A fascinating new paper garnered attention last week by showing how fully automated approaches can be used to harvest adversarial…

Continue Reading

No, Vector Databases & Embeddings Don't Mitigate LLM Hallucination

As the enterprise world has begun to aggressively adopt LLMs into real-world workflows, the once-obscure and quickly dismissed challenge of…

Continue Reading

Large-Token LLMs: Leveraging Anthropic's Claude 2's 100K Token Limit To Summarize An Entire Episode Of Russia's 60 Minutes

Earlier today we demonstrated the extraordinary power of Anthropic's Claude 2's 100,000 token limit to summarize and topically annotate into…

Continue Reading

Large-Token LLMs: Leveraging Anthropic's Claude 2's 100K Token Limit To Summarize An Entire ABC Evening News TV News Broadcast In A Single Prompt

One of the greatest limitations of current LLMs, besides hallucination and instability, is their severe input limits: most common production…

Continue Reading

Using Embeddings To Rank Clinical, "Creative" & "Inspired Fiction" Summaries Of An Evening News Broadcast

Earlier this month we demonstrated the use of embeddings to combat hallucination in LLM summarization and as a form of…

Continue Reading

Complex & Ambiguous Phrasing, Re-Reading & LLMs' Inability To Conceptually Reason

Conceptually, LLM reasoning is a linear process, "confined to token-level, left-to-right decision-making processes during inference." This means that when confronted…

Continue Reading

When Light Is Dark, Heavy Is Light & Expensive Is Cheap: The Challenges Of Generative Search & LLM Reasoning

As we continue to assess the state of generative search, here are a few interesting examples from a major commercial…

Continue Reading

Using Embedding Ranking To Sort Generative Summaries By "Quality"

Earlier today we demonstrated how scoring LLM-based generative summaries by embedding-based similarity ranking to the original source material can filter…

Continue Reading

Do LLMs Truly "Create" Or Merely "Arrange": Just How Much Of An LLM's Writing Is Original?

Continuing our exploration of generative plagiarism, just how much of the text created by a generative LLM is truly novel?…

Continue Reading

Using Embedding Ranking To Combat LLM Hallucination In Generative Summarization: The ABC News Chinese Spy Balloon Story

One of the great challenges in using Large Language Models (LLMs) for summarization is their tendency towards confident hallucination, in…

Continue Reading

Is "Generative Search" Actually "Plagiarism Search" Or Have Humans Already Written Every Possible Sentence?

Generative search is increasingly being portrayed as the future of how we search the web, in which embeddings and other…

Continue Reading

Enriching Democracy: Connecting Our Nation's Legislation To The Legislative Process Via Deep Linking CSPAN

In collaboration with the Internet Archive's TV News Archive, over the past few months we have explored the concept of…

Continue Reading

Entity Extraction: LLMs Versus Classical Neural Model + Live-Updating Knowledge Graph

Large language models are increasingly being positioned as a wholesale replacement for nearly all language analysis tasks, from Q&A and…

Continue Reading

Generative AI: Using LLMs To Produce Audience-Tailored Translations In Place Of Classical NMT's One-Size-Fits-All

Last month we explored the ability of Large Language Models (LLMs) to produce higher-quality translations than traditional Neural Machine Translation…

Continue Reading

Generative AI: Using LLMs To Produce Culturally Recent Translations Vs Classical NMT – "Dropping" A Song

Last month we explored the ability of Large Language Models (LLMs) to produce higher-quality translations than traditional Neural Machine Translation…

Continue Reading

Embedding Models: Multilingual Embedding Versus Machine Translation + English Embedding

There are at least 7,000 languages actively spoken today across the world, yet much of the focus of embedding models…

Continue Reading

Generative AI: Translation APIs Versus LLM For Social Media Translation

Large Language Models can perform a wide array of tasks, including textual translation. Given the widespread availability of existing dedicated…

Continue Reading

The Language Bias Of Large Language Models: Why Myanmar Is 1300% More Costly Than English In GPT-3

Large Language Models (LLMs) like OpenAI's ChatGPT and Google's Bard interpret language not as discrete words, but as word-parts known…

Continue Reading

Hyperlinking Television: Connecting Our Nation's Legislation To The Legislative Process Via Deep Linking CSPAN

In collaboration with the Internet Archive's TV News Archive and the Media-Data Research Consortium, we are tremendously excited today to…

Continue Reading

Visual Explorer: Introducing Visual Explorer Lenses

Today we are excited to unveil Visual Explorer Lenses: a powerful new metaphor for engaging with and understanding video that…

Continue Reading

Brookings: Television As Data: Opening The Internet Archive’s Two Decade Archive Of Global Television News Spanning 50 Countries To Journalists & Scholars Through AI, Analytics, Search & Visualization

Kalev spoke at the Brookings Institution this afternoon, surveying GDELT's various datasets and especially its new collaborations with the Internet…

Continue Reading

Large Language Models (LLMs) + Planetary-Scale Realtime Data: Current Limitations

While impressive, there are still numerous challenges with current state-of-the-art Large Language Models (LLMs) when applied to…