Continuing our meme tracking series, we've seen that LLMs provide subpar entity extraction compared with traditional neural entity extractors. On the other hand, one possible use case where they could offer value is the ability to extract context surrounding those entity mentions: titles and affiliations for people and connections between entities. What might this look like?
The end result is that LLMs are quite good at extracting and normalizing titles and affiliations … when they work. As the results below show, they can connect information across an article to derive the proper affiliation for a person, but they are also trivially confused, alternating between sophisticated parsing and linking for some names and copy-pasting entire sentences for others. Despite being instructed explicitly to remain within the confines of the article text, they frequently reach outside of it to draw upon their training data, meaning they ascribe information to the article that may be out of date. Critically, their most detailed and responsive output, upon closer inspection, is drawn not from the article text itself, but from regurgitated training data memorized by the model, a cautionary reminder that the best-looking results from LLMs are often not all that they appear. Finally, the results differ dramatically across runs, so whether a given run looks reasonable and usable or completely useless is largely a matter of chance. LLM infinite loops are also encountered once again, suggesting they are far more common than believed.
Let's examine this Covid-19 vaccine article, using the prompt "Make a list of all of the people and organizations mentioned in this article. For each person list their title and affiliation mentioned in the article." and a temperature of 0.0. On the one hand, it does a good job of taking mentions like "Kaiser Family Foundation Director of Survey Methodology Ashley Kirzinger" and parsing them, but it also misses organizations like the CDC, failing to extract it as its own entry. It connects the Reuters byline at the top of the article with the credit statement at the bottom ("Reporting by Michael Erman; editing by Caroline Humer and Diane Craft") to correctly extract those three names and associate them with Reuters and their roles of reporter and editors. Unfortunately, it isn't able to distinguish the Reuters company codes from the companies themselves, extracting "BioNTech SE (22UAy.DE)" as a single entity, making it impossible to connect that mention to coverage from other sources that use their own codes. Similarly, for Schaffner and Boulware, it simply copies the complete descriptive phrase, rather than reformatting it as it did for Kirzinger.
– Michael Erman, Reuters reporter
– Caroline Humer, Reuters editor
– Diane Craft, Reuters editor
– Ashley Kirzinger, Director of Survey Methodology, Kaiser Family Foundation
– Pfizer, largest maker of mRNA shots with BioNTech
– Moderna, Pfizer's biggest rival
– Jefferies analyst Michael Yee
– CDC Director Mandy Cohen
– Pfizer (PFE.N)/BioNTech SE (22UAy.DE)
– Moderna (MRNA.O)
– Novavax (NVAX.O)
– Dr. William Schaffner, infectious diseases specialist at Vanderbilt University and a liaison to the CDC's Advisory Committee on Immunization practices
– Dr. David Boulware, infectious diseases specialist at the University of Minnesota
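One way to work around the fused company codes is a small post-processing pass over the model's output. The sketch below is only an illustration: the regular expression is an assumption based solely on the codes seen in this one output (PFE.N, 22UAy.DE, MRNA.O, NVAX.O), and the function name `split_entities` is hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumed pattern: a trailing parenthesized code such as "(PFE.N)" or "(22UAy.DE)".
# It is inferred only from the handful of codes in the output above.
CODE = re.compile(r"\s*\(([A-Za-z0-9]+\.[A-Za-z]+)\)")

def split_entities(entry: str) -> List[Tuple[str, Optional[str]]]:
    """Split a slash-joined entry and separate each name from its code."""
    results = []
    for part in entry.split("/"):
        match = CODE.search(part)
        code = match.group(1) if match else None
        name = CODE.sub("", part).strip()
        results.append((name, code))
    return results

print(split_entities("Pfizer (PFE.N)/BioNTech SE (22UAy.DE)"))
# [('Pfizer', 'PFE.N'), ('BioNTech SE', '22UAy.DE')]
```

Splitting on "/" is a shortcut that would break on any organization name that itself contains a slash, so this is a cleanup heuristic rather than a substitute for the model separating the entities correctly.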
What about a temperature of 0.2? The results are nearly identical, differing only in minor wording:
– Michael Erman, Reuters reporter
– Caroline Humer, Reuters editor
– Diane Craft, Reuters editor
– Ashley Kirzinger, Director of Survey Methodology at the Kaiser Family Foundation
– Pfizer, largest maker of mRNA shots with BioNTech
– Moderna, Pfizer's biggest rival
– Jefferies analyst Michael Yee
– CDC Director Mandy Cohen
– Pfizer (PFE.N)/BioNTech SE (22UAy.DE)
– Moderna (MRNA.O)
– Novavax (NVAX.O)
– Dr. William Schaffner, an infectious diseases specialist at Vanderbilt University and a liaison to the CDC's Advisory Committee on Immunization practices
– Dr. David Boulware, an infectious diseases specialist at the University of Minnesota
Can we use the same model to extract relationships? Let's try the prompt below as a way of getting an LLM to extract SVO relationships with a temperature of 0.0:
List the connections and relationships mentioned in the article in the format "Name -> Relationship -> Name". Example: Pfizer announced a new vaccine today. Result: "Pfizer -> announced -> vaccine".
This yields the following, which is eventually truncated when we hit the output token cap. Even with our 0.0 temperature we can see that these results aren't terribly useful. Entries 2-6 come from the sentence "The CDC recommended children get a single dose of last year's updated vaccine for those aged 6 and older." where it simply blindly connected the word "recommended" to "children" and the other phrases in the sentence. It also treats "vaccine" and "last year's updated vaccine" as separate entities, and it then falls into a loop, repeating the same block of entries until the output cap. Overall, not terribly useful and worse than we would get from a traditional dependency parse.
CDC -> director -> Mandy Cohen CDC -> recommended -> vaccine CDC -> recommended -> children CDC -> recommended -> single dose CDC -> recommended -> last year's updated vaccine CDC -> recommended -> those aged 6 and older CDC -> data -> COVID-19 related hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC -> ended -> May CDC -> handed -> much of the duty of vaccinating America to the private sector CDC -> public health emergency -> ended in May CDC -> data -> COVID-19 related hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC -> ended -> May CDC -> handed -> much of the duty of vaccinating America to the private sector CDC -> public health emergency -> ended in May CDC -> data -> COVID-19 related hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC -> ended -> May CDC -> handed -> much of the duty of vaccinating America to the private sector CDC -> public health emergency -> ended in May CDC -> data -> COVID-19 related hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC -> ended -> May CDC -> handed -> much of the duty of vaccinating America to the private sector CDC -> public health emergency -> ended in May CDC -> data -> COVID-19 related 
hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC -> ended -> May CDC -> handed -> much of the duty of vaccinating America to the private sector CDC -> public health emergency -> ended in May CDC -> data -> COVID-19 related hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC -> ended -> May CDC -> handed -> much of the duty of vaccinating America to the private sector CDC -> public health emergency -> ended in May CDC -> data -> COVID-19 related hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC -> ended -> May CDC -> handed -> much of the duty of vaccinating America to the private sector CDC -> public health emergency -> ended in May CDC -> data -> COVID-19 related hospitalizations CDC -> data -> more than 40% off of recent lows hit in June CDC -> data -> still more than 90% below peak levels hit during the January 2022 Omicron outbreak CDC -> data -> over 1.1 million people in the United States have died from COVID CDC
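Because these loops burn the entire output budget, it can help to detect and trim them automatically. The following is a minimal sketch, assuming the raw model output has one "Name -> Relationship -> Name" entry per line; the helper names `parse_triples`, `find_loop` and `trim_loop` are hypothetical and not part of any LLM API.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

def parse_triples(text: str) -> List[Triple]:
    """Keep only lines that split cleanly into three non-empty parts."""
    triples = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("->")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples

def find_loop(items: List[Triple], min_repeats: int = 3) -> Optional[Tuple[int, int]]:
    """Return (start, period) of the earliest block repeated back-to-back."""
    n = len(items)
    for start in range(n):
        for period in range(1, (n - start) // min_repeats + 1):
            block = items[start:start + period]
            repeats, pos = 1, start + period
            while items[pos:pos + period] == block:
                repeats += 1
                pos += period
            if repeats >= min_repeats:
                return start, period
    return None

def trim_loop(items: List[Triple], min_repeats: int = 3) -> List[Triple]:
    """Drop everything after the first copy of a detected repeating block."""
    loop = find_loop(items, min_repeats)
    if loop is None:
        return items
    start, period = loop
    return items[:start + period]
```

Requiring at least three consecutive copies of a block is a deliberate choice: a run that merely repeats one entry in non-adjacent positions is left alone, while the runaway repetition seen above is cut back to a single copy of the cycle. A truncated final line (such as the dangling "CDC" at the end of the output) simply fails to parse and is dropped.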
What if we keep the same prompt and boost our temperature to 0.2 to give it more flexibility? Here we get far more interesting results, though we still get the overextracted "recommendation" clause: the CDC isn't "recommending" "children" and "elderly" – it is recommending that they get the vaccine.
CDC -> director -> Mandy Cohen
Mandy Cohen -> organization -> CDC
CDC -> recommendation -> vaccine
Pfizer -> product -> vaccine
Pfizer -> revenue -> $56 billion
Pfizer -> competitor -> Moderna
Moderna -> product -> vaccine
Moderna -> revenue -> $20 billion
Pfizer -> partner -> BioNTech
BioNTech -> product -> vaccine
Novavax -> product -> vaccine
Novavax -> revenue -> $20 billion
Novavax -> competitor -> Pfizer
Novavax -> competitor -> Moderna
Novavax -> partner -> Serum Institute of India
CDC -> recommendation -> booster
CDC -> recommendation -> vaccine
CDC -> recommendation -> children
CDC -> recommendation -> adults
CDC -> recommendation -> elderly
CDC -> recommendation -> high risk people
CDC -> recommendation -> younger people
Unfortunately, running a second time we're back to our overeager results from before that aren't terribly useful, reminding us how brittle and unstable LLMs are.
CDC -> director -> Mandy Cohen Mandy Cohen -> organization -> U.S. Centers for Disease Control and Prevention CDC -> recommendation -> COVID-19 vaccine CDC -> recommendation -> booster CDC -> recommendation -> annual CDC -> recommendation -> flu shot CDC -> recommendation -> Omicron CDC -> recommendation -> XBB.1.5 CDC -> recommendation -> EG.5 CDC -> recommendation -> sub-lineage CDC -> recommendation -> still dominant Omicron variant CDC -> recommendation -> COVID-19 related hospitalizations CDC -> recommendation -> more than 40% CDC -> recommendation -> recent lows CDC -> recommendation -> hit in June CDC -> recommendation -> more than 90% CDC -> recommendation -> peak levels CDC -> recommendation -> hit during the January 2022 Omicron outbreak CDC -> recommendation -> COVID-19 CDC -> recommendation -> elderly CDC -> recommendation -> high risk people CDC -> recommendation -> dire outcomes CDC -> recommendation -> catch COVID-19 CDC -> recommendation -> ACIP CDC -> recommendation -> weaker recommendation CDC -> recommendation -> younger CDC -> recommendation -> healthier people CDC -> recommendation -> vaccine CDC -> recommendation -> targeted vaccine CDC -> recommendation -> children CDC -> recommendation -> single dose CDC -> recommendation -> last year's updated vaccine CDC -> recommendation -> those aged 6 and older CDC -> recommendation -> Dr. David Boulware CDC -> recommendation -> research CDC -> recommendation -> published CDC -> recommendation -> people CDC -> recommendation -> boosted CDC -> recommendation -> less severe symptoms CDC -> recommendation -> shorter duration CDC -> recommendation -> duration of illness CDC -> recommendation -> get sick CDC -> recommendation -> being boosted CDC -> recommendation -> best way CDC -> recommendation -> do that
What about limiting ourselves to specific kinds of relationships? We use this prompt with a temperature of 0.0:
List all of the recommendations, partnerships, products and revenue mentioned in the article in the format "Name -> Relationship -> Name". Example: Pfizer announced a new vaccine today. Result: "Pfizer -> announced -> vaccine".
Unfortunately we hit another infinite loop before we hit our output cap. We can also see how it violates our constraints by expanding to "cut jobs":
Pfizer -> makes -> vaccine Pfizer -> revenue -> $20 billion Pfizer -> partnered with -> BioNTech Pfizer -> recommended by -> CDC Pfizer -> sold -> 50 million shots Pfizer -> cut jobs -> might need to Pfizer -> updated -> vaccine Pfizer -> advised -> annual shots should be targeted at the elderly and other high risk people Pfizer -> created -> versions of the COVID vaccine Pfizer -> aimed at -> XBB.1.5 Pfizer -> advised -> children get a single dose of last year's updated vaccine for those aged 6 and older Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> annual shots should be rolled out in the third or fourth week of September Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in 
line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans 
should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots as an annual measure to protect oneself, in line with the annual flu shot Pfizer -> advised -> Americans should view these shots
With a temperature of 0.2:
Pfizer -> makes -> COVID vaccine
Pfizer -> revenue -> $20 billion
Pfizer -> partnered with -> BioNTech
Pfizer -> recommended by -> CDC
Pfizer -> sold -> 50 million shots
Pfizer -> expected to cut jobs
Pfizer -> updated -> COVID vaccine
Pfizer -> advised -> annual shots should be targeted at the elderly and other high risk people
Pfizer -> recommended -> children get a single dose of last year's updated vaccine
Pfizer -> advised -> people who are boosted have less severe symptoms for a shorter duration
Moderna -> makes -> COVID vaccine
Moderna -> revenue -> $20 billion
Moderna -> expected to sell -> 50 million shots
Moderna -> partnered with -> BioNTech
Moderna -> recommended by -> CDC
Moderna -> updated -> COVID vaccine
Moderna -> advised -> annual shots should be targeted at the elderly and other high risk people
Moderna -> recommended -> children get a single dose of last year's updated vaccine
Moderna -> advised -> people who are boosted have less severe symptoms for a shorter duration
CDC -> recommended -> COVID vaccine
CDC -> recommended -> annual shots should be targeted at the elderly and other high risk people
CDC -> recommended -> children get a single dose of last year's updated vaccine
CDC -> advised -> people who are boosted have less severe symptoms for a shorter duration
CVS Health -> offers -> COVID vaccine
And a second time we're missing the Moderna entries:
Pfizer -> makes -> vaccine
Pfizer -> revenue -> $20 billion
Pfizer -> partnered with -> BioNTech
Pfizer -> recommended by -> CDC
Pfizer -> advised by -> Dr. William Schaffner
Pfizer -> competitor -> Moderna
Pfizer -> sales -> $56 billion
Pfizer -> updated -> vaccine
Pfizer -> target -> elderly and other high risk people
Pfizer -> advised by -> Dr. David Boulware
CVS Health -> offers -> vaccine
CDC -> recommended -> vaccine
CDC -> advised by -> Dr. William Schaffner
CDC -> target -> children
CDC -> advised by -> Dr. David Boulware
And a third time we get similar results, but missing some entries:
Pfizer -> revenue -> $20 billion
Pfizer -> product -> COVID vaccine
Pfizer -> partner -> BioNTech
Pfizer -> recommendation -> annual shots should be targeted at the elderly and other high risk people
Pfizer -> product -> COVID vaccine
Pfizer -> revenue -> $56 billion
Pfizer -> product -> COVID vaccine
Pfizer -> partner -> Moderna
CVS Health -> product -> COVID vaccine
CDC -> recommendation -> annual shots should be viewed as an annual measure to protect oneself, in line with the annual flu shot
CDC -> product -> COVID vaccine
CDC -> partner -> Pfizer/BioNTech SE
CDC -> partner -> Moderna
CDC -> partner -> Novavax
And a fourth time Moderna is back, reminding us of the instability of LLMs:
Pfizer -> makes -> vaccine
Pfizer -> revenue -> $20 billion
Pfizer -> partnered with -> BioNTech
Pfizer -> recommended by -> CDC
Pfizer -> advised by -> Dr. William Schaffner
Pfizer -> competitor -> Moderna
Moderna -> revenue -> $20 billion
Moderna -> recommended by -> CDC
Moderna -> advised by -> Dr. William Schaffner
Moderna -> competitor -> Pfizer
CDC -> recommended -> Pfizer
CDC -> recommended -> Moderna
CDC -> advised by -> Dr. William Schaffner
CDC -> advised by -> Dr. David Boulware
CVS Health -> offers -> vaccine
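To put a number on this run-to-run instability, one simple option is the Jaccard overlap between the sets of entries returned by each run. This is a rough sketch under our own assumptions: entries are compared as exact strings, the function names are hypothetical, and the two sample runs are abbreviated to a few entries from the second and fourth runs above.

```python
from itertools import combinations
from typing import Iterable, List

def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Share of distinct entries that two runs have in common."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty runs agree trivially
    return len(set_a & set_b) / len(set_a | set_b)

def mean_pairwise_jaccard(runs: List[List[str]]) -> float:
    """Average overlap across every pair of runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Abbreviated samples of two runs, for illustration only.
run_a = [
    "Pfizer -> makes -> vaccine",
    "Pfizer -> revenue -> $20 billion",
    "Pfizer -> partnered with -> BioNTech",
]
run_b = run_a + ["Moderna -> revenue -> $20 billion"]

print(mean_pairwise_jaccard([run_a, run_b]))  # 0.75
```

Exact string matching is strict: "Pfizer -> makes -> vaccine" and "Pfizer -> product -> COVID vaccine" count as entirely different entries even though they say the same thing, so scores from this measure understate semantic agreement.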
What if we drop our example and use the following prompt with a temperature of 0.2?
List all of the recommendations, partnerships, products and revenue mentioned in the article in the format "Name -> Relationship -> Name".
This yields the following, which doesn't follow our prompt instructions but is a good output:
Pfizer -> Revenue: $56 billion worldwide
Pfizer -> Partner: BioNTech
Pfizer -> Product: COVID-19 vaccine
Moderna -> Revenue: $20 billion
Moderna -> Product: COVID-19 vaccine
CDC -> Recommendation: COVID-19 vaccine
CVS Health -> Product: COVID-19 vaccine
Running a second time we get similar results, though with the CDC added as a creator of the vaccine, along with additional partner and recommendation entries:
Pfizer -> Revenue: $56 billion worldwide
Pfizer -> Partner: BioNTech
Pfizer -> Product: COVID-19 vaccine
Pfizer -> Recommendation: Annual shots should be targeted at the elderly and other high risk people
Moderna -> Revenue: $20 billion
Moderna -> Product: COVID-19 vaccine
Moderna -> Recommendation: Annual shots should be targeted at the elderly and other high risk people
CDC -> Recommendation: Annual shots should be targeted at the elderly and other high risk people
CDC -> Partner: Pfizer/BioNTech
CDC -> Partner: Moderna
CDC -> Partner: Novavax
CDC -> Product: COVID-19 vaccine
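Since the model drifts between the requested "Name -> Relationship -> Name" layout and this "Name -> Relationship: Value" variant, any downstream code has to accept both. Below is a minimal sketch of a tolerant line parser; the pattern and the name `parse_entry` are our own assumptions, and it expects one entry per line.

```python
import re
from typing import Optional, Tuple

# Accepts an optional leading "- " bullet, then either
# "Name -> Relationship -> Value" or "Name -> Relationship: Value".
ENTRY = re.compile(r"^\s*-?\s*(.+?)\s*->\s*(.+?)\s*(?:->|:)\s*(.+?)\s*$")

def parse_entry(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (name, relationship, value), or None if the line has no third part."""
    match = ENTRY.match(line)
    if match is None:
        return None
    name, relationship, value = match.groups()
    # Lowercase the relationship so "Revenue" and "revenue" compare equal.
    return name, relationship.lower(), value

print(parse_entry("Pfizer -> Revenue: $56 billion worldwide"))
print(parse_entry("Pfizer -> revenue -> $20 billion"))
```

Malformed entries such as "Pfizer -> expected to cut jobs" return None rather than being guessed at. A relationship label that itself contains a colon would be split in the wrong place, which is one reason to treat this as a cleanup aid rather than a reliable parser.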
What about this same prompt with a temperature of 0.2 on this unrelated Ukraine article to test what happens when it encounters an article that doesn't contain such relationships?
List all of the recommendations, partnerships, products and revenue mentioned in the article in the format "Name -> Relationship -> Name".
This produces the following:
- Ukraine -> Conflict -> Russia
- Ukraine -> Location -> Crimea
- Ukraine -> Location -> Luhansk
- Ukraine -> Location -> Donetsk
- Ukraine -> Location -> Zaporizhzhia
- Ukraine -> Location -> Kherson
- Ukraine -> Location -> Bakhmut
- Ukraine -> Location -> Staromaiorske
- Ukraine -> Location -> Urozhaine
- Ukraine -> Location -> Kupyansk
- Ukraine -> Location -> Lyman
- Ukraine -> Adversary -> Russia
A second time we get just a single entry:
- Ukraine -> Conflict -> Russia
A third time we get more of the original list with additions:
- Ukraine -> Conflict -> Russia
- Ukraine -> Location -> Kyiv
- Ukraine -> Location -> Donetsk
- Ukraine -> Location -> Zaporizhzhia
- Ukraine -> Location -> Kherson
- Ukraine -> Location -> Crimea
- Ukraine -> Location -> Donbas
- Ukraine -> Location -> Bakhmut
- Ukraine -> Location -> Staromaiorske
- Ukraine -> Location -> Urozhaine
- Ukraine -> Location -> Kupyansk
- Ukraine -> Location -> Lyman
- Ukraine -> Military -> Ukrainian military
- Ukraine -> Military -> Russian military
- Ukraine -> Government -> Deputy Defence Minister Hanna Maliar
Let's try a more general relationships prompt, again with a temperature of 0.2:
List all of the relationships and connections mentioned in the article in the format "Name -> Relationship/Connection -> Name".
We get an infinite loop on our first run:
Ukraine -> COUNTRY Ukraine -> REGION Donetsk Ukraine -> REGION Luhansk Ukraine -> REGION Zaporizhzhia Ukraine -> REGION Kherson Ukraine -> REGION Donbas Ukraine -> CONFLICT Russia-Ukraine War Ukraine -> LEADER Volodymyr Zelenskyy Ukraine -> LEADER Oleksiy Reznikov Ukraine -> LEADER Hanna Maliar Ukraine -> LEADER Dmytro Kuleba Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Denys Shmyhal Ukraine -> LEADER Iryna Vereshchuk Ukraine -> LEADER Mykhailo Podolyak Ukraine -> LEADER Serhiy Haidai Ukraine -> LEADER Oleksandr Motuzyanyk Ukraine -> LEADER Kyrylo Budanov Ukraine -> LEADER Yuriy Sak Ukraine -> LEADER Serhiy Bratchuk Ukraine -> LEADER Oleksandr Zhuravlyov Ukraine -> LEADER Oleksandr Starukh Ukraine -> LEADER Oleksandr Syenkevych Ukraine -> LEADER Oleksandr Vilkul Ukraine -> LEADER Volodymyr Matsar Ukraine -> LEADER Oleksandr Tkachenko Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy 
Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -> LEADER Oleksandr Kubrakov Ukraine -> LEADER Oleksandr Novikov Ukraine -> LEADER Yuriy Borshch Ukraine -
Our second run also yields an infinite loop:
Ukraine -> COUNTRY Ukraine -> REGION Donetsk Ukraine -> REGION Luhansk Ukraine -> REGION Zaporizhzhia Ukraine -> REGION Kherson Ukraine -> REGION Donbas Ukraine -> CONFLICT Russia-Ukraine War Ukraine -> LEADER Volodymyr Zelenskyy Ukraine -> LEADER Oleksiy Reznikov Ukraine -> LEADER Hanna Maliar Ukraine -> LEADER Denis Shmygal Ukraine -> LEADER Dmytro Kuleba Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak 
Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine -> LEADER Yuriy Lutsenko Ukraine -> LEADER Iryna Venediktova Ukraine -> LEADER Serhiy Prytula Ukraine -> LEADER Andriy Yermak Ukraine ->
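Long lists of officials like these, most of which appear to come from the model's training data rather than the article, can be flagged with a simple grounding check: does each extracted name actually occur in the source text? The sketch below is an assumption-laden illustration. The function names are hypothetical and the `article` string is a short stand-in, not the real article text.

```python
from typing import Iterable, List

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks don't hide a match."""
    return " ".join(text.lower().split())

def ungrounded(names: Iterable[str], article_text: str) -> List[str]:
    """Return the extracted names that never appear in the article text."""
    haystack = normalize(article_text)
    return [name for name in names if normalize(name) not in haystack]

# Stand-in text for illustration only; substitute the full article here.
article = "Deputy Defence Minister Hanna Maliar said Ukrainian forces retook territory around Bakhmut."
extracted = ["Hanna Maliar", "Volodymyr Zelenskyy", "Kyrylo Budanov"]

print(ungrounded(extracted, article))
# ['Volodymyr Zelenskyy', 'Kyrylo Budanov']
```

A plain substring test is deliberately conservative. It cannot reconcile transliteration variants (the two runs above spell the same official "Denys Shmyhal" and "Denis Shmygal"), and it would also flag a name the model legitimately normalized, so flagged entries are candidates for review rather than confirmed hallucinations.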
What if we drop our formatting request and just let it decide how to output the relationships?
List all of the relationships and connections mentioned in the article.
This yields:
The article is about the war between Russia and Ukraine. The relationships and connections mentioned in the article are:
- Ukraine and Russia are at war.
- The war is taking place in Ukraine.
- The fighting is taking place along the front line.
- The Ukrainian military is trying to retake Russian-occupied territory.
- The Ukrainian military has made some progress, but progress has been slow.
- The Russian military is using heavy artillery and air strikes.
- The Ukrainian military is trying to retake the city of Bakhmut.
- The Russian military is regrouping around the towns of Kupyansk and Lyman.
- Russia still controls around a fifth of Ukraine.
A second run makes clear that this approach is less useful, since it produces just textual summaries rather than codified results:
The article is about the ongoing war between Russia and Ukraine. The relationships and connections mentioned in the article are:
- Ukraine and Russia are at war.
- Ukraine is trying to retake Russian-occupied territory.
- The Ukrainian military has made incremental gains since kicking off their long-awaited operation in June.
- Russian forces are continuing their assault around the eastern towns of Kupyansk and Lyman.
- Russia still controls around a fifth of Ukraine.
What if we go back to our names and titles prompt and keep our 0.2 temperature?
Make a list of all of the people and organizations mentioned in this article. For each person list their title and affiliation mentioned in the article.
This yields just three entries. It incorrectly includes the two country names as entries while missing the Ukrainian Armed Forces. Running it repeatedly yields the same three results.
- Hanna Maliar, Deputy Defence Minister of Ukraine
- Russia
- Ukraine
What about a different approach?
Make a bulleted list of the events discussed in the following article.
This yields a more generic summary:
– Ukrainian forces have made incremental gains since kicking off their long-awaited operation in June, but officials in Kyiv have acknowledged that progress has been slower than they would have liked and hampered by rigid Russian defences.
– The Armed Forces are facing complete mining of the territory, cement fortifications of the key heights, (and) constant mortar and artillery shelling," she said. "Additionally the Russians are densely using aviation."
– Ukrainian forces managed to retake nearly 2 square miles during the past week around the ruined eastern city of Bakhmut, where Russian and Ukrainian forces fought the bloodiest battle of the nearly 18-month war.
– Russian troops were continuing their assault around the eastern towns of Kupyansk and Lyman, where she said they were regrouping.
– Russia still controls around a fifth of Ukraine, including the Black Sea peninsula of Crimea, Luhansk region in the east and swathes of the regions of Donetsk, Zaporizhzhia and Kherson. The occupied territory includes most of Ukraine's coast line and parts of its industrial heartland, the Donbas.
What about this prompt?
Construct a graph database representing all of the information in the following article in a network diagram:
Again, more of a generic summary:
The article is about the war between Russia and Ukraine. It reports that Ukrainian forces are trying to retake Russian-occupied territory in the southeast. The Ukrainian military has made some progress, but they are facing stiff resistance from Russian forces. The article also reports that Russian troops are continuing their assault around the eastern towns of Kupyansk and Lyman. The graph database can be represented as follows:
- Ukraine
- Military
- Made some progress in pushing back Moscow's troops
- Troops fighting for control of the neighbouring settlement of Urozhaine
- Retook nearly 2 square miles during the past week around the ruined eastern city of Bakhmut
- Troops fighting along its entire front line
- Russia
- Troops continuing their assault around the eastern towns of Kupyansk and Lyman
- Controls around a fifth of Ukraine, including the Black Sea peninsula of Crimea, Luhansk region in the east and swathes of the regions of Donetsk, Zaporizhzhia and Kherson
What about this?
Construct a graph database representation of all of the connections and relationships among the people, locations and organizations discussed in the following text:
Again a generic summary:
The article is about the war in Ukraine. It reports on the progress of the Ukrainian military in pushing back Russian troops. The article mentions the following people:
- Hanna Maliar: Deputy Defence Minister of Ukraine
- Dan Peleschuk: A reporter for Reuters
- Angus MacSwan: A reporter for Reuters
The article also mentions the following locations:
- Ukraine
- Russia
- Crimea
- Luhansk region
- Donetsk
- Zaporizhzhia
- Kherson
The article also mentions the following organizations:
- Ukrainian military
- Russian military
- Reuters
We can construct a graph database representation of all of the connections and relationships among the people, locations and organizations discussed in the following text:
```
(Hanna Maliar, Deputy Defence Minister of Ukraine)
(Dan Peleschuk, A reporter for Reuters)
(Angus MacSwan, A reporter for Reuters)
(Ukraine, Russia)
(Crimea, Luhansk region, Donetsk, Zaporizhzhia, Kherson)
(Ukrainian military, Russian military)
(Reuters)
```
What if we provide a constraining example format?
Construct a graph database representation of all of the connections and relationships among the people, locations and organizations discussed in the following text. Use the format "Object 1 -> Relationship -> Object 2".
This yields less-than-ideal results, including listing Bakhmut, Crimea and other places as titles held by the leader of Ukraine rather than as location names:
Ukraine COUNTRY_OF_ORIGIN Ukraine
Ukraine LEADER_NAME Volodymyr Zelenskyy
Ukraine CONFLICT_PARTICIPANT Russia
Ukraine CONFLICT_PARTICIPANT Ukraine
Ukraine LEADER_TITLE President of Ukraine
Ukraine LEADER_NAME Hanna Maliar
Ukraine LEADER_TITLE Deputy Defence Minister
Ukraine LEADER_TITLE Moscow-installed official
Ukraine LEADER_NAME Kyiv
Ukraine LEADER_TITLE Russian
Ukraine LEADER_NAME Donetsk
Ukraine LEADER_TITLE Russian-held
Ukraine LEADER_TITLE Bakhmut
Ukraine LEADER_TITLE Kupyansk
Ukraine LEADER_TITLE Lyman
Ukraine LEADER_TITLE Staromaiorske
Ukraine LEADER_TITLE Urozhaine
Ukraine LEADER_TITLE Crimea
Ukraine LEADER_TITLE Luhansk
Ukraine LEADER_TITLE Donetsk
Ukraine LEADER_TITLE Zaporizhzhia
Ukraine LEADER_TITLE Kherson
Ukraine LEADER_TITLE Donbas
Ukraine LEADER_TITLE Black Sea
Running again, we get far less output. Unfortunately, Bison has once more gone beyond the confines of the article to include Putin and Zelenskyy, neither of whom is mentioned in it:
Ukraine COUNTRY_OF_RESIDENCE Hanna Maliar
Hanna Maliar OCCUPATION Deputy Defence Minister of Ukraine
Ukraine CONFLICT Russia-Ukraine War
Russia OPPONENT_IN_CONFLICT Ukraine
Russia COUNTRY_OF_RESIDENCE Vladimir Putin
Vladimir Putin OCCUPATION President of Russia
Ukraine LEADER_NAME Volodymyr Zelenskyy
Volodymyr Zelenskyy OCCUPATION President of Ukraine
Ukraine LEADER_TITLE President
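One inexpensive guard against this kind of leakage is to check that each extracted name actually appears somewhere in the article. The following is only a minimal sketch and was not part of the experiment: the `ungrounded_names` helper is hypothetical, the `article` string is a short stand-in rather than the full Reuters text, and plain case-insensitive substring matching will miss paraphrases and alternate spellings.

```python
def ungrounded_names(names, article_text):
    """Return the names that never appear verbatim in the article text."""
    haystack = article_text.casefold()
    return [name for name in names if name.casefold() not in haystack]


# Illustrative stand-in for the article body (assumption: not the full text).
article = (
    "Deputy Defence Minister Hanna Maliar said Russian troops were "
    "continuing their assault around the eastern towns of Kupyansk and Lyman."
)

# People named in the second run's triples above.
extracted = ["Hanna Maliar", "Vladimir Putin", "Volodymyr Zelenskyy"]

# Anything returned here was not found in the text and deserves a second look.
print(ungrounded_names(extracted, article))
```

A check like this cannot confirm that a relationship is correct, only that the names involved were drawn from the text rather than from the model's memory.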
What about an alternative prompt where we give it maximal flexibility in deciding its output and suggest CSV, knowing that many common graph formats are CSV?
Construct a graph database representation in CSV format of all of the connections and relationships among the people, locations and organizations discussed in the following text.
This time we get some interesting results that at first glance appear quite extraordinary. The model has codified not only the presidents of Ukraine and Russia, but also the heads of the occupied territories of Donetsk and Luhansk. It also extracts the reporter and editor names and assigns them to their respective countries. There's just one problem: none of that information is in the article. None of the four leaders are mentioned in the text, and the nationalities of the two journalists are not specified either. Suddenly it is clear why these results are so much better than the others: Bison has leaned into its training data and regurgitated a structured database it was trained on, rather than codifying the article as requested.
| COUNTRY | LEADER | LEADER_TITLE | LEADER_NAME |
|---|---|---|---|
| Ukraine | Volodymyr Zelenskyy | President | Volodymyr Zelenskyy |
| Russia | Vladimir Putin | President | Vladimir Putin |
| Donetsk | Denis Pushilin | Head of the Donetsk People's Republic | Denis Pushilin |
| Luhansk | Leonid Pasechnik | Head of the Luhansk People's Republic | Leonid Pasechnik |
| Bakhmut | - | - | - |
| Staromaiorske | - | - | - |
| Urozhaine | - | - | - |
| Kupyansk | - | - | - |
| Lyman | - | - | - |
| Crimea | - | - | - |
| Zaporizhzhia | - | - | - |
| Kherson | - | - | - |
| Donbas | - | - | - |

| LOCATION | COUNTRY |
|---|---|
| Bakhmut | Ukraine |
| Staromaiorske | Ukraine |
| Urozhaine | Ukraine |
| Kupyansk | Ukraine |
| Lyman | Ukraine |
| Crimea | Russia |
| Zaporizhzhia | Ukraine |
| Kherson | Ukraine |
| Donbas | Ukraine |

| ORGANIZATION | LEADER | LEADER_TITLE |
|---|---|---|
| Ukrainian Armed Forces | Volodymyr Zelenskyy | President |
| Russian Armed Forces | Vladimir Putin | President |
| Donetsk People's Republic | Denis Pushilin | Head of the Donetsk People's Republic |
| Luhansk People's Republic | Leonid Pasechnik | Head of the Luhansk People's Republic |

| PERSON | COUNTRY |
|---|---|
| Volodymyr Zelenskyy | Ukraine |
| Vladimir Putin | Russia |
| Denis Pushilin | Russia |
| Leonid Pasechnik | Russia |
| Hanna Maliar | Ukraine |
| Dan Peleschuk | United States |
| Angus MacSwan | United Kingdom |
A second run makes this even clearer, with an even larger cast of characters not mentioned in the article:
| COUNTRY | LEADER | STATUS |
|---|---|---|
| Ukraine | Volodymyr Zelenskyy | President |
| Russia | Vladimir Putin | President |
| Donetsk | Denis Pushilin | Head of the Donetsk People's Republic |
| Luhansk | Leonid Pasechnik | Head of the Luhansk People's Republic |
| Zaporizhzhia | Oleksandr Starukh | Head of Zaporizhzhia Oblast |
| Kherson | Hennadiy Lahuta | Head of Kherson Oblast |
| Bakhmut | Oleksandr Senkevych | Mayor |
| Staromaiorske | - | - |
| Urozhaine | - | - |
| Crimea | Sergei Aksyonov | Head of the Republic of Crimea |
| Donbas | - | - |

| LOCATION | COUNTRY |
|---|---|
| Kyiv | Ukraine |
| Donetsk | Ukraine |
| Luhansk | Ukraine |
| Zaporizhzhia | Ukraine |
| Kherson | Ukraine |
| Bakhmut | Ukraine |
| Staromaiorske | Ukraine |
| Urozhaine | Ukraine |
| Crimea | Russia |
| Donbas | Ukraine |

| ORGANIZATION | LEADER | STATUS |
|---|---|---|
| Ukrainian Armed Forces | Volodymyr Zelenskyy | President |
| Russian Armed Forces | Vladimir Putin | President |
| Donetsk People's Republic | Denis Pushilin | Head of the Donetsk People's Republic |
| Luhansk People's Republic | Leonid Pasechnik | Head of the Luhansk People's Republic |
| Zaporizhzhia Oblast | Oleksandr Starukh | Head of Zaporizhzhia Oblast |
| Kherson Oblast | Hennadiy Lahuta | Head of Kherson Oblast |
| Bakhmut | Oleksandr Senkevych | Mayor |
| Staromaiorske | - | - |
| Urozhaine | - | - |
| Crimea | Sergei Aksyonov | Head of the Republic of Crimea |
| Donbas | - | - |

| PERSON | POSITION | ORGANIZATION |
|---|---|---|
| Volodymyr Zelenskyy | President | Ukraine |
| Vladimir Putin | President | Russia |
| Denis Pushilin | Head of the Donetsk People's Republic | Donetsk People's Republic |
| Leonid Pasechnik | Head of the Luhansk People's Republic | Luhansk People's Republic |
| Oleksandr Starukh | Head of Zaporizhzhia Oblast | Zaporizhzhia Oblast |
| Hennadiy Lahuta | Head of Kherson Oblast | Kherson Oblast |
| Oleksandr Senkevych | Mayor | Bakhmut |
| Hanna Maliar | Deputy Defence Minister | Ukraine |
| Dan Peleschuk | Reporter | Reuters |
| Angus MacSwan | Editor | Reuters |
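Although the prompt asked for CSV, the model returned several pipe-delimited tables back to back, each with its own header. To work with output like this programmatically, the tables first have to be separated. The sketch below shows one way to do that under the assumption that every header row is immediately followed by a `|---|` separator row; the function names are hypothetical and the `sample` string is an abbreviated excerpt of the first run's output, not the full response.

```python
import re

# A separator cell looks like --- or :---: (three or more dashes).
SEPARATOR_CELL = re.compile(r":?-{3,}:?")


def split_cells(line):
    """Split one pipe-delimited row into stripped cell values."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def is_separator(cells):
    """True for the |---|---| row that follows a table header."""
    return all(SEPARATOR_CELL.fullmatch(cell) for cell in cells)


def parse_pipe_tables(text):
    """Parse back-to-back pipe tables into a list of {header, rows} dicts."""
    rows = [split_cells(ln) for ln in text.splitlines() if ln.strip().startswith("|")]
    tables = []
    for i, cells in enumerate(rows):
        if is_separator(cells):
            continue
        if i + 1 < len(rows) and is_separator(rows[i + 1]):
            # A row followed by a separator is a header: start a new table.
            tables.append({"header": cells, "rows": []})
        elif tables:
            tables[-1]["rows"].append(dict(zip(tables[-1]["header"], cells)))
    return tables


# Abbreviated excerpt of the first run's output (assumption: rows trimmed).
sample = """
| LOCATION | COUNTRY |
|---|---|
| Bakhmut | Ukraine |
| Crimea | Russia |
| PERSON | COUNTRY |
|---|---|
| Hanna Maliar | Ukraine |
| Dan Peleschuk | United States |
"""

for table in parse_pipe_tables(sample):
    print(table["header"], len(table["rows"]))
```

Since the column names change from run to run (LEADER_TITLE in one, STATUS in the next), any downstream code would still need to map these headers onto a fixed schema.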
Let's constrain our prompt to order it to use only information in the article:
Construct a graph database representation in CSV format of all of the connections and relationships among the people, locations and organizations discussed in the following text. Only include connections, relationships, people, locations and organizations mentioned directly in the text.
Again we use a temperature of 0.2. This time all of the external information is gone, though we have a new problem: the model lists the disputed locations as being in both Ukraine and Russia. In this particular case that might actually be a reasonable codification. The article describes these locations as claimed and occupied by Russia, with Ukraine attempting to liberate and reclaim them, so judged strictly on the text of the article the result is defensible, especially since Kyiv is not included in the Russian list, only the occupied territories.
| COUNTRY | REGION | LOCATION |
|-------------|-------------|-------------|
| Ukraine | Europe | Kyiv |
| Ukraine | Europe | Donetsk |
| Ukraine | Europe | Zaporizhzhia |
| Ukraine | Europe | Kherson |
| Ukraine | Europe | Crimea |
| Ukraine | Europe | Luhansk |
| Ukraine | Europe | Bakhmut |
| Ukraine | Europe | Staromaiorske |
| Ukraine | Europe | Urozhaine |
| Ukraine | Europe | Kupyansk |
| Ukraine | Europe | Lyman |
| Russia | Europe | Donetsk |
| Russia | Europe | Crimea |
| Russia | Europe | Luhansk |
| Russia | Europe | Zaporizhzhia |
| Russia | Europe | Kherson |
| Russia | Europe | Bakhmut |
| Russia | Europe | Staromaiorske |
| Russia | Europe | Urozhaine |
| Russia | Europe | Kupyansk |
| Russia | Europe | Lyman |
| Ukraine | MILITARY | Ukrainian military |
| Russia | MILITARY | Russian military |
| Ukraine | GOVERNMENT | Deputy Defence Minister Hanna Maliar |
| Russia | GOVERNMENT | Moscow-installed official |
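Whether a double assignment like this is a defect or a fair reading of the text depends on the application, so it can be useful to surface such cases for review rather than silently accept or reject them. A minimal sketch, assuming the table has already been reduced to (country, location) pairs; the `conflicting_locations` helper is hypothetical and the rows are a small subset of the output above.

```python
from collections import defaultdict


def conflicting_locations(pairs):
    """Map each location listed under more than one country to those countries."""
    countries_by_location = defaultdict(set)
    for country, location in pairs:
        countries_by_location[location].add(country)
    return {
        location: sorted(countries)
        for location, countries in countries_by_location.items()
        if len(countries) > 1
    }


# Subset of the (COUNTRY, LOCATION) rows from the table above.
pairs = [
    ("Ukraine", "Kyiv"),
    ("Ukraine", "Donetsk"),
    ("Ukraine", "Crimea"),
    ("Russia", "Donetsk"),
    ("Russia", "Crimea"),
]

# Kyiv appears under one country only, so it is not reported.
print(conflicting_locations(pairs))
```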
What if we run it a second time? This time we get a much poorer set of results, and once again the model violates its constraints and uses information from outside the article:
| COUNTRY | LOCATION | LEADER |
|---|---|---|
| Ukraine | Kyiv | Volodymyr Zelenskyy |
| Russia | Moscow | Vladimir Putin |
| Donetsk | Donetsk | Denis Pushilin |
| Zaporizhzhia | Zaporizhzhia | Oleksandr Starukh |
| Kherson | Kherson | Hennadiy Lahuta |
| Crimea | Simferopol | Sergei Aksyonov |
| Donbas | Donetsk | |
| Bakhmut | Bakhmut | |
| Staromaiorske | Staromaiorske | |
| Urozhaine | Urozhaine | |
| Kupyansk | Kupyansk | |
| Lyman | Lyman | |
A third time returns us to more reasonable results:
| LOCATION | COUNTRY |
|---|---|
| Bakhmut | Ukraine |
| Crimea | Ukraine |
| Donetsk | Ukraine |
| Kherson | Ukraine |
| Luhansk | Ukraine |
| Staromaiorske | Ukraine |
| Urozhaine | Ukraine |
| Zaporizhzhia | Ukraine |
| Ukraine | Ukraine |
| Russia | Russia |
| Russian | Russia |
| Ukrainian | Ukraine |
| Ukrainian Armed Forces | Ukraine |
| Deputy Defence Minister Hanna Maliar | Ukraine |
| Kyiv | Ukraine |
And a fourth time once again violates our constraint boundary and incorporates external information:
| COUNTRY | REGION | LOCATION |
|---|---|---|
| Ukraine | Eastern Europe | Kyiv |
| Ukraine | Eastern Europe | Donetsk |
| Ukraine | Eastern Europe | Zaporizhzhia |
| Ukraine | Eastern Europe | Kherson |
| Ukraine | Eastern Europe | Crimea |
| Ukraine | Eastern Europe | Bakhmut |
| Ukraine | Eastern Europe | Staromaiorske |
| Ukraine | Eastern Europe | Urozhaine |
| Ukraine | Eastern Europe | Kupyansk |
| Ukraine | Eastern Europe | Lyman |
| Russia | Eastern Europe | Donetsk |
| Russia | Eastern Europe | Luhansk |
| Russia | Eastern Europe | Crimea |
| Russia | Eastern Europe | Kherson |
| Russia | Eastern Europe | Kupyansk |
| Russia | Eastern Europe | Lyman |
| Ukraine | MILITARY | Ukrainian military |
| Russia | MILITARY | Russian military |
| Ukraine | GOVERNMENT | Ukrainian government |
| Russia | GOVERNMENT | Russian government |
| Ukraine | LEADER | Volodymyr Zelenskyy |
| Russia | LEADER | Vladimir Putin |
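With the same prompt and temperature producing such different output each time, it can help to put a number on the instability rather than eyeball it. The sketch below compares the set of people named in each of the four constrained runs using Jaccard similarity (intersection size over union size). The metric and helper names are illustrative choices rather than anything used in the experiment, and the sets were transcribed by hand from the four tables above.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty runs are trivially identical
    return len(a & b) / len(a | b)


def pairwise_stability(runs):
    """Jaccard similarity for every pair of runs, keyed by run indices."""
    return {
        (i, j): jaccard(runs[i], runs[j])
        for i, j in combinations(range(len(runs)), 2)
    }


# People named in each of the four constrained runs above.
runs = [
    {"Hanna Maliar"},
    {"Volodymyr Zelenskyy", "Vladimir Putin", "Denis Pushilin",
     "Oleksandr Starukh", "Hennadiy Lahuta", "Sergei Aksyonov"},
    {"Hanna Maliar"},
    {"Volodymyr Zelenskyy", "Vladimir Putin"},
]

for pair, score in pairwise_stability(runs).items():
    print(pair, round(score, 2))
```

On these four runs, the first and third agree completely on the people named while sharing nothing with the second or fourth, which matches the alternating pattern described above.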