Wikidata, Aging Knowledge Graphs And Expired & Conflicting Attributes

As we noted this past June, knowledge graphs constructed over the real world age rapidly and require constant maintenance to revise expired entries and add new ones for emerging entities. In the case of the Google Knowledge Graph, this can be seen in the transition from Freebase MID codes to Google's GID codes as new entities enter the global conversation. Even community-constructed knowledge graphs that are constantly updated by large user communities can carry expired attributes on their highest-profile entries.

Take the Wikidata entry for Donald Trump. Among the Arabic-language aliases listed is "الرئيس الأمريكي". However, this is not the Arabic transcription of Trump's name. It actually translates as "American president," meaning it is a generic signifier of the US presidency, rather than a specific reference to Trump himself. At the same time, Joe Biden's Wikidata entry does not contain this alias, despite him now being the current US president.

Thus, a system that reasons over Arabic-language news coverage and incorporates Wikidata's alias table will conclude that mentions of the "American president" refer to Donald Trump, regardless of who actually holds the office at the time of publication.

Strangely, this phrase does not appear in the Arabic labels or aliases for the "President of the United States" Wikidata entry and instead appears as the primary Arabic label for the 1995 film "The American President" by Rob Reiner.
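To make the failure mode concrete, here is a minimal sketch in Python of an alias-based lookup that flags conflicting surface forms instead of silently picking one. The records are a hand-built toy table that mirrors only the situation described above, not output from the Wikidata API, and the names SURFACE_FORMS, build_alias_index, resolve and ambiguous_forms are hypothetical helpers chosen for illustration.

```python
from collections import defaultdict

# Arabic phrase meaning "American president", as discussed in the text.
AMERICAN_PRESIDENT = "الرئيس الأمريكي"

# Toy table (an assumption for illustration): entity -> Arabic labels and aliases.
# Only the phrase discussed above is included; real entries hold many more forms.
SURFACE_FORMS = {
    "Donald Trump": [AMERICAN_PRESIDENT],
    "Joe Biden": [],
    "President of the United States": [],
    "The American President (1995 film)": [AMERICAN_PRESIDENT],
}


def build_alias_index(surface_forms):
    """Invert the table: surface form -> set of entities that carry it."""
    index = defaultdict(set)
    for entity, forms in surface_forms.items():
        for form in forms:
            index[form].add(entity)
    return index


def resolve(mention, index):
    """Return the single matching entity, or None if unknown or ambiguous."""
    candidates = index.get(mention, set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return None


def ambiguous_forms(index):
    """List every surface form that is claimed by more than one entity."""
    return {
        form: sorted(entities)
        for form, entities in index.items()
        if len(entities) > 1
    }


index = build_alias_index(SURFACE_FORMS)
conflicts = ambiguous_forms(index)
```

In this toy table the phrase is claimed by both the Donald Trump entry and the film, so resolve declines to answer and ambiguous_forms surfaces the conflict for review. A check like this catches conflicting attributes, but it cannot catch an expired one on its own: if the film did not share the label, the phrase would resolve cleanly to Trump, so time-sensitive aliases still need to be checked against dated statements such as who currently holds the office.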

Thus, even the highest-profile entries in the most actively maintained knowledge graphs can contain expired and conflicting information, which requires careful analysis and consideration when those graphs are used to reason over the world.