Just because an AI can hold a conversation doesn't mean it's smart

These AI models may respond and write in a human-like way, but they are not always 100 percent correct.
[Image: The revamped Microsoft Bing search engine home page, now with ChatGPT built in. Microsoft]


Conversational AI-powered tools are going mainstream, which, to many disinformation researchers, is a major cause for concern. This week, Google announced Bard, its answer to OpenAI’s ChatGPT, and doubled down on rolling out AI-enhanced features to many of its core products at an event in Paris. Similarly, Microsoft announced that ChatGPT would soon be integrated with Bing, its much-maligned search engine. Over the coming months these conversational tools will be widely available, but some problems are already starting to appear.

Conversational AIs are built on a type of neural network called a “large language model” (LLM) and are incredibly good at generating text that is grammatically coherent and seems plausible and human-like. They can do this because they are trained on hundreds of gigabytes of human-written text, most of it scraped from the internet. To generate new text, the model predicts the next “token” (basically, a word or fragment of a complex word) given the sequence of tokens that came before it (many researchers have compared this to the “fill in the blank” exercises we used to do in school).

For example, I asked ChatGPT to write about PopSci and it started by stating “Popular Science is a science and technology magazine that was first published in 1872.” Here, it’s fairly clear that it is cribbing its information from places like our About page and our Wikipedia page, and calculating the likely follow-on words for a sentence that starts “Popular Science is…” The paragraph continues in much the same vein, with each sentence being the kind of thing that naturally follows in the sorts of content ChatGPT was trained on.
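To make that mechanic concrete, here is a minimal, hypothetical Python sketch of the “predict the next token, append it, repeat” loop. The probability table and the predict_next_token and generate functions are invented for illustration; a real LLM learns these probabilities across billions of parameters rather than reading them from a hand-written dictionary, but the generation loop works on the same principle.

```python
import random

# Hypothetical, hand-written probabilities for which token follows a given
# context. A real large language model learns these from training data
# instead of using a lookup table like this.
NEXT_TOKEN_PROBS = {
    "Popular Science is": {"a": 0.7, "the": 0.2, "an": 0.1},
    "Popular Science is a": {"science": 0.6, "technology": 0.3, "magazine": 0.1},
}

def predict_next_token(context: str) -> str:
    """Sample the next token from the (toy) probability distribution."""
    probs = NEXT_TOKEN_PROBS.get(context, {"[unknown]": 1.0})
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

def generate(prompt: str, steps: int = 2) -> str:
    """Repeatedly predict a token and append it, growing the text piece by piece."""
    text = prompt
    for _ in range(steps):
        text = f"{text} {predict_next_token(text)}"
    return text

print(generate("Popular Science is"))  # e.g. "Popular Science is a science"
```

Notice that nothing in this loop checks whether the output is true; the model only cares about what plausibly comes next, which is exactly why convincing-sounding errors slip through.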

Unfortunately, this method of predicting plausible next words and sentences means conversational AIs can frequently be factually wrong, and unless you already know the correct information, you can easily be misled because they sound like they know what they’re talking about. PopSci is technically no longer a magazine, but Google demonstrated this even better with the rollout of Bard. (This is also why large language models can regurgitate conspiracy theories and other offensive content unless specifically trained not to.)

[Related: A simple guide to the expansive world of artificial intelligence]

One of the demonstration questions in Google’s announcement (which is still live as of the time of writing) was “What new discoveries from the James Webb Space Telescope can I tell my 9-year-old about?” In response, Bard offered three bullet points, including one that said that “JWST took the very first pictures of a planet outside of our solar system.”

While that sounds like the kind of thing you’d expect the largest space telescope ever built to do—and the JWST is indeed spotting exoplanets—it didn’t take the first picture of one. According to Reuters and NASA, that honor goes to the European Southern Observatory’s Very Large Telescope (VLT), which imaged an exoplanet in 2004. If this error had instead surfaced when someone asked Bard for advice, rather than as part of a very public announcement, there wouldn’t have been dozens of astronomy experts ready to step in and correct it.

Microsoft is taking a more upfront approach. The Verge found that Bing’s new FAQ states that “the AI can make mistakes,” and that “Bing will sometimes misrepresent the information it finds, and you may see responses that sound convincing but are incomplete, inaccurate, or inappropriate.” It goes on to call on users to exercise their own judgment and double-check the facts the AI offers up. (It also says that you can ask Bing “Where did you get that information?” to find out what sources it used to generate an answer.)

Still, this feels like a bit of a cop-out from Microsoft. Yes, people should be skeptical of information they read online, but the onus is also on Microsoft to make sure the tools it is providing to millions of users aren’t just making things up and presenting them as true. Search engines like Bing are one of the best tools people have for verifying facts—they shouldn’t add to the amount of misinformation out there.

And that onus may be legally enforceable. The EU’s Digital Services Act, which will come into force sometime in 2024, has provisions specifically intended to prevent the spread of misinformation. Failure to comply with the new law could result in penalties of up to 6 percent of a company’s annual turnover. Given the EU’s recent spate of large fines for US tech companies, and an existing provision that search engines must remove certain kinds of information that can be proved to be inaccurate, it seems plausible that the 27-country bloc will take a hard stance on AI-generated misinformation displayed prominently on Google or Bing. Tech companies are already being forced to take a tougher stance on other forms of generated misinformation, like deepfakes and fake social media accounts.

With these conversational AIs set to be widely and freely available soon, we are likely to see more discussion about how appropriate their use is—especially when they are presented as authoritative sources of information. In the meantime, let’s keep in mind that it’s far easier for these kinds of AIs to create grammatically coherent nonsense than it is for them to write an adequately fact-checked response to a query.

 
