Why conversational AIs can be factually inaccurate

Conversational AI-powered tools are going mainstream, which, to many disinformation researchers, is a major cause for concern. This week, Google announced Bard, its answer to OpenAI’s ChatGPT, and doubled down on rolling out AI-enhanced features to many of its core products at an event in Paris. Similarly, Microsoft announced that ChatGPT would soon be integrated with Bing, its much-maligned search engine. Over the coming months these conversational tools will become widely available, but some problems are already starting to appear.

Conversational AIs are built on a type of neural network called a “large language model” (LLM) and are incredibly good at generating text that is grammatically coherent and seems plausible and human-like. They can do this because they are trained on hundreds of gigabytes of human text, most of it scraped from the internet. To generate new text, the model predicts the next “token” (basically, a word or a fragment of a complex word) given a sequence of preceding tokens. Many researchers have compared this to the “fill in the blank” exercises we used to do in school.
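To make the “fill in the blank” idea concrete, here is a minimal sketch in Python. It is not how ChatGPT or Bard actually work: real LLMs use neural networks and condition on long stretches of context, while this toy only counts which word tends to follow which in a three-sentence corpus. The corpus and the function names (train_bigram_counts, predict_next, generate) are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy stand-in for training text (an assumption for illustration only).
CORPUS = (
    "popular science is a magazine . "
    "popular science is a magazine . "
    "popular music is a genre ."
)


def train_bigram_counts(text):
    """Count how often each token is followed by each other token."""
    tokens = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_tokens):
    """Extend `start` by repeatedly appending the likeliest next token."""
    output = [start]
    for _ in range(max_tokens):
        next_token = predict_next(counts, output[-1])
        if next_token is None:
            break
        output.append(next_token)
    return " ".join(output)


counts = train_bigram_counts(CORPUS)
print(generate(counts, "popular", 4))  # popular science is a magazine
print(generate(counts, "music", 3))    # music is a magazine
```

The second line of output is the interesting one. Nothing in the toy corpus says music is a magazine, but “magazine” is the word that most often follows “is a,” so the sketch produces a fluent sentence that happens to be false. The model optimizes for what is likely to come next, not for what is true.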

For example, I asked ChatGPT to write about PopSci and it started by stating “Popular Science is a science and technology magazine that was first published in 1872.” Here, it’s fairly clear that it is cribbing its information from places like our About page and our Wikipedia page, and calculating the likely follow-on words to a sentence that starts: “Popular Science is…” The paragraph continues in much the same vein, with each sentence being the kind of thing that follows along naturally in the sorts of content that ChatGPT is trained on.

Unfortunately, this method of predicting plausible next words and sentences means conversational AIs can frequently be factually wrong, and unless you already know the information, you can easily be misled because they sound like they know what they’re talking about. (This is also why large language models can regurgitate conspiracy theories and other offensive content unless specifically trained not to.) PopSci is technically no longer a magazine, but Google demonstrated the problem even better with the rollout of Bard.

One of the demonstration questions in Google’s announcement (which is still live as of the time of writing) was “What new discoveries from the James Webb Space Telescope can I tell my 9 year old about?” In response, Bard offered three bullet points including one that said that “JWST took the very first pictures of a planet outside of our solar system.”

While that sounds like the kind of thing you’d expect the largest space telescope ever built to do (and the JWST is indeed spotting exoplanets), it didn’t take the first picture of one. According to Reuters and NASA, that honor goes to the European Southern Observatory’s Very Large Telescope (VLT), which captured one in 2004. If this had instead happened as part of someone asking Bard for advice and not as part of a very public announcement, there wouldn’t have been dozens of astronomy experts ready to step in and correct it.

Microsoft is taking a more upfront approach. The Verge found that Bing’s new FAQ stated that “the AI can make mistakes,” and that “Bing will sometimes misrepresent the information it finds, and you may see responses that sound convincing but are incomplete, inaccurate, or inappropriate.” It goes on to call on users to exercise their own judgment and double-check the facts that the AI offers up. (It also says that you can ask Bing: “Where did you get that information?” to find out what sources it used to generate the answer.)

Still, this feels like a bit of a cop-out from Microsoft. Yes, people should be skeptical of information that they read online, but the onus is also on Microsoft to make sure the tools it is providing to millions of users aren’t just making stuff up and presenting it as if it’s true. Search engines like Bing are among the best tools people have for verifying facts; they shouldn’t add to the amount of misinformation out there.

And that onus may be legally enforceable. The EU’s Digital Services Act, which will come into force some time in 2024, has provisions to specifically prevent the spread of misinformation. Failure to comply with the new law could result in penalties of up to 6 percent of a company’s annual turnover. Given the EU’s recent spate of large fines for US tech companies and an existing provision that search engines must remove certain kinds of information that can be proved to be inaccurate, it seems plausible that the 27-country bloc may take a hard stance on AI-generated misinformation displayed prominently on Google or Bing. Tech companies are already being forced to take a tougher stance on other forms of generated misinformation, like deepfakes and fake social media accounts.

With these conversational AIs set to be widely and freely available soon, we are likely to see more discussion about how appropriate their use is, especially as they claim to be an authoritative source of information. In the meantime, let’s keep in mind that it’s far easier for these kinds of AIs to create grammatically coherent nonsense than it is for them to write an adequately fact-checked response to a query.