The logic behind AI chatbots like ChatGPT is surprisingly basic
Large language models, broken down.
CHATBOTS MIGHT APPEAR to be complex conversationalists that respond like real people. But if you take a closer look, they are essentially an advanced version of a program that finishes your sentences by predicting which words will come next. Bard, ChatGPT, and other AI technologies are built on large language models, a kind of algorithm trained on exercises similar to the Mad Libs-style questions found on elementary school quizzes. (An algorithm, simply put, is a set of human-written instructions that tells a computer how to solve a problem or make a calculation.) In this case, the algorithm uses your prompt and any sentences it has come across to auto-complete the answer.
Systems like ChatGPT can use only what they’ve gleaned from the web. “All it’s doing is taking the internet it has access to and then filling in what would come next,” says Rayid Ghani, a professor in the machine learning department at Carnegie Mellon University.
Let’s pretend you plugged this sentence into an AI chatbot: “The cat sat on the ___.” First, the language model would have to know that the missing word needs to be a noun to make grammatical sense. But it can’t be any noun—the cat can’t sit on the “democracy,” for one. So the algorithm scours texts written by humans to get a sense of what cats actually rest on and picks out the most probable answer. In this scenario, it might determine the cat sits on the “laptop” 10 percent of the time, on the “table” 20 percent of the time, and on the “chair” 70 percent of the time. The model would then go with the most likely answer: “chair.”
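To make the arithmetic concrete, here is a minimal Python sketch of that fill-in-the-blank step. It is an illustration only: the counts are invented to match the percentages above, the function names are hypothetical, and a real language model computes these probabilities with a neural network rather than a lookup table. Real chatbots also often pick among likely words with some randomness instead of always taking the top one.

```python
from collections import Counter

# Hypothetical tally of the words seen after "The cat sat on the" in
# human-written text. The numbers are made up to match the example.
next_word_counts = Counter({"chair": 70, "table": 20, "laptop": 10})


def next_word_probabilities(counts):
    """Turn raw counts into probabilities that add up to 1."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def most_likely_word(counts):
    """Pick the single most probable next word."""
    probs = next_word_probabilities(counts)
    return max(probs, key=probs.get)


probs = next_word_probabilities(next_word_counts)
best = most_likely_word(next_word_counts)

print(probs)
print("The cat sat on the", best)
```

Because "democracy" never shows up after the phrase in this toy tally, it gets no probability at all and can never be chosen.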
The system is able to use this prediction process to respond with a full sentence. If you ask a chatbot, “How are you?” it will generate “I’m” based on the “you” from the question and then “good” based on what most people on the web reply when asked how they are.
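The word-by-word process can be sketched in a few lines of Python. This is a toy under stated assumptions, not how ChatGPT is built: the three "web replies" are invented, the helper names are hypothetical, and the program looks only at the single previous word, while a real model weighs the whole conversation.

```python
from collections import Counter, defaultdict

END = "<end>"  # marker for "the reply stops here"

# A tiny, invented stand-in for what people on the web say.
exchanges = [
    "How are you? I'm good",
    "How are you? I'm good",
    "How are you? I'm tired",
]


def train_bigrams(lines):
    """Count which word follows which in the example text."""
    following = defaultdict(Counter)
    for line in lines:
        words = line.split() + [END]
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def reply(prompt, following, max_words=10):
    """Build a reply one word at a time, always taking the likeliest next word."""
    current = prompt.split()[-1]
    generated = []
    for _ in range(max_words):
        if current not in following:
            break
        nxt = following[current].most_common(1)[0][0]
        if nxt == END:
            break
        generated.append(nxt)
        current = nxt
    return " ".join(generated)


model = train_bigrams(exchanges)
answer = reply("How are you?", model)
print(answer)
```

Each new word becomes the starting point for the next prediction, which is how a single guess-the-next-word routine ends up producing a whole sentence.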
The way these programs process information and arrive at a decision loosely resembles how the human brain behaves. “As simple as this task [predicting the most likely response] is, it actually requires an incredibly sophisticated knowledge of both how language works and how the world works,” says Yoon Kim, a researcher at MIT’s Computer Science and Artificial Intelligence Laboratory. “You can think of [chatbots] as algorithms with little knobs on them. These knobs basically learn on data that you see out in the wild,” allowing the software to create “probabilities over the entire English vocab.”
The beauty of language models is that researchers don’t have to rigidly define any rules or grammar for them to follow. An AI chatbot implicitly learns how to form sentences that make sense by consuming tokens, which are common sequences of characters taken from the raw text of books, articles, and websites and grouped together. All it needs are the patterns and associations it finds among certain words or phrases.
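One way to picture how characters get grouped into tokens is the short Python sketch below. It repeatedly glues together the most frequent pair of neighboring pieces, an approach in the style of byte-pair encoding. That choice is an assumption made for illustration; the article does not say which method any particular chatbot uses, and the function names here are hypothetical.

```python
from collections import Counter


def most_common_pair(tokens):
    """Find the pair of neighboring pieces that appears most often."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(tokens, pair):
    """Glue every occurrence of the given pair into a single piece."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_tokens(text, merges):
    """Start from single characters and apply a fixed number of merges."""
    tokens = list(text)
    for _ in range(merges):
        pair = most_common_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
    return tokens


text = "the cat sat on the mat"
tokens = learn_tokens(text, merges=3)
print(tokens)
```

No one tells the program what a word or a syllable is. Frequent character groups simply emerge from the text, and the pieces still join back into the original sentence.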
But these tools often spit out answers that are imprecise or incorrect—and that’s partly because of how they were schooled. “Language models are trained on both fiction and nonfiction. They’re trained on every text that’s out on the internet,” says Kim. If MoonPie tweets that its cookies really come from the moon, ChatGPT might incorporate that in a write-up on the product. And if Bard concludes that a cat sat on the democracy after scanning this article, well, you might have to get more used to the idea.