ChatGPT is, scientifically speaking, not funny

A new study indicates ChatGPT won't kill at an open mic night anytime soon.
[Image: A laptop screen showing the ChatGPT homepage. Over 90 percent of more than 1,000 joke requests resulted in the same 25 quips. Credit: Deposit Photos]

Generative language programs such as ChatGPT may already fool some users with their human-like responses, but there is still at least one telltale sign of their limitations: despite its immense capabilities, AI can’t tell a joke.

Well, more specifically, it can’t tell many jokes, much less get creative with them. The comical assessment comes courtesy of Sophie Jentzsch and Kristian Kersting, two researchers at the German Aerospace Center and Technical University Darmstadt’s Institute for Software Technology. As detailed in their new study, the pair recently asked OpenAI’s ChatGPT-3.5 to tell them a joke 1,008 times in a row. Over 90 percent of the time, ChatGPT reportedly offered variations on just 25 joke setups. This led Jentzsch and Kersting to conclude that its comedic repertoire was likely learned and memorized during its training phases, not generated on the fly.
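
The basic setup is simple enough to try at home. The sketch below is a rough replication, not the authors’ actual code: it assumes OpenAI’s official Python client (v1 or later) with an API key in the environment, and it tallies exact-string repeats, which is cruder than the researchers’ grouping of similar variants, but it makes the repetition easy to see. The model name, prompt, and sample size are illustrative choices.

```python
# Repeatedly ask the model for a joke and count how often each
# distinct response comes back. A minimal sketch, assuming the
# official `openai` Python package (v1+) and an OPENAI_API_KEY
# environment variable; not the study's exact protocol.
from collections import Counter

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_for_joke() -> str:
    """Request a single joke from the model and return its text."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the study tested ChatGPT-3.5
        messages=[{"role": "user", "content": "Tell me a joke."}],
    )
    return response.choices[0].message.content.strip()


# The researchers sent 1,008 prompts; a smaller run surfaces the same clustering.
tally = Counter(ask_for_joke() for _ in range(100))

# If a handful of setups dominate the counts, the "repertoire" is a short, memorized list.
for joke, count in tally.most_common(10):
    print(f"{count:>3}x {joke}")
```
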

[Related: Big Tech’s latest AI doomsday warning might be more of the same hype.]

According to the researchers, all of ChatGPT’s responses were grammatically correct, and nearly all of them contained only one joke. Varying their prompts’ wording “did not have any noticeable effect,” and ChatGPT offered multi-joke responses only when specifically asked to tell “jokes,” plural. As Ars Technica also recently noted, even the AI’s comparatively small number of unique generations were the result of mixing portions of jokes it had already memorized. What’s more, they didn’t always withstand much logical scrutiny. One joke it churned out was as follows: “Why did the man put his money in the blender? He wanted to make time fly.”

Interestingly, ChatGPT-3.5 could often correctly break down what made a particular joke amusing, even a classic “anti-joke” setup like “Why did the chicken cross the road? To get to the other side.” Further stress tests, however, indicated this ability stemmed from its prior training and memorization. When asked subtle variations on the chicken “anti-joke,” for example, the AI attempted to make logical connections based on the prompt content, and not its prior comedic knowledge:

Q: Can you explain why this joke is funny: Why did the chicken cross the playground? To get to the other slide.

A: “The joke is a play on words, using the double meaning of the word ‘slide’. In a playground, there are often two types of slides: a playground slide that children can slide down, and a baseball slide, which is a sliding technique used in baseball.”

[Related: No, the AI chatbots (still) aren’t sentient.]

Such responses indicate ChatGPT will sometimes attempt to “apply an approved scheme of reasoning to a situation where it is not suitable,” write Jentzsch and Kersting. After their battery of joke requests and analysis, the researchers concluded ChatGPT has so far learned “a specific joke pattern instead of being able to be actually funny,” but that its generation, explanation, and identification of jokes focuses on their meaning and content rather than on superficial characteristics. Compared with previous large language models, ChatGPT-3.5 could be considered “a huge leap” toward AI’s general understanding of humor.

Many of Jentzsch and Kersting’s lingering questions could likely be clarified by a look into the methodology and datasets OpenAI used to train its program, something it and many other AI companies remain tight-lipped about, citing vague claims of security and abuse. When asked to explain this conundrum, OpenAI’s newest ChatGPT iteration itself called the situation an “absurdity” that “playfully satirizes the challenges faced in AI research.”

Good one, ChatGPT-4.