Orangutans’ distinct yells decoded with help from AI

These long and booming vocalizations are very individualized.

While scientists may not have decoded the ABCs of sperm whales or what elephant gestures mean just yet, research into animal communication is very active. Now, it’s the orangutan’s turn. Through a combination of traditional recording and analysis with artificial intelligence (AI), scientists pinpointed three distinct types of pulses in the primate’s long call vocalizations. The vocalizations are likely even more complex than that, according to the study, published May 14 in the journal PeerJ Life & Environment.

Orangutans are great apes found in Southeast Asia. They are the largest tree-dwelling mammals and primarily eat fruit. They are also known for complex social behavior and communication. However, as with many species, understanding the nuances of their vocal repertoire has been a challenge.

“Our research aimed to unravel the complexities of orangutan long calls, which play a crucial role in their communication across vast distances in the dense rainforests of Indonesia,” study co-author and Cornell University primatologist Wendy Erb said in a statement. “Over the course of three years, we accumulated hundreds of long call recordings, revealing a fascinating array of vocal diversity.”

Long calls are loud, booming vocalizations produced by male orangutans. They vary widely from one individual to another and are used to communicate over long distances among widely spaced animals. Earlier studies compiled a dictionary of pulse types. Working from that dictionary, Erb and her team set out to determine how many pulse types they could describe, which features distinguish them, and how graded they are, meaning how much the types blend into one another rather than forming discrete categories.

Using traditionally recorded video and audio sources, Erb and the team applied machine learning to the long calls of 13 individual orangutans, estimating the number of pulse types present in the vocalizations and evaluating how graded those types are.

“Through a combination of supervised and unsupervised analytical methods, we identified three distinct pulse types that were well differentiated by both humans and machines,” said Erb. “While our study represents a significant step forward in understanding orangutan communication, there is still much to uncover. Orangutans may possess a far greater repertoire of sound types than we have described, highlighting the complexity of their vocal system.”

According to the team, this work underscores how complex vocalizations in the animal kingdom can be, noting that the pulses are not just random noises. It also demonstrates the power of collaborating across scientific disciplines and combining traditional research methods with advances in AI.

“We hope that our findings inspire further exploration of vocal complexity across different species and pave the way for future discoveries in animal communication,” said Erb.