In the Pixar movie Up, a cartoon dog called Dug sports a magical collar of sorts that translates his barks and whines into fluent human speech. In the real world, very well-trained dogs can be taught to press buttons that play recorded human speech for simple commands like “outside,” “walk,” and “play.” Humans have always been fascinated by the possibility of communicating with the animals they share the world with, and recently, machine learning, with its ever more advanced capabilities for parsing human speech, has presented itself as a hopeful route to animal translation.
An article in the New York Times this week documented major efforts from five groups of researchers that looked at using machine-learning algorithms to analyze the calls of rodents, lemurs, whales, chickens, pigs, bats, cats, and more.
Typically, artificial intelligence systems learn by training on labeled data (which can be drawn from the internet or from resources like e-books). For human language models, this usually involves giving computers a sentence, blocking out certain words, and asking the program to fill in the blanks. Newer, more creative strategies aim to match up speech to brain activity.
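The fill-in-the-blank setup can be sketched in a few lines of Python. This is only a toy illustration of how one training example might be built, not how any particular language model is implemented; the function name `make_masked_example` and the `[MASK]` placeholder are assumptions made for this sketch.

```python
import random


def make_masked_example(sentence, mask_token="[MASK]", rng=None):
    """Hide one word of a sentence and return (masked_sentence, hidden_word).

    Hypothetical helper for illustration: a real system would repeat this
    over enormous amounts of text and train a model to predict the hidden word.
    """
    rng = rng or random.Random()
    words = sentence.split()
    idx = rng.randrange(len(words))  # pick one word position to hide
    hidden = words[idx]
    masked_words = words[:idx] + [mask_token] + words[idx + 1:]
    return " ".join(masked_words), hidden


sentence = "the dog wants to go outside"
masked, answer = make_masked_example(sentence, rng=random.Random(0))
print(masked, "->", answer)
```

The program never needs a human to label the sentence: the hidden word itself serves as the correct answer the model is asked to recover.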
But analyzing animal language is a different beast from analyzing human language. Computer scientists have to instruct software programs on what to look for and how to organize the data. This process, for the most part, depends not only on accruing a good number of vocal recordings, but also on matching those recordings with the visible social behaviors of the animals. A group studying Egyptian fruit bats, for example, used video cameras to record the bats themselves to provide context for the calls. And the group that’s studying whales plans to use video, audio, and tags that can record animal movements to decipher the syntax, semantics, and ultimately the meaning behind what whales are communicating and why. Several groups have also proposed testing their animal dictionaries by playing recordings back to animals and seeing how they react.
Making a Google Translate for animals is an aspirational project that’s been in the works for the better part of the last decade. Machine learning, too, has come far in detecting the presence of animals and even, in some cases, accurately identifying animals by call. (Cornell’s Merlin app is shockingly accurate at matching bird species to their calls.) And although this type of software has shown some success in identifying the basic vocabulary of certain animals from the characteristics of their vocalizations (e.g., frequency or loudness), as well as in attributing calls to individuals, it’s still a far cry from understanding all the intricate nuances of what animal language might encapsulate.
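To make "frequency or loudness" concrete, here is a minimal sketch of how those two characteristics might be measured from an audio clip. It assumes a synthetic 440 Hz tone as a stand-in for a real recording, and the function names `rms_loudness` and `dominant_frequency` are hypothetical; real analysis tools would likely use an optimized FFT library rather than this slow, plain-Python Fourier transform.

```python
import cmath
import math


def rms_loudness(samples):
    """Root-mean-square amplitude, a simple proxy for loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest component, via a naive DFT."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):  # skip bin 0 (the constant offset)
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


# Synthetic stand-in for a recorded call: a pure 440 Hz tone, amplitude 0.5.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(400)]

print("loudness (RMS):", round(rms_loudness(tone), 4))
print("dominant frequency (Hz):", dominant_frequency(tone, sample_rate))
```

Numbers like these are the kind of raw measurements a classifier could compare across calls, though on their own they say nothing about what a call means.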
Many skeptics of this approach point to two shortcomings: current AI language models do not truly understand the relationships between words and the objects they may refer to in the real world, and scientists’ own understanding of animal societies remains limited. Artificial-intelligence language models for humans rely on a computer mapping out the relationship between words and the contexts they could appear in (where they might go in a sentence, and what they might refer to). But these models have their own flaws, and can sometimes be a black box: researchers know what goes in and what comes out, but don’t quite understand how the algorithm arrives at its conclusion.
Researchers are also taking into account that animal communication might not work at all like human communication, and that the tendency to anthropomorphize it could skew the results. Animal language might have unique elements that stem from physiological and behavioral differences.
Because researchers can’t always know the data parameters ahead of time, there are proposals to use self-supervised learning algorithms to analyze audio data, according to a report earlier this year in the Wall Street Journal. In this approach, the computer tells the researchers what patterns it’s seeing in the data, patterns that might reveal connections that human observers miss. Ultimately, how far humans go down the rabbit hole of trying to understand animal communications depends on human goals for this type of research, and for that purpose it may be enough to get a handle on the basics. For example, a translator that can reliably interpret whether the animals we’re often in close contact with are happy, sad, or in danger could be both useful and more practical to create.