Languages are hard; it takes a trained ear to tease out not just the verbiage but the idiomatic expressions, the tone, the regional trends and ever-shifting insults that make a person truly fluent. This is one reason why even the best apps and Google Translate just can’t hack it. Similarly, it takes a trained linguist to know how these words, all sprouted from one root, still grow into endless forms all signifying the same thing. Can a cunning computer solve this problem as well as a smart linguist can? The answer, in this case, may be yes.
A new machine-learning algorithm can use sound rules to suss out the most likely phonetic changes in a shifting language. All words shift over time and place, but certain vowels and pronunciations are going to shift more than others--you say tomato, I say tomahtoe, Canadians say “aboot,” and so on. Alexandre Bouchard-Côté and colleagues at the University of British Columbia in Vancouver developed a system that can suggest how words may have sounded in the past, and which sounds were the most likely to shift. Then they compared the results with analysis by human experts, and found that 85 percent of the computer’s suggestions were within a single character of the correct words.
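One way to read "within a single character" is as an edit distance of at most one between the computer's suggestion and the linguists' reconstruction. The sketch below assumes that reading; it is not the researchers' evaluation code, and the function names and the sample pairs are invented for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep a match
            ))
        prev = curr
    return prev[-1]


def share_within_one(pairs):
    """Fraction of (suggested, reference) pairs at distance 0 or 1."""
    close = sum(1 for suggested, reference in pairs
                if edit_distance(suggested, reference) <= 1)
    return close / len(pairs)


# Hypothetical suggestions scored against the root form quoted in the article.
sample = [
    ("bituquen", "bituquen"),  # exact match
    ("bituqen", "bituquen"),   # one character missing
    ("biten", "bituquen"),     # three characters missing
    ("kalokalo", "bituquen"),  # far off
]
print(share_within_one(sample))
```

Here two of the four invented pairs fall within one character, so the share is 0.5; the study's 85 percent figure would be the same kind of ratio taken over its real word list.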
They looked at 637 distinct Austronesian languages, which span the Pacific from the Philippines to Hawaii. They would start, for example, with the word for “star.” In Fijian, the word is kalokalo. In Pazeh, a Taiwanese aboriginal language, it’s mintol. People who speak the Bornean tongue of Melanau call it biten, and those who speak the Filipino dialect called Inabaknon know it as bitu’on. The root word, from which all of these words evolved, is bituquen. The computer deduced that correctly.
The catch is that there’s a lot of front-end work before the computer can do its analysis. Linguists have to input a list of words in a given language, plus their meanings, and generate a sort of “tree of life” for language--a phylogenetic map showing how each word is related to the others. (It resembles in both form and function the phylogenetic map used by botanists and biologists to show how life is related.) But when it gets to work, the algorithm is efficient. It can recognize cognates, which are words with the same root, within languages, and then figure out the probable root.
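The front-end work described here amounts to two inputs: word lists keyed by meaning, and a tree relating the languages. The sketch below shows one plausible way to hold them. The `Node` class, the `forms_under` helper, and above all the shape of the tree are assumptions made for illustration; the grouping is a placeholder, not the phylogeny the researchers used, and the code only gathers the word forms under a branch rather than reconstructing anything.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A branch point or a language in a toy language tree."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Names of the languages at the tips below this node."""
        if not self.children:
            return [self.name]
        names: List[str] = []
        for child in self.children:
            names.extend(child.leaves())
        return names


# Word lists: language -> meaning -> word (forms as quoted in the article).
LEXICON: Dict[str, Dict[str, str]] = {
    "Fijian": {"star": "kalokalo"},
    "Pazeh": {"star": "mintol"},
    "Melanau": {"star": "biten"},
    "Inabaknon": {"star": "bitu'on"},
}

# Placeholder grouping, invented for this example only.
TREE = Node("root", [
    Node("branch 1", [Node("Pazeh")]),
    Node("branch 2", [Node("Melanau"), Node("Inabaknon"), Node("Fijian")]),
])


def forms_under(node: Node, lexicon: Dict[str, Dict[str, str]],
                meaning: str) -> Dict[str, str]:
    """Collect the word for `meaning` in every language below `node`."""
    return {lang: lexicon[lang][meaning]
            for lang in node.leaves()
            if meaning in lexicon.get(lang, {})}


print(forms_under(TREE.children[1], LEXICON, "star"))
```

A reconstruction method would take sets like these, branch by branch, and propose the most probable ancestral form at each branch point.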
The researchers acknowledge there’s still more advanced work to be done, but they hope it will be a boon to historical linguists the way genetic information has changed biology. Studying morphological change--looking at a thing and seeing how it changes or compares to other things--is a much simpler approach than looking at the genes. This algorithm can work in a similar fashion to genetic analysis, computationally studying the roots of words and languages rather than using a specially trained ear. The paper appears this week in the Proceedings of the National Academy of Sciences.
While the oohs and aahs are often said when learning about ancient cultures and their written history, it is so amusing when science stumbles across references to anything divine in nature (knowing the ancient culture/society did not have a scientific/technical vocabulary to describe what they saw or experienced) and treats them as mere fables or fantasy. Oh yes, science will give wonderful credit in stating how amazingly intelligent that culture was, but then if what was noted in the past was beings coming down from the sky (GODS), they quickly fall back to calling it a myth, fantasy, fable, nonsense.
I am glad though this new technology is coming about and we can better understand ancient cultures and languages.
Let the truth be revealed!
I find it very curious that so many users on this site are believers of ancient aliens.
Daniken's work has been repeatedly debunked; he himself even says he borrows stories from other fictional works.
Interesting article, and an interesting application of technology.
Wanamingo,
Yeah sure, scientists can conceive of intelligent life out in space now, just do not ask if they existed in the past, even though the cosmos has been around 13.7 billion years or so. That would just be crazy talk, lol.
Do any of the first three comments have any relevance to the article? Save it for the chat rooms please.
This is a great tool and I wonder if they have done any real model verification by going from contemporaneous English dialects and seeing how close they can get to King's English?
Nice article, Ms. Boyle.
Perhaps this can be used to predict the next "swings" of language as modern slang evolves into newer forms. In fact, I wouldn't be surprised if this influences the next shift in language.
So, if English and Spanish are both Romance languages, it speaks Spanglish like we do down in Texas?