Left: The Midomi team—Majid Emami, James Hom, Amir Arbabi, Michal Grabowski and Kamyar Mohajer—harmonize into their site, which identifies sung tunes. [Not pictured: CEO Keyvan Mohajer.]
The Music Mind Readers
Their site puts a name to that song stuck in your head
Five years ago, Michal Grabowski was strumming a guitar in a dorm room while a pal, James Hom, kicked around ideas for a Stanford University business-plan contest. As Grabowski played a few bars of a song he couldn’t quite recall, his absent-minded chords struck a chord. What if you could look up a song simply by humming it? The pair submitted the idea and nearly won, but there was one hitch: They weren’t sure if they could make it work.
So the two undergrads turned to Majid Emami and Keyvan Mohajer, electrical-engineering Ph.D. students with audio expertise. For the next two years, the group toiled to create its name-that-tune software. During three grueling weeks in 2004, Emami and Mohajer went outside only once, to buy burritos on Christmas Day. But by New Year’s, they were toasting a novel solution that begat a company, Melodis, and the site midomi.com.
To find the song you’re singing to your computer, the site compares the sound of your voice directly with the sounds of popular songs as hummed, sung, or even whistled by other people. It’s an approach that’s simultaneously simpler and more creative than earlier music-search efforts. Many researchers had tried to convert humming to sheet music, a computer-friendly set of symbols. But that translation proved difficult unless people sang in unnaturally separated syllables. Even then, searches often failed because most systems had only small collections of written music in the public domain.
Instead, the team created software that analyzed recordings by pitch, timing, lyrics and other features, so people didn’t have to sing unnaturally. Then, at first by enticing friends into online karaoke and offering Amazon gift cards, they persuaded users to contribute 200,000 tunes. It’s a solution based not just on smart algorithms but on our collective urge to sing in the shower.
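The article does not describe Midomi's actual algorithm, so the following is only a minimal sketch of the voice-to-voice idea under stated assumptions: each recording has already been reduced to a pitch contour (a list of semitone values), key differences are removed by subtracting the mean pitch, and tempo differences are absorbed with dynamic time warping, a standard technique swapped in here for illustration. The song data and the names `normalize`, `dtw_distance` and `best_match` are hypothetical.

```python
from statistics import mean


def normalize(contour):
    """Shift a pitch contour (in semitones) to zero mean, removing the singer's key."""
    m = mean(contour)
    return [p - m for p in contour]


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences; tolerates tempo differences."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # cost row for the empty prefix of a
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: stretch b, step both, stretch a.
            curr.append(cost + min(prev[j], prev[j - 1], curr[j - 1]))
        prev = curr
    return prev[-1]


def best_match(query, recordings):
    """Return (title, score) of the user recording closest to the query.

    recordings maps a song title to a list of contours sung by other users
    (an assumed data layout, not Midomi's).
    """
    q = normalize(query)
    best = None
    for title, contours in recordings.items():
        for contour in contours:
            # Divide by combined length so long recordings are not penalized.
            score = dtw_distance(q, normalize(contour)) / (len(q) + len(contour))
            if best is None or score < best[1]:
                best = (title, score)
    return best


# Hypothetical database: one user-sung contour per song, as MIDI note numbers.
recordings = {
    "Twinkle Twinkle Little Star": [[60, 60, 67, 67, 69, 69, 67]],
    "Ode to Joy": [[64, 64, 65, 67, 67, 65, 64, 62]],
}

# The same opening phrase hummed in a lower key, with one note held longer.
query = [55, 55, 55, 62, 62, 64, 64, 62]
title, score = best_match(query, recordings)
print(title)
```

Because the query is compared with other people's renditions rather than with a written score, a rendition that shares a common mistake can still be the nearest neighbor, which is the property the article attributes to the voice-to-voice approach.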
Comparing voices to voices works well because people usually remember the same snippet of a song’s melody. “What is melody?” asks Emami, now Melodis’s vice president of engineering. “It’s perception. Sometimes the human brain even fills in parts of the song that aren’t there.” Midomi will still recognize a misremembered tune if others have hummed the same wrong notes, a match that a sheet-music search would probably miss. Today midomi.com is a kind of social-networking site for amateur pop stars, as well as the world’s most extensive and accurate music-search service, with a 95 percent success rate.
Its technique could also be applied elsewhere: to a virtual American Idol that would mathematically rate voices, for example, or to the translation of tricky tonal languages such as Cantonese. (Because it senses both words and pitch, it could tell apart phrases whose meanings differ depending on inflection.) Ultimately, its founders want our voices to replace keyboards and mice. “Singing is really fun,” says CEO Mohajer, but to him, it’s just a step toward starting the conversation with our machines.—MATHEW HONAN