Before my arrival, Owen had cranked up the machine to produce a neat pile of spectrograms from a 1998 ABC News interview with bin Laden, one of the few samples of the al Qaeda leader's voice that Owen considers 100 percent verified. The machine's stylus translated the acoustic energy of bin Laden's voice into a voiceprint, etching data across a paper strip attached to the machine's spinning drum.
Looking at the voiceprints, I can easily make out the scratchy, bar-shaped formants, or voice frequencies, produced by each syllabic utterance. The smudges resemble so many boxy notes stacked on an eight-line measure. The human voice doesn't emit single notes, Owen explains, but chords, or harmonics.
Owen hands me a spectrogram of the November al Jazeera broadcast. A storm of black lines covers the paper strip from top to bottom, end to end. With Owen's coaching, I imagine I can see the underlying formant bars, all but obscured behind a dark veil of background noise and broadcast carrier signals. A biometrics program could never sort through the noise, Owen insists. "They're designed to work with perfect samples." Cleaning up the tape won't work either, he says. "That's fine if all you want to do is hear what he's saying more clearly. But cleaning up background noise removes the high and low frequencies I need to make my identification." A biometric system demands the same frequencies, he says, and while he believes the NSA has obtained samples of bin Laden's voice that he is not privy to, he doesn't believe the agency has made biometric breakthroughs on the analysis side.
"I know for a fact they have things the FBI and the CIA don't have. But their technology is mostly devoted to listening," Owen says.
How certain can Owen's methods be with a short, poor-quality recording? Not only was the tape dirty, but there were only a half-dozen words in common between the November tape and the ABC interview. (The standards of the American Board of Recorded Evidence demand no fewer than 20 identical words -- preferably spoken in the same order -- to verify a positive voice identification.)
Owen notes that examining a spectrogram is only half of his job. His is the art of listening for the multitude of quirky mannerisms and pronunciation foibles peculiar to each voice. A trained ear can detect the subtle whistle caused by a missing tooth, a person's tendency to swallow in the middle of a sentence, even the way someone sets his or her jaw when speaking.
Owen plays me what he calls a short-term memory tape, a crucial tool in aural, or by-ear, voice identifications. The spliced tape toggles between 2.5-second segments of bin Laden's ABC interview and the scratchy al Jazeera broadcast; what Owen listens for -- what voice identification is based on -- are peculiarities in the way a voice expresses the formant structure, especially the vowels. "Same guy," says Owen. He insists bin Laden's voice is plenty peculiar but refuses to elaborate on those vocal quirks and risk giving impostors a road map.
To my untrained ear, it could be Darth Vader behind the static. All this seems somewhat ineffable -- a mixture of art and science understood by only eight sanctioned experts in the country. This is the sort of gray area that tends to make legal observers worry about the state of forensic science.
"Too often, I've seen cases of people wrongly accused of making threatening calls," admits retired Michigan detective Lonnie Smrkovski, the acknowledged grandfather of forensic audio analysis. "I think at some point in time, we have to find a way to fully automate voice identification."