Those music files -- be they MP3, AAC or WMA -- that you listen to on your portable music players are pretty crap when it comes to accurate sound reproduction from the original recording. But just how crap they really are wasn't known until now.
Audio data compression, at its heart, is pretty simple. A piece of software compresses a piece of digital audio data by chopping out redundancy and approximating the audio signal over a discrete period of time. The larger the sample time-period, the less precise the approximation. This is why an MP3 with a high sampling rate (short sample times) is of higher quality than an MP3 with a low sampling rate.
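To give a rough picture of what that means in practice, here is a minimal Python sketch of frame-based lossy coding: split the signal into short frames, transform each frame to the frequency domain, coarsely quantize the coefficients and reconstruct. This is not the MP3 codec (which adds a filter bank, a psychoacoustic model and entropy coding); the frame length and quantization step are invented for the example.

    import numpy as np

    def toy_lossy_codec(signal, frame_len=1024, quant_step=0.05):
        """Quantize each frame's spectrum to illustrate trading precision for size."""
        out = []
        for start in range(0, len(signal), frame_len):
            frame = signal[start:start + frame_len]
            spectrum = np.fft.rfft(frame)                             # time -> frequency
            quantized = np.round(spectrum / quant_step) * quant_step  # throw away fine detail
            out.append(np.fft.irfft(quantized, n=len(frame)))         # frequency -> time
        return np.concatenate(out)

    # Example: a one-second 440 Hz tone sampled at 44.1 kHz.
    sr = 44100
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)
    approx = toy_lossy_codec(tone)
    print("max reconstruction error:", np.max(np.abs(tone - approx)))

The coarser the quantization (or the cruder the per-frame approximation), the smaller the encoded data and the larger the error -- the trade-off the article is describing.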
To test whether the human ear can discern differences finer than certain theoretical limits assumed by audio compression algorithms, physicists Jacob N. Oppenheim and Marcelo O. Magnasco at Rockefeller University in New York City played tones to test subjects. The researchers wanted to see whether the subjects could differentiate the timing of the tones and any frequency differences between them. The premise of the research is that almost all audio compression algorithms, such as the MP3 codec, approximate the signal using a linear model, developed long before scientists understood the finer details of how the human auditory system works. This linear model holds that the timing of a sound and the frequency of that sound have specific cut-off limits: at some point two tones are so close together in frequency or in time that a person should not be able to hear a difference. Further, time and frequency are related such that higher precision along one axis (say, time) means a corresponding decrease in precision along the other. If human hearing follows these linear rules, we shouldn't hear any degradation in quality (given a high enough bit rate -- we're not talking some horrible 192kbps rip) between a high-quality file and the original recording.
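For context, the cut-off the linear model assumes is usually expressed as the time-frequency (Fourier, or Gabor) uncertainty relation; the article doesn't spell out the formula, so take this as background rather than something quoted from the paper:

    \Delta t \, \Delta f \ge \frac{1}{4\pi}

where \Delta t and \Delta f are the uncertainties in a sound's localization in time and in frequency. Squeeze one below the bound and the other must grow, which is exactly the trade-off the study set out to test against real listeners.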
The experiment was broken up into five tasks in which subjects listened to a reference tone paired with a tone that varied from the reference (a rough sketch of what such a tone pair might look like follows the list). The tasks tested the following:
1) frequency differences only
2) timing differences only
3) frequency differences with a distracting note
4) timing differences with a distracting note
5) simultaneously determining both frequency and timing differences
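For readers who prefer to see signals rather than read about them, here is a small Python sketch of the kind of paired-tone stimulus the tasks describe: a reference tone burst followed by a test burst whose onset and frequency are nudged by small amounts the listener has to detect. The pulse shape, durations and offsets are invented for illustration; they are not the stimuli used in the paper.

    import numpy as np

    SR = 44100  # sample rate in Hz (assumed for the sketch)

    def tone_burst(freq_hz, center_s, width_s, length_s=1.0):
        """A short tone with a Gaussian envelope centered at center_s."""
        t = np.arange(int(length_s * SR)) / SR
        envelope = np.exp(-0.5 * ((t - center_s) / width_s) ** 2)
        return envelope * np.sin(2 * np.pi * freq_hz * t)

    # Reference: a 440 Hz burst centered at 0.5 s.
    reference = tone_burst(440.0, 0.500, 0.01)

    # Test: nudged by 3 ms in time and 2 Hz in frequency -- made-up offsets
    # on the order of the differences discussed in the article.
    test = tone_burst(442.0, 0.503, 0.01)

    # Play one after the other; the subject reports which is later or higher.
    stimulus = np.concatenate([reference, test])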
I don't think it will come as a surprise to a lot of audiophiles, but human hearing most certainly does not have a linear response curve. In fact, during Task 5 -- considered the most complex of the tasks -- many of the test subjects could hear differences between tones with up to 13 times more acuity than the linear model predicts. Those most skilled at differentiating time and frequency differences between tones were musicians. One, an electronic musician, could differentiate between tones sounded about three milliseconds apart -- remarkable because a single period of the tone lasts only 2.27 milliseconds. The same subject didn't perform as well as others in frequency differentiation. Another, a professional musician, was exceptional at frequency differentiation and good at temporal differentiation of the tones.
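A quick back-of-the-envelope check on that figure (my arithmetic, not the article's): a period of 2.27 milliseconds corresponds to a tone of roughly 440 Hz, the standard orchestral tuning pitch,

    f = \frac{1}{T} = \frac{1}{2.27\ \text{ms}} \approx 440\ \text{Hz},

so detecting a 3 ms timing difference means resolving an interval barely longer than a single cycle of the tone itself.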
Even more interesting, the researchers found that composers and conductors had the best overall performance on Task 5, due to the necessity of being able to discern the frequency and timing of many simultaneous notes in an entire symphony orchestra. Finally, the researchers found that temporal acuity -- discerning time differences between notes -- was much better developed than frequency acuity in most of the test subjects.
So, what does this all mean? The authors plainly state that audio engineers should rethink how they approach audio compression -- and possibly jettison the linear models they use to achieve that compression altogether. They also suggest that revisiting audio processing algorithms will improve speech recognition software and could have applications in sonar research or radio astronomy. That's awesome and all. But I can't say I look forward to re-ripping my entire music collection once those codecs become available.
Rather than doing this detailed research, wouldn't it be easier to switch back to vinyl and enjoy the soothing analog sound?
Yeah, we could switch to vinyl and enjoy the soothing analog pops, clicks, scratches and degradation of sound quality over time as the needle wore down the vinyl surface with each playing of a record. The only pristine record was one that had never been played or exposed to the elements. D/A converters and digital compression algorithms have their limitations, but those limitations are being overcome. This article illustrates yet another way the sound quality of compressed audio might be improved.
The first time I heard a CD, I couldn't believe the significant improvement in dynamic range, stereo separation, distortion, and most of all, the absence of the omnipresent low-level static that accompanied every vinyl recording I'd ever heard. It wasn't just a little better than vinyl, it was massively, mindblowingly, life-alteringly better. Vinyl is for nostalgia-lovers or collectors, not for people who appreciate quality sound and convenience.
Scientists studying acoustics and human hearing have had a solid understanding of this stuff since the 1950s.
I would have loved this entire article if it were not for the unnecessary use of "crap." Since when was "crap" a scientific term? Let's keep it professional here.
Laurenra7 - I understand the economic side of preferring CDs or other digital media, but the sound difference you heard came from either bad record care or a difference in speaker quality. Because a record is pressed from an exact transduction of the sound, it holds all of the layers present in the original recording, whereas digitization leaves the track with a hollower sound. Any audiophile could easily distinguish between vinyl and digital if they were played one after the other on the same speaker set.
There is no connection between this article and how good (or bad) audio compression is.
The published work appears to study human hearing to find which features of sound it is more or less sensitive to. In that sense, it can inform the further development of compression techniques. The creators of mp3 did such research as well back in the day. And nowhere does the research presented in this article say that current compression algorithms produce bad-quality audio - the researchers didn't test whether subjects heard any difference between an mp3 and the original; they tested human hearing in general.
Current compression techniques may not be the most efficient, and this research might help to improve that, but current encoding schemes are already designed to exceed the limits of human hearing, given that the target bit rate is high enough.
I have read several unbiased audiophile magazine tests that concluded that, for mp3, at 256kbps it was very hard for audiophiles to detect the difference between lossless and mp3, and at 320kbps it was impossible. No one should expect high quality from 128-160kbps mp3.
"The larger the sample time-period, the less precise the approximation. This is why an MP3 with a high sampling rate (short sample times) is of higher quality than an MP3 with a low sampling rate."
While the first sentence is true, it has nothing to do with mp3; perhaps the author is confusing sampling rate with bit rate? Sampling rate (e.g. 44,100 or 48,000 samples per second) and resolution (e.g. 16 bits per sample) are universal to all digital audio and to signal processing in general.
The bit rate of an mp3 defines the target amount of data used to encode one second's worth of samples. A low bit rate means a small file but heavy loss; a high enough bit rate means the only loss in mp3 is fractional rounding, if that.
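To put rough numbers on the distinction (illustrative figures of my own, not from the article): the sampling rate and bit depth fix the size of the uncompressed signal, while the mp3 bit rate caps how many bits the encoder may spend per second, and therefore the file size.

    # Back-of-the-envelope sizes for a four-minute stereo track (illustrative values).
    sample_rate = 44100      # samples per second
    bit_depth = 16           # bits per sample
    channels = 2             # stereo
    duration_s = 4 * 60      # four minutes

    uncompressed_bits = sample_rate * bit_depth * channels * duration_s
    print("Uncompressed (CD-quality): %.1f MB" % (uncompressed_bits / 8 / 1e6))  # ~42.3 MB

    mp3_bit_rate = 256_000   # 256 kbps: bits the encoder may spend per second of audio
    mp3_bits = mp3_bit_rate * duration_s
    print("256 kbps mp3: %.1f MB" % (mp3_bits / 8 / 1e6))  # ~7.7 MB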
I am an electronics engineer and I have personal experience developing JPEG decoders that use similar techniques to mp3.
It's clear how biased this article is from the start:
"...that you listen to on your portable music players are pretty crap when it comes to accurate sound reproduction from the original recording. But just how crap they really are wasn't known until now."
In almost all cases, people listening to FLAC or other audio sources suffer far, far worse audio quality loss from poor mp3 decoders, D/A converters, amplifiers and speakers/headphones than from the difference between a high-bitrate (256-320kbps) mp3 and the original sound. To complicate matters, there are also differences in the quality of the compressed audio produced by different encoders - two mp3s made from the same source with the same bit rate and settings but different encoder software will differ in quality.
"But I can't say I look forward to re-ripping my entire music collection once those codecs become available. "
If you have high-quality mp3s, then you won't need to, unless you want even smaller files.
And I could go on...
Sorry for the long post.
[See first picture] Who listens to 128kbps anyway? Bump it to 192 or higher and you're good to go.
In space, no one can hear a tree fall in the forest.
Dear Ms. Harbison,
It was interesting reading indeed, thank you. But could you give us details of the original work on which this article was based?
The author of this article does not follow up...