

Problems with your test methods ...
Authored by: BadgerUMD on Oct 21, '05 10:02:41AM

In reading your test methods, I have to say it seems as if each one is significantly flawed in some way (although you may have omitted details for brevity's sake). Warning -- this post is overly long! I apologize, but I thought that Gerk might want some constructive criticism.

>> "We did both binary bit comparison (cmp command) ..."

I can't see this test being meaningful for any comparison of the files involved -- since the files are of different formats, insignificant format differences would result in significant changes in the files. If you are comparing the output to the sound card, however, that *should* work (more about why it may not later).

>> "... the original WAV/AIFF against a compressed/decompressed one by putting the original sound in phase (0 degrees) and the comparing sound out of phase (180 degrees) ..."

I can see a LOT of ways this test could fail without making Flac (or the others) "un-lossless". Most of them have to do with sampling and with the way you took the sound 180 degrees out of phase (more on sampling later). Mostly, though, it is because we don't live in a perfect world from a signal-processing point of view, and except for very simple waveforms, I don't think you could cancel out sounds this way anyway.
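For what it's worth, here is a minimal sketch of what the phase-inversion test amounts to if it is done purely in the digital domain, on integer samples, with no other processing in between. That is an assumption; the article does not say how the inversion was performed. The function names and sample values are made up for illustration. Under that assumption the test is just a sample-by-sample subtraction, and any extra step (resampling, dither, a trip through analog hardware) could leave a residue even when the codec itself is lossless.

```python
def null_test(original, decoded):
    """Add the original to the polarity-inverted (180 degree) decoded signal.

    Both arguments are sequences of integer PCM samples. The result is the
    residual left after the two signals are summed.
    """
    if len(original) != len(decoded):
        raise ValueError("signals must have the same number of samples")
    return [a + (-b) for a, b in zip(original, decoded)]


def is_perfect_null(residual):
    """True only if every residual sample is exactly zero."""
    return all(s == 0 for s in residual)


# Hypothetical 16-bit sample values, for illustration only.
original = [0, 1200, -3400, 32767, -32768]
identical = list(original)
off_by_one = [0, 1200, -3399, 32767, -32768]

print(is_perfect_null(null_test(original, identical)))   # True
print(null_test(original, off_by_one))                   # [0, 0, -1, 0, 0]
```

A non-zero residual here only says that the two sample streams differ somewhere; it does not say which step introduced the difference.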

>> "For some simpler audio files it was lossless, for the specially crafted ones it was not. The specially crafted files contained things like white noise, generated audio tones (sawtooth, square wav, etc), spectrum sweeps and other complex audio samples."

I had the most trouble with this. There is NO digital format which can faithfully reproduce the sounds you hear. This is because all of the digital formats work by sampling the sound signal (usually at about 44000 Hz), basically recording the signal's amplitude every 1/44000 of a second. Just doing this loses information -- there is absolutely no way (especially with complex sounds) you could *NOT* lose information doing this. This means that any "lossless" format is already lossy relative to the original sound, simply because it is digital (although usually the loss can be reduced to where you can't hear it).

Then, the even bigger problem is that true white noise, sawtooth, and square wave signals all have infinite frequency components (or at least components WAY above 22000 Hz, the highest frequency you can get with 44000 Hz sampling). That means that when the audio is sampled, those higher-frequency components (which you can't hear anyway) are aliased to lower frequencies that you CAN hear. These artifacts may be absent from WAV if the WAV encoder you used pre-filters the incoming signal before sampling (a common practice) and present in other formats because they assume that no one would be encoding a sound that no one could hear anyway (which is what a CD's input to these encoders would be). So a fairer comparison would be to feed the encoders only sounds that a human can hear -- white noise et al. don't represent those sounds.
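To put rough numbers on the aliasing, here is a small sketch assuming an ideal sampler with no pre-filter and a sample rate of 44100 Hz (the CD rate, which I rounded to 44000 above). The function name and the 5 kHz square wave are just illustrative choices. A pure tone above half the sample rate folds back into the audible band like this:

```python
def alias_frequency(f_hz, fs_hz):
    """Apparent frequency of a pure tone at f_hz after sampling at fs_hz
    with no anti-aliasing filter (ideal sampler assumed)."""
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)


fs = 44100  # assumed sample rate in Hz

# A square wave contains only odd harmonics of its fundamental.
# Take a hypothetical 5 kHz square wave and look at its first few harmonics.
for k in (1, 3, 5, 7, 9):
    f = 5000 * k
    print(f, "Hz ->", alias_frequency(f, fs), "Hz")
# 5000 Hz -> 5000 Hz
# 15000 Hz -> 15000 Hz
# 25000 Hz -> 19100 Hz
# 35000 Hz -> 9100 Hz
# 45000 Hz -> 900 Hz
```

So the 9th harmonic, far above anything audible, would show up at 900 Hz unless something filters it out before sampling.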

I hope this helps you revise your tests, or at least describe them in such a way that your article is well received.



Problems with your test methods ...
Authored by: c15zyx on Oct 21, '05 10:44:24AM

Some good points here. Comparing sounds by phase inversion is kind of pointless. In general, when you add any kind of extra processing, it's hard to keep track of the variability it may introduce along the way.

All you need to do is take the PCM data straight from the input WAV and encode it. Then decode back to WAV, extract the PCM data, and compare. Since PCM data contains only raw audio samples, you can just bit-compare the results. Nice and simple. A difference between two values in this case can mean one of three things: the encoder is buggy, the decoder is buggy, or the underlying mathematical algorithms are spotty.
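A rough sketch of that comparison, using Python's standard `wave` module and assuming both files are plain uncompressed PCM WAV (AIFF would need a different reader). The function names and file names are placeholders, and the encode/decode step with whatever codec is being tested happens outside this snippet.

```python
import wave


def read_pcm(path):
    """Return (format parameters, raw PCM bytes) from a WAV file.

    Only the audio format and the sample data are read, so differences in
    header layout between the two files do not affect the result.
    """
    with wave.open(path, "rb") as w:
        fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        return fmt, w.readframes(w.getnframes())


def same_audio(path_a, path_b):
    """True if both files hold bit-identical PCM data in the same format."""
    fmt_a, pcm_a = read_pcm(path_a)
    fmt_b, pcm_b = read_pcm(path_b)
    return fmt_a == fmt_b and pcm_a == pcm_b


# Usage (hypothetical file names):
#   same_audio("original.wav", "decoded.wav")
```

Comparing only the sample data is what separates this from running cmp on the whole files, where a header difference alone would count as a mismatch.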


