And yet, we are nowhere near being able to electronically reproduce a live acoustic music performance. Have you ever walked past a room with music pouring out of it and thought to yourself, "wow, those live musicians sound great," only to discover it was just a stereo playing? Nope.
As engineers, we will never solve this problem as long as the "44.1kHz is good enough" dogma is perpetuated.
Here's a question: why are frequency and bit depth the only two variables under discussion? How does the human ear locate a sound in space? Suppose I place a series of 20kHz tone generators along a wall (and suppose I can still hear 20kHz :) ), trigger them at different times, and record the session in stereo at 44.1kHz with a standard X-Y mic setup. Will I be able to reconstruct the performance?
>as engineers we will never solve this problem [reproduce a live acoustic music performance] as long as the "44.1kHz is good enough" dogma is perpetuated
It's the opposite. We are never going to solve this problem if we are going to focus on things that have nothing to do with the problem. Compare and contrast:
>as engineers we will never solve this problem as long as the "copper wires are good enough" dogma is perpetuated
Also, please read the article. The author specifically lists advances in audio tech they think are worthwhile to consider, such as surround sound. This actually addresses the problem you mentioned (reproducing the live performance) and the question you asked, i.e.
>here's a question. why are frequency and bit depth the only two variables under discussion here?
They are not, at least not in the article. Here it's because that's what's in the title, and not everyone gets to the end of the article.
Some comments do talk about the importance of having a good DAC for a good sound.
Interesting viewpoint. However, did you consider the experiment I presented? Without an answer, sample rate and cabling cannot be considered equivalent distractions on the road to high fidelity.
You’re talking about something orthogonal to the question at hand. It’s like complaining that the 4K TV sucks at VR.
Of course it does. It’s not meant to provide VR.
Same thing with sampling and bit-depth. Those address digital encoding of analog signals. They have nothing to say about speaker design, number of audio channels, room acoustics, or the myriad other factors that go into replicating a live stage performance.
It's not obvious that 2 channels of recorded audio aren't sufficient to recreate a convincing stereo image; suggesting that I'm seeking the equivalent of VR is specious.
And you haven't answered my question about the array of 20kHz tone generators. In fact, NOBODY has, and yet the question has been down-voted! How is that even possible? Posing a novel experiment which might invalidate the populist view considered harmful?
TFA's author is not active in the field of advancing man's ability to recreate live music more convincingly, AFAIK; he writes codecs. He believes people shouldn't purchase 192kHz downloads. He's certainly right that most consumers won't be able to tell the difference with their current equipment. But he makes no mention of the interaural time difference in human auditory perception, so he's already not telling the whole story. There is more to learn here, folks, and down-voting a question is an embarrassing failure of these forums. Why aren't posts in support of music piracy down-voted (read above)?
Regarding your question about the wall of tone generators:
I imagine a pair of microphones inserted into the ear canals of a dummy head should be able to capture what a real person sitting there would hear. Once the signals are captured, and assuming perfect frequency response of the microphones and noiseless transfer of the signals to an ADC, 44.1kHz would absolutely be enough to perfectly encode the 20kHz tones.
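To make the Nyquist claim concrete, here's a rough sketch (my own, not from the article): sample a 20kHz tone at 44.1kHz, then reconstruct the waveform *between* the sample points with Whittaker-Shannon sinc interpolation. The reconstruction matches the continuous tone up to the small residual left by truncating the infinite sinc sum to a finite recording.

```python
import math

fs = 44100.0
f = 20000.0  # 20 kHz test tone, just under the Nyquist limit of 22.05 kHz

# Sample a short stretch of the tone on the 44.1 kHz grid.
N = 2000
samples = [math.sin(2 * math.pi * f * n / fs) for n in range(N)]

def reconstruct(t):
    """Whittaker-Shannon sinc interpolation at an arbitrary time t (seconds)."""
    total = 0.0
    for n, x in enumerate(samples):
        arg = math.pi * (t * fs - n)
        total += x * (1.0 if abs(arg) < 1e-12 else math.sin(arg) / arg)
    return total

# Evaluate far from the edges, at a time exactly *between* two samples.
t = 1000.5 / fs
err = abs(reconstruct(t) - math.sin(2 * math.pi * f * t))
print(err)  # small residual, due only to truncating the sinc sum
```

The point: the sampled values fully determine the band-limited waveform at every instant, not just at the sample times.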
I put emphasis on the frequency response of the microphones. They’d have to match the frequency response of the human ear. Meaning they would not capture ultrasonics, just like our ears don’t.
I am less sure of the math behind bit depth and how it relates to our own dynamic range. I also agree that if you intend to transform the recording, mix it with others, etc., then go ahead and encode at a higher bit depth and even a higher sample rate (both mic and ADC). But the final product, being sold for direct listening, need not be sampled at a rate beyond our hearing, any more than a video recording should encode colors in the ultraviolet spectrum (a better analogy than my previous one).
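On the bit-depth math: each extra bit doubles the ratio of full scale to the quantization step, i.e. adds about 6dB of dynamic range. A quick back-of-the-envelope check (my own numbers, not from the article):

```python
import math

def dynamic_range_db(bits):
    # Full scale divided by one quantization step is 2**bits; express it in dB.
    return 20 * math.log10(2 ** bits)

dr16 = dynamic_range_db(16)  # CD audio
dr24 = dynamic_range_db(24)  # common in production/mastering
print(round(dr16, 1), round(dr24, 1))  # → 96.3 144.5
```

The ~96dB of 16-bit already exceeds the span between a quiet room and the threshold of pain, which is why the extra headroom of 24-bit matters mostly during production, not playback.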
>TFA's author is not active in the field of advancing man's ability to recreate live music more convincingly, AFAIK; he writes codecs
As your other questions have been addressed by others, I simply would like to point out that this seems to be quite an arrogant stance to have.
The development of codecs has a lot to do with understanding of how the humans perceive sound, and how to effectively encode and reproduce sounds - which is useful even if you personally never listen to anything but analog recordings on analog systems.
However, we do live in a digital world, and one where codecs are a necessity. Codecs made recording, sharing, and distributing digital media at all possible - and now, they are making it possible to create better recordings by any metric you choose.
Consider this: the bandwidth and space savings that codecs give you allow you to record more data with the same equipment at the highest settings. That's why I don't have to wonder whether I'll run out of memory when I want to record 4-channel surround sound on my Zoom H2N (something that definitely goes further toward a faithful reproduction of being there than, say, bumping the frequency to 192kHz, which, incidentally, is the point of the article).
Unless you are there to record every live show, we'll have to rely on other people doing that - and guess what, they'll use codecs! How do I know? Because I do, they do, and the vast majority of live-show recordings I've seen were not available in lossless formats. For that matter, good codecs contribute directly to the quality of the sound you'll hear.
Therefore, advancing the codecs does advance man's ability to recreate live music more convincingly.
So please, pause before dismissing other people's work.
>But he makes no mention of the interaural time difference in human auditory perception
He also doesn't mention how long it would take from Earth to Mars on a rocket, or the airspeed velocity of an unladen swallow. If you want to make a claim that this is somehow relevant to the question, you need to argue why, with sources - or simply ask the author, who might just answer.
>There is more to learn here, folks, and down-voting a question is an embarrassing failure of these forums. Why aren't posts in support of music piracy down-voted (read above)?
Not all questions are created equal. Your last question is an example of one that rightly deserves to be downvoted, as it contributes nothing to the discussion (of whether 192kHz really does anything for us), appeals to emotion, and derails the conversation off topic. Please don't do that.
> Therefore, advancing the codecs does advance man's ability to recreate live music more convincingly.
Only where bandwidth and storage are constrained. If we're trying to push the state of the art, it's not going to be with a Zoom H2N.
The best music reproduction systems use lossless compression. Psychoacoustic compression does NOT get us closer to the original performance. I'm stating this as someone who gets 5 out of 5 correct, every time, on the NPR test:
(I'm ignoring the Suzanne Vega vocal-only track due to both its absence of musical complexity and use as test content during the development of the MP3 algorithm.)
While I appreciate xiphmont's codec work, I am dismissive of his open attempt to steer research and commerce in this area.
Why is his article posted as "neil-young.html"? Is that really fair?
> If you want to make a claim that this is somehow relevant to the question, you need to argue why, with sources - or simply ask the author, who might just answer.
Please see chaboud's excellent post above, referencing the work of Georg von Bekesy.
> Your last question is an example of one that rightly deserves to be downvoted
You're referring to my array-of-20kHz-tone-generators experiment? Sorry, I don't know the answer; I haven't done the experiment myself. I was hoping someone here had! Where's the appeal to emotion, though? If the experiment shows a higher sample rate is necessary (that's the whole point of the experiment), it's germane.
I.e., everywhere in this universe. There is no such thing as unlimited bandwidth/storage. The gains that codecs provide allow us to record information that would otherwise be lost.
>If we're trying to push the state of the art, it's not going to be with a Zoom H2N.
I wish I could see the future so clearly!
I only have guesses, and my guess tells me that audio captured from ten Zoom H2Ns at 48kHz will store more information than audio from a single microphone at 480kHz. The current "state of the art" seems to use fewer channels. An advance in the state of the art in the direction of utilizing more sources seems more than feasible to me.
>Psychoacoustic compression does NOT get us closer to the original performance
I think you have missed my point. Lossy-compressed data is obviously not going to be better than the uncompressed source.
However, we do not live in a world of infinite resources. Given the constraints, compression offers new possibilities.
At the same space/bandwidth, you can have, e.g.:
- uncompressed audio from a single source
- compressed audio from 5x as many sources
- compressed audio from 2x sources, plus some other data which affects the perception of the sound (???)
This plays right into your question "Why are we only considering bitrate/frequency?" - we aren't. Compression offers more flexibility in making other directions viable.
This is why I believe that codec research is important for advances of the state of the art.
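As a rough illustration of that tradeoff (the lossy bitrate below is my assumption, not a measurement): 16/44.1 stereo PCM runs at about 1411kbps, and a good lossy codec is often considered transparent somewhere around 160kbps per stereo pair, so the same budget could carry several channel pairs instead of one:

```python
# 16-bit, 44.1 kHz, 2-channel PCM bitrate, in kbps.
pcm_kbps = 44100 * 16 * 2 // 1000  # 1411
lossy_kbps = 160                   # assumed per-stereo-pair lossy bitrate
pairs = pcm_kbps // lossy_kbps     # stereo pairs that fit in the same bandwidth
print(pcm_kbps, pairs)             # → 1411 8
```

Whether eight compressed pairs beat one uncompressed pair depends entirely on what you're trying to capture, but the option only exists because of codec research.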
>I am dismissive of his open attempt to steer research and commerce in this area.
In what area exactly? What research? He is not "steering research", he is educating the less knowledgeable general public. So far, your dismissive attitude can also be applied verbatim to anyone who explains why super-thick-golden-cables from MonstrousCable(tm) are a waste of money.
>> Your last question is an example of one that rightly deserves to be downvoted
>You're referring to my array-of-20kHz-tone-generators experiment?
No, I was referring to this:
>Why aren't posts in support of music piracy down-voted (read above)?
xiphmont's primary goal appears to be to stop Neil Young from selling 24/192 audio to the general public; that's why he called the page neil-young.html. Sure, few buyers have the ears or equipment to pursue anything beyond the compact disc.
The problem is that many readers of neil-young.html will come away thinking they understand human hearing and digital sampling, when in fact the article is far too sparse on details to understand either; there is no discussion of how sounds are located in 3D space, or of how phase information is recovered. It is amazing that you can completely cover one ear, rub your fingers together behind your head and precisely pinpoint where your fingers are. It is also amazing that "Sampling doesn't affect frequency response or phase" but xiphmont doesn't explain this at all.
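On the interaural-time-difference point: sub-sample timing is not destroyed by sampling. Here's a rough sketch (mine, not xiphmont's) that delays one channel of a 1kHz tone by 10 microseconds, which is less than half the 22.7µs spacing between 44.1kHz samples, and then recovers the delay from the sampled data via the tone's phase:

```python
import math

fs = 44100.0
f = 1000.0    # 1 kHz test tone
tau = 10e-6   # 10 µs delay: interaural scale, well under one sample period
N = 44100     # one second of samples (an exact 1000 cycles of the tone)

left  = [math.sin(2 * math.pi * f * (n / fs))       for n in range(N)]
right = [math.sin(2 * math.pi * f * (n / fs - tau)) for n in range(N)]

def phase_at(sig, f):
    # Phase of the f-Hz component, via correlation with quadrature references.
    c = sum(x * math.cos(2 * math.pi * f * n / fs) for n, x in enumerate(sig))
    s = sum(x * math.sin(2 * math.pi * f * n / fs) for n, x in enumerate(sig))
    return math.atan2(c, s)  # zero for sin(2*pi*f*t) itself

delay = (phase_at(left, f) - phase_at(right, f)) / (2 * math.pi * f)
print(round(delay * 1e6, 2))  # recovered delay in microseconds → 10.0
```

So an interchannel delay far smaller than the sample period survives 44.1kHz sampling intact; whatever limits localization in playback, it isn't the sample rate's time resolution.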
And then there's this lovely quote:
"It's true enough that a properly encoded Ogg file (or MP3, or AAC file) will be indistinguishable from the original at a moderate bitrate."
which is provably wrong. I can very reliably pick the uncompressed WAV each try when compared against 320kbps MP3.
My attitude is in support of furthering research in the area of live sound reproduction. As I've said, we are VERY far away right now. It is foolish to believe we understand human musical perception completely today. We cannot even replicate a simple cymbal strike with today's recording and playback technology.
I would encourage the curious to stand in the center of an outdoor arc of 100 horn players, like this (feel free to skip first 48 seconds):
Did you think about your experiment? What's your own conclusion, and what other conclusions would you expect others to make? "Just think about it" is not a very convincing argument.
The microphones would probably be the bottleneck in reproducing the sound. If your microphone setup doesn't perfectly model the ears of the listener (with respect to how the headphones are worn and their frequency response), you're not going to be able to plausibly reproduce the whole sound field using a stereo recording. That has little to do with sample rate, though.
In my own experience, the vast majority of consumer audio gear doesn't even take full advantage of 16/44.1. I've made live recordings of my classical chorus, and if I want it to be listenable on typical consumer gear, I have to apply a significant amount of dynamic range compression - otherwise, it will be either too quiet, or it won't handle the peaks when played at concert volume.
That being said, I'm using quite a bit less compression than the loudness-war-type mastering that is all too typical with pop music.
That's the funny thing about those tools -- applied properly, they can be a great asset. Applied carelessly, you end up with clipping and brick walls of audio.
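For the curious, the core math of downward compression is simple. This is a toy hard-knee version of my own devising (real compressors add attack/release envelope smoothing, soft knees, and makeup gain, all omitted here):

```python
import math

def compress(samples, threshold_db=-20.0, ratio=4.0):
    """Toy hard-knee downward compressor on per-sample level.

    Any level above the threshold is scaled down so that N dB of input
    overshoot becomes N/ratio dB of output overshoot.
    """
    out = []
    for x in samples:
        level_db = 20 * math.log10(max(abs(x), 1e-9))  # guard against log(0)
        if level_db > threshold_db:
            gain_db = (threshold_db - level_db) * (1 - 1 / ratio)
        else:
            gain_db = 0.0
        out.append(x * 10 ** (gain_db / 20))
    return out

# A full-scale peak (0 dBFS) sits 20 dB over the threshold; at 4:1 it comes
# out 5 dB over, i.e. at -15 dBFS.
print(round(compress([1.0])[0], 3))  # → 0.178
```

Applied with a gentle ratio and sensible threshold, this tames concert-volume peaks; crank the ratio and threshold carelessly and you get exactly the brick-wall sound described above.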
> have you ever walked by a bunch of musical sound coming out of a room and thought to yourself, "wow, those live musicians sound great" only to discover it was just a stereo playing?
Yes, I have. With the right combination of speaker setup and hi-fi recordings, it is possible to fool yourself into believing there are musicians there.
Note that sitting in front of a speaker setup playing a carefully selected recording is not the same as the scenario above. We do not yet have the technology to fool someone walking by a room, especially for orchestral works or even an acoustic drum kit.
But I have experienced exactly this. Not an orchestra, but a recording of a band. I guess you can split hairs about the definition of "fool", but I have been fooled as you describe.
> and yet, we are nowhere near being able to electronically reproduce a live acoustic music performance.
That has a lot more to do with the spatial component of the audio than anything else.
Unfortunately, surround sound sufficient to really reproduce acoustic fields (and not just sound effects ping-ponging around) requires more cost and concessions in the listening room than many are willing to tolerate.
So long as people continue to get the engineering wrong and think that sampling rate and bit depth have anything to do with it, we'll probably continue to see the market invest in the wrong solutions.