Below is the paper that I wrote for the REPRODUCED
SOUND 2015 Conference of the Institute of Acoustics in the UK.
The Magic in 2-Channel Sound Reproduction
Figure 1. Headphone stereo. Localization of the auditory scene inside the head (a), and outside the head when the HRTF of the ear signals tracks with head movement.
Anechoic, reflection-free spaces are essential for loudspeaker measurement and design; they are also essential for studying directional hearing effects. For instance, assume that two identical loudspeakers and a listener are set up as an equilateral triangle in an anechoic chamber. Figure 2. Identical signals are fed to each loudspeaker. If the listener's head is clamped - i.e. not allowed to move - and there are no visual cues as to the loudspeaker location, then the listener hears a phantom source inside his head between the ears (a), just as with headphones.
If the listener is allowed to turn his head, he then perceives the phantom source in front of him (b2) at approximately the distance to the invisible loudspeakers. For highly directive speakers the center phantom may actually appear in front of the speakers (b1). If the source signal is noise and the listener moves a small distance laterally, then a change in tonality is heard due to changing interference of left and right loudspeaker signals at each ear (c). For greater lateral shifts (d) the signal collapses into the nearest loudspeaker and head turning confirms the location of the physical source of sound.
Figure 2. Loudspeaker stereo under anechoic conditions. Phantom inside clamped head (a). Externalized phantom with head turning (b). Tonality changes (combing) with small lateral shifts (c). Large lateral head shifts and jumping of phantom to the nearest speaker (d).
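The tonality change in case (c) can be illustrated with a small calculation. The sketch below is a simplification and not taken from the paper: it treats the ear as a single point without a head, assumes a speed of sound of 343 m/s, and uses a hypothetical triangle side of 2.5 m. With identical signals from both loudspeakers, cancellation occurs wherever the path-length difference equals an odd number of half wavelengths.

```python
import math

C = 343.0  # speed of sound in m/s (assumed, roughly 20 degrees C)

def path_difference(side, shift):
    """Path-length difference in metres from the two loudspeakers to a
    point shifted laterally by `shift` metres from the apex of an
    equilateral triangle with side length `side` (head ignored)."""
    depth = side * math.sqrt(3) / 2              # distance from the speaker baseline
    d_left = math.hypot(shift + side / 2, depth)
    d_right = math.hypot(shift - side / 2, depth)
    return abs(d_left - d_right)

def comb_notches(side, shift, count=3):
    """First `count` cancellation frequencies in Hz for identical signals.
    Returns an empty list on the symmetry axis, where there is no comb."""
    delta = path_difference(side, shift)
    if delta == 0:
        return []
    return [(2 * k + 1) * C / (2 * delta) for k in range(count)]

if __name__ == "__main__":
    for shift in (0.05, 0.10, 0.20):             # hypothetical lateral shifts in metres
        notches = ", ".join(f"{f:.0f}" for f in comb_notches(2.5, shift))
        print(f"shift {shift:.2f} m -> notches at {notches} Hz")
```

Under these assumptions a shift of 10 cm places the first notch at about 1.7 kHz, and the notches move down in frequency as the shift grows, which is consistent with the audible change in tonality when the source signal is noise.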
In the recording/mixing process,4,5,6 monaural sources are level panned to locations between the speakers, and amplitude and phase differences between the outputs of one or more directional microphone pairs are used to render an acoustic scene between the loudspeakers. An off-center phantom source, such as example (d) for a centered listener at (a), can be created by feeding a larger amplitude signal to the left than to the right loudspeaker. Below 800 Hz the superimposed loudspeaker signals at the left and right ears are converted into timing differences7,8,9,10 between the ears (ITD), as if they were generated by a real source at location (d). Higher frequency content above 2 kHz, with larger amplitude from the left than from the right loudspeaker, mimics the head shading (IID) effect and stabilizes the off-center phantom. But transient signals can quickly lead to identification of the left or right speaker as the real source. It is the task of the mixing engineer to distribute a phantom scene between the loudspeakers. An acoustically dead room is generally preferred as a work environment because it allows him to hear more clearly while making decisions. But it is an artificial environment.
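As a rough illustration of level panning, the sketch below estimates where a phantom source appears for a given level difference between the two channels. It uses the textbook "stereophonic law of sines", which is an assumption brought in here and is not a formula from the paper. It is a low-frequency approximation for a centered, forward-facing listener, and the default 30 degree half-angle corresponds to the equilateral triangle setup.

```python
import math

def phantom_angle_deg(level_diff_db, half_angle_deg=30.0):
    """Estimated phantom-source angle in degrees, positive toward the
    louder loudspeaker, from the inter-channel level difference in dB.
    Assumed model: sin(theta) = (gL - gR) / (gL + gR) * sin(theta0)."""
    ratio = 10 ** (level_diff_db / 20)           # amplitude ratio louder / quieter
    weight = (ratio - 1) / (ratio + 1)           # equals (gL - gR) / (gL + gR)
    sine = weight * math.sin(math.radians(half_angle_deg))
    return math.degrees(math.asin(sine))

if __name__ == "__main__":
    for db in (0, 3, 6, 12, 20):                 # illustrative level differences
        print(f"{db:2d} dB -> {phantom_angle_deg(db):5.1f} degrees off center")
```

Equal levels give a centered phantom, and the estimate approaches the loudspeaker angle as the level difference becomes large. The model says nothing about the stabilizing effect above 2 kHz or about transients, both of which the text treats separately.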
2 ROOM RESPONSE
While anechoic conditions are useful for studying directional hearing, one must be careful not to translate the findings directly to situations where multiple reflections of a signal occur. Again, this is a situation that is predominant in natural hearing and for which evolution has optimized the signal processing between the ears for survival purposes. For example, it is important not to be distracted by reflections in finding the direction from which a sound is coming. Psychoacoustic research has shown that a first reflection11, which occurs shortly after the direct signal (within 25 ms), must be stronger than the direct signal before it shifts the perceived direction away from that of the first arriving signal. A second12 reflection from a different direction has to be even stronger than the first reflection to shift the direction. But later reflections (>30 ms) draw increasingly more attention unless their amplitude decreases with longer delays. This makes sense because late reflections could actually be coming from a second source.
A loudspeaker in a room produces a large number of reflections13 and perceptual issues become difficult to study in detail because of the large number of signal streams that arrive at each ear of a listener. Figure 3. Matters become even more complicated when the loudspeaker changes polar characteristics with emitted frequency; speaker L is typical for the vast majority of box loudspeakers. These speakers radiate omni-directionally at low frequencies and become increasingly forward directional at higher frequencies while maintaining a flat on-axis response. Consequently such a box loudspeaker illuminates the room quite differently from a constant directivity dipole - for example speaker R. The dipole's reflections produce different superimposed sound streams at the ears of a listener even when they arrive from the same directions as those of a box loudspeaker in the dipole's location.
Figure 3. Direct signals and some of the reflections at the listener's ears for two types of loudspeakers: Dipole R with frequency independent radiation pattern and typical box loudspeaker L, which radiates omni-directional at low frequencies and becomes increasingly forward directional with increasing frequency.
Sound from a loudspeaker near the corner of floor and two walls produces at least 7 reflections13. Figure 4. Three of these are first order reflections, three are second order and one is of third order. In reality there would also be ceiling reflections and reflections from a speaker in the left room corner. The direct and reflected signals bounce around in the room building up the sound pressure level (SPL) of the reverberant field and reaching a constant level nearly everywhere in the room if the source signal is sustained in SPL.
Figure 4. A dipole loudspeaker D near the corner of three intersecting surfaces and its images behind the perfectly reflecting surfaces. The images define the direction and the path length of the reflection. In combination with the polar diagram of the loudspeaker and the absorptive/diffusive characteristics of the surfaces the images also define the attenuation of the reflection at any point in front of the loudspeaker. First order reflections S, F, R, second order reflections S+F, R+F, R+S and third order reflection R+S+F.
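The image construction for the corner can be sketched in a few lines. The code below is an illustration under stated assumptions, not the author's calculation: the three surfaces are taken as the planes x = 0, y = 0 and z = 0, they are treated as perfectly reflecting, the speed of sound is assumed to be 343 m/s, and the source and listener coordinates are hypothetical. Mirroring the source in every combination of the three planes yields the seven images, and the reflection order is the number of mirrored coordinates.

```python
import itertools
import math

C = 343.0  # speed of sound in m/s (assumed)

def corner_images(source):
    """Return (order, position) for the 7 images of a source at (x, y, z)
    mirrored in the planes x=0, y=0 and z=0."""
    images = []
    for flips in itertools.product((1, -1), repeat=3):
        order = flips.count(-1)                  # number of surfaces involved
        if order == 0:
            continue                             # skip the real source itself
        position = tuple(f * c for f, c in zip(flips, source))
        images.append((order, position))
    return sorted(images)

def extra_delay_ms(source, listener, image):
    """Arrival delay of a reflection relative to the direct sound, in ms."""
    direct = math.dist(source, listener)
    reflected = math.dist(image, listener)
    return (reflected - direct) / C * 1000.0

if __name__ == "__main__":
    source = (1.0, 1.2, 0.9)                     # hypothetical speaker position in metres
    listener = (2.5, 3.5, 1.0)                   # hypothetical listening position in metres
    for order, image in corner_images(source):
        delay = extra_delay_ms(source, listener, image)
        print(f"order {order}: image at {image}, delay {delay:.2f} ms")
```

The sketch gives only direction and path length. As the caption notes, the attenuation of each reflection also depends on the polar diagram of the loudspeaker and on the absorptive and diffusive properties of the surfaces, neither of which is modeled here.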
Figure 5. Example of a 1.25 ms burst signal and its room reflections as they arrive at the listening position during the first 50 ms. The burst amplitudes are progressively attenuated vs. time as the signal has traveled greater distances and hit multiple surfaces.
Figure 6. As Figure 5 but on expanded time scale to show more clearly the decay of the room reflections of the initial burst signal at the listening position. The 3200 Hz narrow band burst decays 60 dB in 319 ms below the strongest initial reflection.
Figure 5 and Figure 6 are real world examples giving an indication of the complexities with which the ear-brain hearing apparatus has to deal in order to find the direction and distance of the physical source. Obviously, fewer or weaker reflections make the task easier. With stereo and two loudspeakers we are not interested in the physical sources but the phantom sources and the auditory scene created by the direct loudspeaker signals. So the question becomes how to keep the reflections from becoming a distraction and how to move the room beyond the listener's acoustic horizon of attention.
Room Resonance Modes
Domestic listening rooms are acoustically small at low frequencies, where their largest dimensions are less than half of a wavelength. Sustained sounds will set up standing waves14,15 causing uneven distribution of SPL in the room. Figure 7. The position of a loudspeaker in the room, its low frequency radiation pattern, and the room's sound absorptive characteristics determine to what degree these resonant modes are excited. Whether the source radiates omni-directionally, or like a dipole or a cardioid, the longitudinal mode in Figure 7 will be set up. Only if the whole rear wall is totally absorptive or behaves like an open window will there be no standing wave, because there is then no reflection back to the front wall. Standing waves are often problematic, particularly for a loudspeaker that radiates more energy into the room at low frequencies than at higher frequencies, like box speaker L in Figure 3. A loudspeaker which is directional even at low frequencies - like the dipole R in Figure 3 or a cardioid loudspeaker - allows the coupling to offending modes to be changed by turning it.
Figure 7. Standing wave (room resonance, longitudinal mode) example for a room of 6.88 m length. For a continuous 50 Hz tone listener (a) sits at a SPL minimum, which occurs at 1/4-wavelength from the rear wall. Listener (b) sits at a SPL maximum, but at 25 Hz he would sit in a minimum. Listener (c) against the rear wall is in the maximum SPL region for all room modes.
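The listener positions in Figure 7 can be checked with the standard expressions for axial modes between two rigid, parallel walls. The sketch below assumes a speed of sound of 344 m/s, a value chosen here because it makes the second longitudinal mode of the 6.88 m room fall at 50 Hz. It also assumes ideal, lossless walls, so the minima are perfect nulls, which a real room will not show.

```python
import math

C = 344.0        # speed of sound in m/s (assumed)
LENGTH = 6.88    # room length in metres, from Figure 7

def axial_mode_frequency(n, length=LENGTH):
    """Frequency in Hz of the n-th longitudinal mode between rigid walls."""
    return n * C / (2 * length)

def relative_pressure(n, x, length=LENGTH):
    """Pressure magnitude of mode n at distance x from a wall, scaled so
    that the value at the wall is 1."""
    return abs(math.cos(n * math.pi * x / length))

if __name__ == "__main__":
    positions = {
        "(a) quarter length from rear wall": LENGTH / 4,
        "(b) middle of the room": LENGTH / 2,
        "(c) against the rear wall": 0.0,
    }
    for n in (1, 2):
        print(f"mode {n}: {axial_mode_frequency(n):.1f} Hz")
        for name, x in positions.items():
            print(f"  {name}: {relative_pressure(n, x):.2f}")
```

For the 50 Hz mode (n = 2) the quarter-length position is a minimum and the middle of the room a maximum, while for the 25 Hz mode (n = 1) the middle of the room becomes a minimum. The position against the wall is a pressure maximum for every n.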
Furthermore if the loudspeaker maintains the same polar pattern over the whole frequency range, then energy distribution from low to high frequencies in the room is only a function of the room's absorptive characteristics. Thus, bass from dipole loudspeakers is reproduced with greater articulation16 and more evenness at different listening room locations than from box speakers. By articulation I mean that the envelope modulation of a bass signal is better preserved for different locations in the room. Any ambiance from the recording venue will be heard more clearly because the listening room is illuminated neutrally.
Reverberated Sound Field
If domestic rooms were simple rectangular boxes with known sound absorption coefficients for their boundary surfaces and without furniture in them, then it would be possible to predict14 the large number of modes that could be excited by a loudspeaker. Table 1. The number of modes 'N' increases with frequency, as does the number of reflections, since it takes two or more boundaries to set up a mode. In the example a) of a room with proportions deemed to have an acceptable spatial distribution of mode maxima and minima b), there could be up to 55 modes excited by the loudspeaker below fm = 150 Hz, depending upon its placement in the room and its radiation pattern below 'fm', c). In general, as frequency increases the average frequency separation of modes 'df' decreases, being down to 1.6 Hz at 'fm'. Each mode has a 3 dB bandwidth 'bw' and a corresponding reverberation time 'T60', which is determined by the wall absorption properties 'a' at the frequency of excitation, Table 1 d). A wall or surface absorption coefficient of 25% means that one quarter of the room's surface area 'S' acts like an open window for sound to escape. That is a large equivalent area. The absorption coefficient would have to be increased to 45% if a reverberation time of 250 ms were targeted, which for a room of this size can in practice only be achieved by adding bulk absorbers and resonant absorbers. Such short reverberation times are useful for mixing studios with conventional box type monitors. But for domestic listening and the type of controlled directivity loudspeakers that I discuss later18, they are not desirable. I have found a normally furnished room with diffusive and absorptive elements and a reverberation time around 450 ms to be optimal.
Assuming a 456 ms reverberation time, the mode bandwidth 'bw' becomes 4.8 Hz, Table 1 e). The bandwidth 'bw' is inversely proportional to 'T60' and is the same over the whole frequency range if the reverberation time is constant.
Table 1. Acoustic properties c) and e) of an unfurnished rectangular room with dimensions a) and estimated wall surface absorption d).
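The quoted numbers can be reproduced with two standard room-acoustics relations, used here as assumptions because the table itself is not reproduced in this text. The first is the half-power bandwidth of a mode, bw = 3 ln(10) / (pi T60), or about 2.2 / T60. The second is Sabine's formula, under which T60 is inversely proportional to the average absorption coefficient for a fixed room volume and surface area. The function names are illustrative.

```python
import math

def mode_bandwidth_hz(t60_s):
    """3 dB bandwidth in Hz of a room mode with reverberation time
    `t60_s` in seconds, roughly 2.2 / T60."""
    return 3 * math.log(10) / (math.pi * t60_s)

def scale_t60(t60_s, a_old, a_new):
    """Reverberation time after changing the average absorption
    coefficient from a_old to a_new, assuming Sabine's formula with
    unchanged room volume and surface area."""
    return t60_s * a_old / a_new

if __name__ == "__main__":
    print(f"bw at T60 = 456 ms: {mode_bandwidth_hz(0.456):.1f} Hz")
    print(f"T60 at a = 45%:     {scale_t60(0.456, 0.25, 0.45) * 1000:.0f} ms")
    print(f"bw at T60 = 250 ms: {mode_bandwidth_hz(0.250):.1f} Hz")
```

With T60 = 456 ms the bandwidth comes out close to the 4.8 Hz in the text, and raising the absorption coefficient from 25% to 45% brings the reverberation time to roughly the 250 ms target. Sabine's formula becomes less accurate as absorption increases, so the second figure is only an estimate.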
The modes below 'fs' and the reverberated sound field above 'fs' build up with Trise = 0.32 T60 when sustained acoustic energy is supplied to the room15. It should be noted that 'Trise' is large compared to the duration of high frequency transients. How well transients are heard at different locations in the room therefore depends primarily on the dispersion of the direct sound from the loudspeaker and its reflections.
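To put a number on this, the rise time can be worked out for the 456 ms reverberation time assumed above and compared with the 1.25 ms burst of Figure 5. Pairing these two values is an illustrative choice made here, and the symbol T_burst is introduced only for this comparison.

```latex
T_{rise} = 0.32\,T_{60} = 0.32 \times 456\ \text{ms} \approx 146\ \text{ms},
\qquad
\frac{T_{rise}}{T_{burst}} \approx \frac{146\ \text{ms}}{1.25\ \text{ms}} \approx 117
```

A transient of this length is over long before the reverberant field has built up, so what reaches the listener is essentially the direct sound and a sequence of discrete reflections.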
3 THE MAGIC IN STEREO
Typical Stereo Reproduction
Domestic listening rooms come in all shapes and sizes and rarely follow the simple model in Table 1. Except for a few of the lowest order modes and primary reflection areas, it becomes exceedingly difficult to make predictions about any room's potential acoustic behavior. Reverberation time is best measured and gives a general description for different frequency bands. How useful that is depends upon the loudspeakers that will be installed. I will even claim that how the room responds to sound is much less of a problem than how the loudspeaker illuminates the room.
How a loudspeaker will illuminate the room can in most cases be predicted by visual inspection of its shape, its physical dimensions, driver sizes and layout, which all help determine acoustic dimensions and diffraction effects. The vast majority17 of loudspeakers are constructed as rectangular boxes of various sizes, with narrow and tall front baffles, vertically aligned drivers and a vent either in front or in back. The tweeter and the design axis are positioned at seated ear height. Figure 8. There are variations to the front baffle design with rounded edges, a narrower baffle for the tweeter or staggered baffles for "time alignment" of the drivers. The driv