- Here is a listing of sites and pointers to references which are of some particular interest to
design, to sound reproduction in small spaces and to listening enjoyment. It
also refers to recording, music, amplifiers and other subjects of interest to
me. The links are ordered chronologically, with the most recent on top, the way I
- DNA and Its Epigenetic Potential,
an Antenna for Cosmic Emissions:
Driving Force in Evolution and Energy Transmission? - PDF
- Recommended reading
are three books well worth exploring, as you wonder about our place in
- Bruno Putzeys - of Hypex amps
and Kii-3 loudspeaker fame - wrote about MQA on Facebook:
This isn't a prelude to
suddenly becoming active on FB but I felt I had to share this.
Yesterday there was an AES session on
mastering for high resolution (whatever that is) whose highlight was a
talk about the state of the loudness war, why we're still fighting it
and what the final arrival of on-by-default loudness normalisation on
streaming services means for mastering. It also contained a
two-pronged campaign piece for MQA. During it, every classical
misconception and canard about digital audio was trotted out in an
amazingly short time. Interaural timing resolution, check. Pictures
showing staircase waveforms, check. That old chestnut about the ear
beating the Fourier uncertainty (the acoustical equivalent of saying
that human observers are able to beat Heisenberg's uncertainty
principle), right there.
At the end of the talk I got up to ask
a scathing question and spectacularly fumbled my attack*. So for those
who were wondering what I was on about, here goes. A filtering
operation is a convolution of two waveforms. One is the impulse
response of the filter (aka the "kernel"), the other is the
A word that high res proponents of any
stripe love is "blurring". The convolution point of view
shows that as the "kernel" blurs the signal, so the signal
blurs the kernel. As Stuart's spectral plots showed, an audio signal
is a much smoother waveform than the kernel so in reality guess who's
really blurring whom. And if there's no spectral energy left above the
noise floor at the frequency where the filter has ring tails, the ring
tails are below the noise floor too.
A second question, which I didn't even
get to ask, was about the impulse response of MQA's decimation and
upsampling chain as it is shown in the slide presentation. MQA's take
on those filters famously allows for aliasing, so how does one even
define "the" impulse response of that signal chain when its
actual shape depends on when exactly it happens relative to the
sampling clock (it's not time invariant). I mentioned this to my
friend Bob Katz who countered "but what if there isn't any
aliasing" (meaning what if no signal is present in the region
that folds down). Well yes, that's the saving grace. The signal
filters the kernel rather than vice versa and the shape of the
transition band doesn't matter if it is in a region where there is no
These folk are trying to have their
cake and eat it. Either aliasing doesn't matter because there is no
signal in the transition band and then the precise shape of the
transition band doesn't matter either (ie the ring tails have no
conceivable manifestation) or the absence of ring tails is critical
because there is signal in that region and then the aliasing will
result in audible components that fly in the face of MQA's
Doesn't that just sound like the
arguments DSD folks used to make? The requirement for 100kHz bandwidth
was made based on the assumption that content above 20k had an audible
impact whereas the supersonic noise was excused on the grounds that it
wasn't audible. What gives?
Meanwhile I'm happy to do speakers. You
wouldn't believe how much impact speakers have on replay fidelity.
* Oh hang on, actually I started by asking if besides speculations
about neuroscience and physics they had actual controlled listening
trials to back their story up. Bob Stuart replied that all listening
tests so far were working experiences with engineers in their studios
but that no scientific listening tests have been done so far. That
doesn't surprise any of us cynics but it is an astonishing admission
from the man himself. Mhm, I can just see the headlines. "No
Scientific Tests Were Done, Says MQA Founder".
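Two of the points in the post above can be checked numerically. Convolution is commutative, so signal and kernel blur each other. And a filter's shape cannot matter at frequencies where the signal has no energy. The following is only a minimal Python/NumPy sketch. The band-limited test signal, the cutoff bin and the two filter responses are made-up illustrations, not anything shown at the AES session or used by MQA.

```python
import numpy as np

n = 256          # block length (assumed)
cutoff = 32      # first frequency bin with no signal energy (assumed)
rng = np.random.default_rng(0)

# Band-limited test signal: random spectrum below `cutoff`, zero above.
spectrum = np.zeros(n // 2 + 1, dtype=complex)
spectrum[1:cutoff] = (rng.standard_normal(cutoff - 1)
                      + 1j * rng.standard_normal(cutoff - 1))
signal = np.fft.irfft(spectrum, n)

# Two hypothetical filters: identical below the cutoff, different above it.
brickwall = np.ones(n // 2 + 1)
brickwall[cutoff:] = 0.0                      # abrupt edge, long ring tails
gentle = np.ones(n // 2 + 1)
gentle[cutoff:] = np.linspace(1.0, 0.0, n // 2 + 1 - cutoff)  # slow roll-off


def circular_filter(x, response):
    """Filter one block by multiplying its spectrum with `response`."""
    return np.fft.irfft(np.fft.rfft(x) * response, len(x))


# 1) Commutativity: swapping signal and kernel gives the same result.
kernel = np.fft.irfft(brickwall, n)           # ringing impulse response
commutes = np.allclose(np.convolve(signal, kernel),
                       np.convolve(kernel, signal))

# 2) No energy in the transition band: the filter shape there is irrelevant.
same_output = np.allclose(circular_filter(signal, brickwall),
                          circular_filter(signal, gentle))

print("convolution commutes:", commutes)
print("outputs identical:", same_output)
```

The two filters have very different impulse responses, yet the band-limited signal comes out the same from both.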
My thoughts: Human hearing is a non-linear process of sound perception as
can be deduced, for example, from the equal loudness contours. Hearing
evolved for survival. High frequency ticks and clicks are instrumental in
determining the direction to the location of a potentially threatening
source. I wonder if we have hearing acuity for this type of signal that
goes beyond the frequency range for steady-state stimulus perception. I
doubt that hearing can be fully described in Fourier analyzer terms. If Bob
Stuart truly has discovered a new perceptual phenomenon, then he needs to
demonstrate it scientifically. Otherwise MQA is just a marketing ploy to
resell previously recorded material in a proprietary file format and they
for Phools. - SL
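The second question in the quoted post, about a decimation and upsampling chain that allows aliasing not being time invariant, can be illustrated with an extreme toy case. The Python sketch below drops every other sample and puts zeros back, with no filtering at all. This is an assumption chosen for clarity, not MQA's actual filter chain, and the function names are hypothetical. The response to an impulse then depends entirely on where the impulse falls relative to the sampling clock.

```python
import numpy as np


def decimate_then_zero_stuff(x, factor=2):
    """Keep every `factor`-th sample and replace the others with zeros.

    No anti-alias or reconstruction filter is applied, so aliasing is
    fully allowed. This is the worst case, chosen for illustration.
    """
    y = np.zeros_like(x)
    y[::factor] = x[::factor]
    return y


def impulse(length, position):
    """Unit impulse of the given length at the given sample index."""
    x = np.zeros(length)
    x[position] = 1.0
    return x


resp_even = decimate_then_zero_stuff(impulse(8, 2))  # lands on a kept sample
resp_odd = decimate_then_zero_stuff(impulse(8, 3))   # lands on a dropped one

# A time-invariant system would answer a shifted input with a shifted output.
is_time_invariant = np.array_equal(np.roll(resp_even, 1), resp_odd)

print(resp_even)
print(resp_odd)
print("time invariant:", is_time_invariant)
```

With real filters in the chain the dependence on timing is less drastic, but as long as aliasing is permitted there is no single impulse response to speak of.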
The Metrology of Quality, Quantity and Convenience
Story and Context
- A story of purpose
The two most important days of our lives ...
1 - The day we were born
2 - The day we found out why
I thought you may be interested in this brief
TED Talk if you have not already seen it.
It describes an experiment looking at what happens in the brain when a
subject is tasked with focusing on one of two visual elements (overt
attention) compared to when tasked to focus between them (covert
I would suspect this would have strong parallels
with audio stream segregation.
Hasson researches the basis of human communication, and experiments
from his lab reveal that even across different languages, our brains
show similar activity, or become "aligned," when we hear the
same idea or story. This amazing neural mechanism allows us to
transmit brain patterns, sharing memories and knowledge.
"We can communicate because we have a
common code that presents meaning," Hasson says.
What is so special about vinyl?
My dear friend, Craig Allison,
founder, singer and slide guitar player of a local 'Bourgeois Blues
Band', and now selling 2-channel sound at Lavish
Hi-Fi in Santa Rosa, wrote to me:
We had a big vinyl event here last Saturday;
absolutely amazing, almost 200 people, about 1/3 of the doctors in
Santa Rosa were here.
We put on a great show, but I remained
quizzical as to the outrageous major buzz that manifested unlike any
other event we've had. I have previously dwelt on this 'phenomenon' at
some length, and concluded that the distortion family of vinyl is
being re-embraced as an antithesis to the sound of highly compressed
Remember, you don't listen to a lot of
terrible current pop music, but the masses do.
But the reaction I picked up obviously went well past this one factor.
Had a great Facetime chat w/ a brilliant
friend of mine in Canada last night, talked about this.
And then it hit me: the significance of ritual, and what happens when
you take it away.
The public is overjoyed returning to a ritualized recorded music
As I enjoy certain rituals as well, I
understand that the same experience w/o ritual is simply not the same
experience at all. I love CD, but there is no palpable ritual
involved, and even less using a phone to bluetooth etc. ....
Yes, Craig, your thoughts resonate strongly with
We all know that music can touch and move us at a deep level. And
going through the preparations for playing a vinyl disc, then sitting
down in anticipation, is like opening a perceptual door, paying
attention, being ready to receive and to lose one's daily self.
Streaming a concert by the Berliner
Philharmoniker can have such an effect on me.
I just started reading: 'Music, The Brain And
Ecstasy - How Music Captures Our Imagination' by Robert Jourdain.
Dynamic Range: No Quiet = No
Music with a Cochlear Implant?
System Design" articles were originally published in Wireless
World, 1978, May,
Technology Trends - High Resolution
JAES, March, 2017
Art of Listening - from the artist's and producer's perspective
Acoustics for Listening - James Heddle, Acoustical Consultants,
Brisbane, Australia -
".... Acoustic design targets based on data and parameters derived from
single microphone measurements are therefore likely to have inadequacies and
may be misleading. Overall this implies our current understanding and models
of perceptual processes, including those involved with listening, have
significant room for improvement and are likely to more fully develop over
time. We should, therefore, be cautious of design based solely on room
acoustic parameters given in current international standards (Lokki,
Zones and Spheroids for Room Acoustics - James Heddle, Acoustical
Consultants, Brisbane, Australia -
Abstract - The concept of Fresnel Zones arises from considering reflection
paths off a surface differing from the direct sound propagation path by some
multiple of half a wavelength. The modelling of these zones, and of zones
derived using a set time delay, provides useful insights for the design of
spaces for listening and communication. This paper gives an overview of
analysis using this approach together with some examples and is intended as
a companion paper to 'Room Acoustics for Listening'.
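The two zone definitions in the abstract, by multiples of half a wavelength and by a set time delay, both come down to the extra path length of a reflection relative to the direct sound. The following is a minimal Python sketch of that arithmetic, not the paper's method. The function names, the geometry and the speed of sound of 343 m/s are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def path_difference(source, receiver, surface_point):
    """Extra length in metres of the path via `surface_point`
    compared with the direct source-to-receiver path."""
    direct = math.dist(source, receiver)
    reflected = (math.dist(source, surface_point)
                 + math.dist(surface_point, receiver))
    return reflected - direct


def half_wavelengths(diff_m, frequency_hz, c=SPEED_OF_SOUND):
    """Path difference expressed in half wavelengths at `frequency_hz`."""
    return diff_m / (c / frequency_hz / 2.0)


def delay_ms(diff_m, c=SPEED_OF_SOUND):
    """Arrival delay of the reflection after the direct sound, in ms."""
    return 1000.0 * diff_m / c


# Hypothetical setup: source and listener 1 m above a floor at z = 0,
# 4 m apart, reflection point on the floor midway between them.
source = (0.0, 0.0, 1.0)
receiver = (4.0, 0.0, 1.0)
floor_point = (2.0, 0.0, 0.0)

diff = path_difference(source, receiver, floor_point)
print(f"path difference: {diff:.3f} m")
print(f"half wavelengths at 1 kHz: {half_wavelengths(diff, 1000.0):.2f}")
print(f"delay: {delay_ms(diff):.2f} ms")
```

Evaluating `path_difference` over a grid of surface points and contouring the result at whole numbers of half wavelengths, or at a fixed delay, would outline the zones on that surface.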
Loudspeaker Cabinet Diffraction
by Tore Skogberg illustrates the difficulties in analyzing and
predicting diffraction effects. Added to that are the finite dimensions and
modal breakup effects of real sources. But it is important to understand the
general trends in order to design sensible baffle shapes and to optimize
them by acoustic free-space measurements on-axis and around the
cabinet. See also My
Conversations with Fitz.
A Meta-Analysis of High Resolution Audio - Perceptual
ECMA-407 and Telecommunications
in the 21st Century
to Ecma TC32-TG22's Convenor and Swissaudec's CEO Clemens Par about
the 21st century's broadcasting and communication means.
The pdf of
the interview gives an introduction to the ingenious concept and
methodology behind the Ecma-407 standard for down-mixing f-channels of
audio into g-channels and then transmitting them using currently
deployed codecs such as AAC or HE-AAC. On the receiving end of the
bit-stream transmission the g-channels can be up-mixed again to
f-channels or a smaller number of h-channels depending upon the
The process has higher proven performance than
UHD 3D-audio codecs currently under development. See the White Paper:
"Instant HD to UHD Audio".
Clemens Par gives credit to Rudolf E. Kalman and
Guenther Theile in the InterComms publication of: Rationalism
versus Empirism - A Crash Course in Invariant Theory.
Swissaudec exhibited in 'NAB Labs Future Park'
at the 2016 NAB Show
in Las Vegas.
I am a personal friend of Clemens Par, having first met him at TMT26
in 2010, where I was mightily impressed by his process for up-mixing a
mono audio signal to stereo as if recorded with an MS coincident
microphone pair. Since then much more powerful applications have
evolved out of Invariant Theory and inverse coding. But it pains me to
see how established audio standard setters resist accepting and
incorporating the new paradigm.
See also Rationalism
and swissaudec GmbH below.
Reverberation ...and how
to remove it
The Feature Article by
Francis Rumsey in the April 2016 issue of the Journal
of the Audio Engineering Society ends with a section about THE
BENEFITS OF BINAURAL LISTENING IN REVERBERANT CONDITIONS (copy on
Auditory stream segregation is at play when the
listening room and the loudspeakers in a
proper stereo setup recede from perception and are moved beyond
the acoustic horizon of the listener.
Floyd Toole: Room
reflections and Human Adaptation for Small Room Acoustics
Floyd's article in www.audioholics.com
discusses hearing in reflective, resonant and reverberant spaces. I find it
refreshing to read:
"Humans evolved while listening in reflective spaces,
and are comfortable listening in them. In fact, it is now widely recognized
that we perceptually
"stream" the sound of the room as separate from the sound of the
sources - that is what happens in live performances. A Steinway is a
Steinway; only the hall changes. Performance halls generally don’t have
room mode problems because they are so large. The parallel situation in
sound reproduction is that a good loudspeaker is a good loudspeaker, and its
virtues are appreciated in a wide variety of rooms – except for the
differences in the bass region."
(The differences in the bass region largely
disappear when dipolar or cardioid woofers are used. - SL)
"As an illustration of how much loudspeaker technology has improved
over the years, these data on the JBL Pro M2 indicate that whatever one’s
opinions of loudspeaker/room interactions were in the era of the UREI, they
cannot be the same in the era of the M2, and any similarly “neutral”
loudspeaker. Because it is desirable
that the direct and reflected sounds resemble each other, the newer
loudspeaker has an enormous advantage. Traditions need to be put into
context, and some of them relegated to history."
("Neutral" ultimately means Constant Directivity over 4π
space. The M2 represents a step in the right direction, but is still
omni-directional at low frequencies and forward directional, though with
wide dispersion, at high frequencies. - SL)
I just wish more speaker designers would take seriously the
implications of the highlighted statements above.
See also a more recent article by Floyd "What
do listeners prefer for small room acoustics?"
Floyd Toole: The
Measurement and Calibration of Sound Reproducing Systems
In his Paper
you find out about traditions and the disappointing state of affairs in
professional audio. A discussion of this Open Access AES Paper is at
Seth Horowitz: The Universal
Sense - How Hearing Shapes The Mind
Here is an easy-to-read book,
written by a neuroscientist for the general public, which describes the
response of the ear/brain perceptual apparatus to sound, to what draws our
attention, affects our emotions, our memory and possibly our actions. I
highly recommend this book to anyone involved with sound, whether in
production, rendering or listening. My loudspeaker designs for creating
convincing auditory illusions in ordinary rooms are intentionally based on
evolutionary hearing processes as described by Horowitz.
Optimizing the directivity
index of a 2-way loudspeaker
Diego Ivars Morón's Master's
Thesis at the Norwegian University of Science and Technology deals with
different approaches to obtain very wide dispersion from the tweeter section
of an otherwise omni-directional loudspeaker.
Acoustic power radiation from loudspeaker cabinets
Conventional box loudspeakers very often suffer from spurious sound
radiation, which is caused by the mechanical vibration energy of the drivers
being transmitted into the cabinet and exciting the cabinet walls to vibrate
at certain panel resonance modes. Furthermore, the high sound pressure
levels inside the cabinet can excite panel modes. Since the cabinet's
radiating surface areas are usually much larger than the driver cone area
even relatively small panel excursions can lead to significant
spurious acoustic output. Depending upon the cabinet construction the output
might even be larger at certain frequencies than the desired output from the
driver. In addition, airborne acoustic energy inside the cabinet, which is
very difficult to absorb and turn into heat, will escape via the thin cone
material of the driver and can color the sound. These problems, taken
together and combined with a sub-optimum radiation pattern, generate the
generic box loudspeaker sound. Conversely, an acoustically small,
open-baffle (dipole) loudspeaker with its minimal baffle area and no box
enclosure is largely free of spurious emission problems.
The Open Access AES paper "Predicting
the Acoustic Power Radiation from Loudspeaker Cabinets: a Numerically
Efficient Approach" deals with acoustic radiation due to
structure-borne vibration energy in a highly braced cabinet. It is an interesting read
and shows how much attention must be given to cabinet details to minimize
website was updated by Helmut Wittek (Schoeps)
and relaunched in 2015 with an emphasis now on 3D audio. Unlike WFS or
Ambisonic approaches, which aim for exact physical reconstruction of the
recorded soundfield, the microphone arrangements described here
exploit the psychoacoustics of hearing and Gestalt recognition as originally
described in the groundbreaking work by Guenther Theile. While the emphasis
is on microphones and their characteristics
on the sound recording side, little is said about loudspeakers and rooms,
which after all are the other half of the story on the sound rendering side.
As far as I know conventional box loudspeakers have been used to assess the
recording techniques. This can only be justified by their popularity. But if
optimum 3D rendering is the goal, then loudspeakers, their
characteristics and setup, and the room also have to be revisited!
Audio Natural Recording -
Techniques for 2.0 and 5.1 Ambience Recordings -
ORTF-3D Microphone technique for 3D ambience recording -
Rationalism versus Empirism
This publication by Clemens Par
in Issue 25 of InterComms
is a tribute to Professor Rudolf Kalman and was inspired by conversations
between them. It is also a tribute to Guenther Theile's groundbreaking work
in auditory spatial perception. Their mathematical contributions to systems
and invariant theory combined with new understanding of cerebral spatial
hearing processes have led to new forms of audio coding as now standardized
in ECMA-407 for UHD 3D audio.
I am pleased to know that Clemens considered the PLUTO's design and phantom
imaging, which he heard at TMT26 in 2010, as proof of his then-new,
invariant method of upmixing from mono to stereo. See also below:
Ecma publishes the world's first 3Daudio standard:
Music Theory & MIDI Encoding
In this YouTube
video Charlie Gillingham (2015 NuPIC Hackathon) talks about the
challenges in machine decoding MIDI files, whereas the human brain's Hierarchical Temporal Memory