![](https://static.wixstatic.com/media/dc8464_d6404fee1fb542f1930b16b22dfbd6bf.jpg/v1/fill/w_980,h_735,al_c,q_85,usm_0.66_1.00_0.01,enc_auto/dc8464_d6404fee1fb542f1930b16b22dfbd6bf.jpg)
Our SoundSommeliers speakers seem to be well received for their sound stage and stereo image, and most customers simply like them straight out of the box and don't worry much about setup. But every now and then I get asked how to set the speakers up to make them sound their best, and in particular how to get the most congruent sound stage. So I decided to write some basic setup instructions that may well be of interest to owners of other loudspeaker systems too.
Speaker setup is an incredibly complex topic, with room modes, standing waves, diffraction, comb filtering and other effects all playing a role. I don't pretend to cover everything here but rather focus on how to set speakers up for the best stereo imaging.
Things contributing to a great sound stage:
To be clear: the pursuit of an exact stereo image is, like so many things in our audiophile world, doomed to be a less than perfect quest. The stereo image, or sound stage, is an illusion created by our auditory system, which lets us interpret our acoustic environment for useful information about where a sound comes from. This system relies on several different cues, as explained in this previous article.
Factor One: The recording
Some recording studios focus particularly on good stereo imaging, like our friends over at Chesky Records and HDTracks. They make some of the best-imaging recordings I know, which include all the cues our brains require to make sense of our acoustic surroundings. I am sure that at times they also employ some trickery, like using the Haas Effect or Precedence Effect. Here the recording technician deliberately delays one channel by several milliseconds to emulate the difference in onset time between the two ears. This can be exaggerated and combined with other cues to extend the sound stage dramatically. On good systems, such a sound stage can extend far beyond the actual distance between the speakers. As an example of what I consider the limit of what's possible, check out Sting's and Mary J. Blige's duet "Whenever I say your name". This song can be found at position 106 of our KVART & BØLGE Audiophile Reference Playlist, and I recommend the bouncy sound sparks at the beginning as a test tune as we get into speaker setup. When done, you should be able to hear these sound effects covering a 180 degree field while listening in an equilateral triangle with speakers that image well in the right setup.
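To make the channel delay described above more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular studio works: the function name `delay_channel`, the sample rate and the 2 ms delay are all assumptions chosen for the example.

```python
def delay_channel(left, right, delay_ms, sample_rate=44100):
    """Delay the right channel by delay_ms and pad with silence.

    Hypothetical helper for illustration. Returns (left, right) as
    lists of equal length.
    """
    delay_samples = round(delay_ms * sample_rate / 1000)
    # Silence in front of the right channel makes it arrive later.
    delayed_right = [0.0] * delay_samples + list(right)
    # Pad the left channel at the end so both channels stay the same length.
    padded_left = list(left) + [0.0] * delay_samples
    return padded_left, delayed_right


# The same click in both channels. A low sample rate of 1000 Hz is used
# here only so that 2 ms is easy to see as 2 samples.
click = [1.0] + [0.0] * 9
left, right = delay_channel(click, click, delay_ms=2.0, sample_rate=1000)
print(left.index(1.0), right.index(1.0))  # 0 2
```

Because the left channel now reaches the listener first, the click tends to be heard towards the left speaker even though both channels play it at the same level.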
Stereo image: Always an acoustical illusion. Realistic vs. Congruent.
A precise stereo image is actually an unachievable illusion. What we should really strive for instead is a congruent stereo image, or one that sounds precise and realistic. I say this for two reasons. One is that visual cues that correlate what we see with what we hear are constantly used to calibrate our auditory system to new environments. Seeing the un-amplified live solo violin in an actual orchestra makes it rather impossible not to also hear it where we see it. Our brain takes care of that for us nicely. When listening to music on speakers there are no such visual cues, and things may start to drift without much fault of the speakers. For the purposes of this article, we will therefore not focus on a precise stereo image but rather on one that is congruent, in the sense that it could well have happened the way we hear it now.
We also focus on how to establish a stable stereo image. This is a stereo image that keeps its congruency over a wide area of listening positions and does not collapse with small head movements. You can compare the shift in stereo image to looking through a magnifying glass that you hold in front of you as you move your head back and forth. With the movements you will still be able to recognize a table, perhaps, or a chair, and make sense of their relative location and place in space and perspective. But things shift and move as you move your head behind the glass. This might serve as a good metaphor for the stability of a stereo image; the perfect, yet unachievable, stability would compare to looking through not a magnifying glass but a flat piece of glass. A very unstable stereo image is highly annoying, as owners of such systems spend more time hunting for the "sweet spot" or perfect listening position than listening. A good image with great stability, on the other hand, does not need to be perfect.
After all, it really doesn't matter that much where exactly that violin was, but rather whether it has a precise place in three-dimensional space, helping us to home in on minuscule musical details by means of the "Cocktail Party Effect" that I talk about in an earlier article.
Set the Goal: A pleasant, congruent and stable stereo image and a wide deep sound stage.
As discussed, we will not attempt to create a calibrated stereo image that precisely reproduces what was recorded from a spatial standpoint, but rather a pleasant one, as anything else is rather impossible to achieve in a normal listening environment without using headphones. The good thing about pleasant is that it is, in fact, "pleasant", and if an enjoyable experience of great music is your goal, "pleasant" is all we need. To define "pleasant" we should consider the following attributes:
a precise sense of origins of sound in a three dimensional space.
a wide stereo image
a stable stereo image that is resistant to movements of the head or location
a wide area of pleasant listening positions within the room.
a sense of depth and spaciousness
disappearing loudspeakers as a source of sound.
Initial setup
A good baseline is a setup that resembles an equilateral triangle, with the speakers at least 80 cm from any hard side wall and as far from the back wall as the room permits. Any soft, non-reflective surfaces between the walls and the speakers will help imaging.
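The triangle geometry is easy to work out with a tape measure and a little arithmetic. The sketch below assumes an equilateral triangle, as used throughout the rest of this article; the function name `listening_position` and the 2.5 m speaker separation are only illustrative assumptions.

```python
import math


def listening_position(speaker_separation_m):
    """Return (distance to each speaker, distance back from the speaker line).

    Hypothetical helper for illustration. Both values are in metres and
    assume an equilateral triangle between the two speakers and the listener.
    """
    # In an equilateral triangle all three sides are equal.
    distance_to_speaker = speaker_separation_m
    # The height of an equilateral triangle is side * sqrt(3) / 2.
    distance_from_line = speaker_separation_m * math.sqrt(3) / 2
    return distance_to_speaker, distance_from_line


to_speaker, from_line = listening_position(2.5)
print(f"{to_speaker:.2f} m to each speaker, {from_line:.2f} m back from the speaker line")
# 2.50 m to each speaker, 2.17 m back from the speaker line
```

In other words, with the speakers 2.5 m apart, the listening seat sits centred between them and a little over 2 m back from the line connecting them.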
Establishing a baseline and subjective measurement of stereo imaging.
Whenever we are going anywhere, knowing where we are coming from is vital. To measure the quality of my setup's stereo image I use the following, admittedly subjective, technique:
I choose a tune that I know to image well and widely. My reference tune is, as mentioned earlier, Sting's and Mary J. Blige's duet "Whenever I say your name". This song can be found at position 106 of our KVART & BØLGE Audiophile Reference Playlist.
Sit in an equilateral triangle with your speakers in their current position and close your eyes. Listen to the bouncing sparks of sound in the song and spread out your arms to span the area over which you perceive the sounds moving. The angle your arms span is your baseline. When done, a good stereo image using this technique should span beyond 180 degrees. Some sounds actually appear to come from behind.
Preparing the Room
Unless you listen to your music in an anechoic environment, your room will play a great role in stereo imaging. We are not only listening to any given musical instrument in the location it was recorded at with respect to the microphone. We are at the same time listening to recorded sound played twofold at unequal loudness and with differing delay coming from stereo loudspeakers. Considering how well stereophonic sound works despite this superposition of parallel realities is actually a psychoacoustic miracle and speaks for the deep preference of our brains to interpret our surroundings in a way that resembles what we have learnt is probable in the real world. The exact same source of sound split in two and originating from two differing points in space never happens in reality, so our brain prefers to blend the two together and perceive it as one. Works a charm.
A room can have a large effect on stereo imaging, one of the strongest influences being VERs, or Very Early Reflections. Our auditory system relies on the time of onset of a sound and the phase in which it reaches our ears. Incidentally, a constant sound played at constant volume is almost impossible to localize, as there is no difference in onset. The same goes for very low frequencies, where the wavelength is much larger than the distance between our ears. If a reflection arrives with enough delay after the direct sound, our brain will not use the reflected wave fronts for imaging but rather localizes the sound by focusing on its first onset alone. The delay needed is at least 5 ms for a simple sound like a click or a coin dropping and can reach up to 40 ms for complex orchestral works or grand pianos. Since we know the speed of sound to be roughly 340 m/s, reflections from walls closer than about 80 cm (0.005 s × 340 m/s × 0.5 = 0.85 m) are definitely going to deteriorate stereo imaging, while for a full rendition of the Vienna Philharmonic Orchestra playing Mahler, you may need a room 20 m across or more.
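The rule of thumb above (delay × speed of sound × 0.5) can be checked with a few lines of Python. This is a rough sketch under the same simplification as the text, namely that a reflection travels about twice the wall distance further than the direct sound; the function name `min_wall_distance_m` is just an illustrative choice.

```python
SPEED_OF_SOUND_M_S = 340.0  # rough value used in the text


def min_wall_distance_m(delay_ms, speed=SPEED_OF_SOUND_M_S):
    """Wall distance at which a reflection arrives delay_ms after the direct sound.

    Simplifying assumption: the reflected path is longer than the direct
    path by about twice the distance to the wall.
    """
    return (delay_ms / 1000.0) * speed * 0.5


for delay_ms in (5, 40):
    print(f"{delay_ms} ms -> {min_wall_distance_m(delay_ms):.2f} m")
# 5 ms -> 0.85 m
# 40 ms -> 6.80 m
```

The 5 ms case gives the roughly 80 cm minimum wall distance used in this article, while the 40 ms case shows why complex material would in theory ask for far more space than a normal living room offers.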
The source of the first very early reflections is usually not the room but the loudspeaker cabinet itself, in an effect called baffle diffraction. Speakers that are designed for imaging, like ours, take design measures to avoid large surfaces perpendicular to the membrane in both the front wall and the back wall. They also use cabinet geometry to direct diffracted sound away from the listener. But assuming that you already own a pair of loudspeakers that you would like to optimize, use the following steps.
Speaker Setup
![](https://static.wixstatic.com/media/dc8464_14df7db91b65493692c845f9073dc259.jpeg/v1/fill/w_960,h_1280,al_c,q_85,enc_auto/dc8464_14df7db91b65493692c845f9073dc259.jpeg)
Set up the speakers at least 80 cm from each side wall and as far away from the back wall as you can reasonably accomplish. It helps to place non-reflective surfaces between the wall and the speaker. The image shows my listening room and how a lamp and a rather large sofa with cushions work as acoustic elements while still making the space enjoyable to live in. The carpet is another acoustic element that helps to prevent VERs.
Place your listening position in an equilateral triangle with the loudspeakers. Directional speakers require great care with toe-in and precision of the triangle, omnidirectional speakers
Testing the Room
If you have someone to assist you, you can test your room setup in the following way. Sit in your listening position with your eyes closed and have your assistant walk between the speakers making different sounds like click sounds, claps or such. Point at the perceived origin of the sound and open your eyes. You should be able to locate the sound precisely. If there are issues, the room might not be suitable for good imaging due to high reverberation or other problems. Rooms with unusually high ceilings or unusual aspect ratios tend to struggle in this area.
Testing the stereo image
Play Sting's test track, found at position 106 of our Audiophile Reference Test List, and with your eyes closed determine the width of the field again by holding your arms out to span the angle through which the sound moves. With a well-imaging speaker you should get somewhere around 180 degrees.
Improving the image
Now move back and forth along the center line of the equilateral triangle and determine the position where the field is widest. If this is an architecturally feasible position to enjoy music, then you are done with the width of the image. On well-imaging speakers in the right setup, you should be able to perceive a field of over 170 degrees. On our SoundSommeliers in my listening environment I get about 190 degrees, with some sound coming from the back. For our speakers, the widest stereo image is also the most stable and the most congruent one, but this may well be different for different setups.
Testing for depth.
The depth of the stereo image can be tested well by listening to Norah Jones's "Meet me above Ground" (#103 on our Audiophile Reference Test List). The feathered snare drum appearing very early in the tune should be coming from the back of the room, in many cases from behind the back wall. If this is not perceived on a well-imaging speaker, moving the speakers further away from the back and side walls or placing absorbing material between walls and speakers might help. When done well, a great stereo image helps you not only to immerse yourself in sound but also to listen to minute details in the music that get lost in a more blurred or condensed image.