How We Perceive Sound: The Ear

Auditory Scene Analysis:

Auditory scene analysis is the process by which we perceive the distance, direction, loudness, pitch, and tone of many individual sounds simultaneously.

Analyzing auditory scenes is a complex human ability. Our environment surrounds us with constant sound. Even the smallest vibrations and echoes help us identify our surroundings. Sounds in a small space produce fewer echoes than sounds in a large space. The physical properties of an object can also be determined from the sounds it makes. A ball dropped onto a soft surface makes a different sound than one dropped onto a hard surface. As you walk across a floor, you can hear the sound of your footsteps change when you cross from a carpeted area onto a tiled surface.

The simplest way to determine the location of a sound's source is to compare the intensity of the sound at each ear. If we hear a greater intensity (a louder sound) in the right ear, we know that the sound is coming from somewhere to our right. Conversely, a sound that is louder in the left ear than in the right is identified as coming from our left. We can also use the overall intensity of a sound (the combined intensity of the sound reaching the left ear and the sound reaching the right) to judge how close the source is. Simply put, a softer sound is judged to be coming from farther away than a louder one. Both the comparison between the left and right ears and the evaluation of the sound's overall intensity happen automatically, without any conscious thought, allowing us to quickly and easily identify the approximate location of the origin of a sound.
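The two intensity comparisons described above can be sketched in a few lines of Python. This is only an illustration, not a model of what the brain actually computes: the function names, the 1 dB threshold, and the use of the inverse-square law (which holds for a point source in open space with no echoes) are all assumptions introduced here.

```python
import math


def interaural_level_difference_db(left_intensity, right_intensity):
    """Level difference between the ears in decibels.

    Positive values mean the sound is louder in the right ear.
    Both intensities must be positive and in the same units.
    """
    return 10.0 * math.log10(right_intensity / left_intensity)


def lateral_side(left_intensity, right_intensity, threshold_db=1.0):
    """Guess which side a sound comes from by comparing the two ears.

    threshold_db is an assumed value: differences smaller than this
    are treated as "center".
    """
    ild = interaural_level_difference_db(left_intensity, right_intensity)
    if ild > threshold_db:
        return "right"
    if ild < -threshold_db:
        return "left"
    return "center"


def relative_distance(reference_intensity, observed_intensity):
    """How many times farther away a source appears than a reference.

    Assumes the inverse-square law (intensity proportional to 1 / r**2),
    so a quarter of the intensity means twice the distance.
    """
    return math.sqrt(reference_intensity / observed_intensity)
```

For example, `lateral_side(1.0, 2.0)` returns `"right"` because the right ear receives twice the intensity, and `relative_distance(4.0, 1.0)` returns `2.0`: under the stated assumption, a sound heard at one quarter of the reference intensity is judged to be twice as far away.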

We can further pinpoint a sound's position in space by using the ear-body-brain combination to decode localization cues. Localization cues are divided into two categories. There are dynamic cues, such as vision, reverberation, early echo response, and head motion. For example, sounds that originate close to us produce relatively few echoes compared to those that originate farther away. There are also static cues: shoulder echo, pinna response, head shadow, and interaural time difference (the slight difference in the time at which a sound arrives at each ear). The pinna response refers to the fact that the pinna filters out certain frequencies of sound depending on the direction from which the sound comes. Sounds coming from behind may, for example, have frequencies near 1000 Hz filtered out by the back of the pinna. We perceive this as a subtle change in the quality of the sound, but we are used to having sounds from behind us filtered in this way. As a result, we can use this change in quality to determine whether a sound comes from in front of us, behind us, above us, or below us.
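To make the interaural time difference more concrete, here is a minimal sketch of how the arrival-time gap between the two ears relates to a sound's direction. It assumes a deliberately simple model that is not taken from the text above: the ears are two points a fixed distance apart, the source is far away, and the head itself is ignored. The ear spacing of 0.18 m and the function names are illustrative assumptions; 343 m/s is the approximate speed of sound in air at room temperature.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at about 20 °C
EAR_SPACING_M = 0.18        # assumed distance between the ears


def interaural_time_difference(azimuth_deg,
                               ear_spacing=EAR_SPACING_M,
                               speed=SPEED_OF_SOUND_M_S):
    """Arrival-time difference in seconds for a distant source.

    azimuth_deg is measured from straight ahead (0) toward the right (90).
    Positive results mean the sound reaches the right ear first.
    Simplified model: the extra path to the far ear is spacing * sin(angle).
    """
    return ear_spacing * math.sin(math.radians(azimuth_deg)) / speed


def azimuth_from_itd(itd_seconds,
                     ear_spacing=EAR_SPACING_M,
                     speed=SPEED_OF_SOUND_M_S):
    """Invert the model above to recover an angle between -90 and 90 degrees."""
    ratio = itd_seconds * speed / ear_spacing
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))
```

In this simplified model, a source at 30 degrees (front right) and one at 150 degrees (back right) produce the same time difference, so timing alone cannot separate front from back. That is one reason a direction-dependent cue such as the pinna response is useful.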

The recording of sounds has progressed from simple to more complex levels in an attempt to replicate the way humans perceive sound. Early monophonic recordings progressed to stereo. Newer technologies, such as 3D sound and other advances in the digital era, are refining the process further. These recordings, however, are still crude imitations of the process by which the human ear receives and understands sound.

