We never just hear music. Our experience of it is saturated in cultural expectations, personal memory and the need to move.
It’s easy to think about music as just a sequence of sounds – recorded and encoded in a Spotify stream, these days, but still: an acoustic phenomenon that we respond to because of how it sounds. The source of music’s power, according to this account, lies in the notes themselves. To pick apart how music affects us would be a matter of analysing the notes and our responses to them: in come notes, out tumbles our perception of music. How does Leonard Cohen’s Hallelujah work its magic? Simple: the fourth, the fifth, the minor fall, the major lift…
Yet thinking about music in this way – as sound, notes and responses to notes, kept separate from the rest of human experience – relegates music to a special, inscrutable sphere accessible only to the initiated. Notes, after all, are things that most people feel insecure about singing, and even less sure about reading. The vision of an isolated note-calculator in the brain, taking sound as input and producing musical perceptions as output, consigns music to a kind of mental silo.
But how could a cognitive capacity so removed from the rest of human experience have possibly evolved independently? And why would something so rarefied generate such powerful emotions and memories for so many of us?
In fact, the past few decades of work in the cognitive sciences of music have demonstrated with increasing persuasiveness that the human capacity for music is not cordoned off from the rest of the mind. On the contrary, music perception is deeply interwoven with other perceptual systems, making music less a matter of notes, the province of theorists and professional musicians, and more a matter of fundamental human experience.
Brain imaging produces a particularly clear picture of this interconnectedness. When people listen to music, no single ‘music centre’ lights up. Instead, a widely distributed network activates, including areas devoted to vision, motor control, emotion, speech, memory and planning. Far from revealing an isolated, music-specific area, the most sophisticated technology we have available to peer inside the brain suggests that listening to music calls on a broad range of faculties, testifying to how deeply its perception is interwoven with other aspects of human experience. Beyond just what we hear, what we see, what we expect, how we move, and the sum of our life experiences all contribute to how we experience music.
If you close your eyes, you might be able to picture a highly expressive musical performance: you might see, for instance, a mouth open wide, a torso swaying, and arms lifting a guitar high into the air. Once you start picturing this expressive display, it’s easy to start hearing the sounds it might produce. In fact, it might be difficult to picture these movements without also imagining the sound.
Or you could look – with the volume muted – at two performances of the same piano sonata on YouTube, one by an artist who gesticulates and makes emotional facial expressions, and the other by a tight-lipped pianist who sits rigid and unmoving at the keyboard. Despite the fact that the only information you’re receiving is visual, you’ll likely imagine very different sounds: from the first pianist, highly expressive fluctuations in dynamics and timing, and from the second, more straightforward and uninflected progressions.
Could it be that visual information actually affects the perception of musical sound, and contributes substantially to the overall experience of a performance? Numerous studies have attempted to address this question. In one approach, the psychologist Bradley Vines at McGill University in Canada and colleagues video-recorded performances intended to be highly expressive as well as ‘deadpan’ performances, in which performers were instructed to play with as little expressivity as possible. Then the researchers presented these recordings to the participants, either showing them just the video with no sound, or playing them just the audio with no video, or playing them the full audiovisual recording – or, in a particularly sneaky twist, playing them a hybrid video, in which the video from the expressive performance was paired with the audio from the deadpan performance, and vice versa.
It turns out that participants tend to describe as more expressive and emotional whichever performance is paired with the more expressive video – rather than the recording with the more expressive sound. In a separate experiment, the psychologist Chia-Jung Tsay at University College London showed that people predicted the winners of music competitions more successfully when they watched silent videos of the competitors’ performances than when they merely heard the performances, or watched the video with the sound on.
Pairing minor (sad) audio with major (happy) video leads to the minor music being rated as happier.
Music, it seems, is a highly multimodal phenomenon. The movements that produce the sound contribute essentially, not just peripherally, to our experience of it – and the visual input can sometimes outweigh the influence of the sound itself.