Analyzing timbre

So I’ve explained my rationale for analyzing timbre, and for specifically focusing on the Yamaha DX7, in another post; now it’s time to show this in action.

I base my analysis on the visual aid of the spectrogram. A spectrogram represents all the sounding frequencies on a two-dimensional graph, with frequency (which we hear as pitch) in hertz on the y-axis, time on the x-axis, and loudness shown through color. Here is a spectrogram, paired with a transcription of the line you see in the spectrogram:

[Figure: spectrogram of the flute line, with its transcription]

The header image for this website is a spectrogram, too (I used a prettier but maybe less useful color scheme for the header image). All those parallel lines are actually just part of one note. The loudest line (the thickest one, in white) is the fundamental, i.e., the pitch that we perceive as “the notes” that are being played. These are the notes that get transcribed above. All those other lines above and below it are partials, other frequencies that are actually sounding at the same time. These could be separated out as separate notes that are occurring at the same time, but instead they’re subsumed within the fundamental; we experience these other pitches not as pitched lines, but instead as part of the timbre of the fundamental.
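To make the idea of partials concrete, here is a minimal sketch in Python (standard library only). It builds a synthetic tone, not a DX7 recording: the 220 Hz fundamental, the relative loudness of each partial, the sample rate, and all the function names are assumptions chosen for illustration. It then measures how strong each frequency is in one short slice of time, which is roughly what a single vertical column of a spectrogram shows.

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed for this sketch)
N = 800              # 0.1 s of audio, so frequencies can be resolved in 10 Hz steps

# Hypothetical tone: a 220 Hz fundamental plus three quieter partials
# at 2x, 3x, and 4x the fundamental frequency.
COMPONENTS = {220.0: 1.0, 440.0: 0.5, 660.0: 0.3, 880.0: 0.2}

def synthesize(components, n=N, rate=SAMPLE_RATE):
    """Sum sine waves given as {frequency_hz: amplitude}."""
    return [
        sum(a * math.sin(2 * math.pi * f * t / rate) for f, a in components.items())
        for t in range(n)
    ]

def magnitude_at(samples, freq_hz, rate=SAMPLE_RATE):
    """Strength of one frequency in the signal (a single DFT bin)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * t / rate) for t, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * t / rate) for t, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

def spectrum_column(samples, step_hz=10, max_hz=1000):
    """One 'column' of a spectrogram: {frequency_hz: magnitude}."""
    return {f: magnitude_at(samples, f) for f in range(step_hz, max_hz + 1, step_hz)}

tone = synthesize(COMPONENTS)
column = spectrum_column(tone)

# The loudest frequency is the fundamental; the other peaks are partials.
fundamental = max(column, key=column.get)
peaks = sorted(f for f, m in column.items() if m > 0.1)
print(fundamental, peaks)
```

In this toy example the loudest component comes out at 220 Hz, with further peaks at 440, 660, and 880 Hz. A real spectrogram repeats this measurement for many overlapping slices of time and draws the results side by side.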

Step 1: Find songs to analyze. After playing with my own DX7 for many hours, I’ve learned to identify the Yamaha DX7 presets by ear. I often start looking for songs by perusing the archive for the UK’s Top 40 charts (I find their website more usable than Billboard’s, plus synthpop was more popular in the UK than in the US).

Step 2: Isolate the DX7 sounds. Once I’ve found a track with a few DX7 sounds in it, I hook my DX7 up to my computer and rerecord the synthesizer lines myself, to isolate them from the rest of the track.

This step is essential to get any clarity in the spectrograms. Here is what a spectrogram looks like for the entire composite track of “What’s Love Got to Do with It”:

Some things come out clearly, such as Tina Turner’s voice, the sustained bass line, and the hi-hat. But generally, it’s difficult to separate out what instrument is creating what visual aspect of the spectrogram. The flute line, a DX7 preset that is very salient in the middle of this clip, is almost impossible to see. But if I rerecord the clip myself, we get a much clearer image of the flute sound:

We can do a lot more with this! All the other clutter is out of the way, and the sound of the flute is clearly visually represented.

Step 3: Analyze the spectrogram. This is the most difficult part of my dissertation work, and it’s still very much under construction. I’m basing my approach on work by Robert Cogan. Cogan borrows an old approach from linguistics called “oppositional analysis.” Building from Cogan’s 13 oppositions, I have my own list of 20 or so oppositions that isolate one facet of the timbre of the sound and give it a negative or positive designation (these aren’t meant as aesthetic judgements; think of “negative/positive” like you do with a battery, not like with an Amazon review). I usually summarize these results in a table of plusses and minuses, and sometimes plus-minuses (±) or neutrals (∅). In the future I’ll write a post that focuses on my issues with Cogan’s oppositions, and brainstorm for future possibilities and alternatives. For now, let’s dive into an example analysis.
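As a rough picture of what such a table looks like, here is a minimal Python sketch. It only covers the two oppositions and four sounds discussed in the example analysis in this post. Two things are assumptions made for the sketch: that the first term of each opposition is the negative pole and the second the positive pole, and the "?" marker, which stands for a rating this post does not state (it is not one of the table symbols). The names `OPPOSITIONS`, `RATINGS`, and `format_table` are hypothetical.

```python
# Assumption for this sketch: the first term of each opposition is the
# negative pole ("-") and the second is the positive pole ("+").
OPPOSITIONS = ["narrow/wide", "non-spaced/spaced"]

# Ratings taken from the example analysis in this post. A missing entry
# means the post does not state a rating; it is printed as "?".
RATINGS = {
    "CALLIOPE":   {"narrow/wide": "-"},
    "FLUTE 1":    {"narrow/wide": "-"},
    "HARMONICA":  {"narrow/wide": "+"},
    "E. PIANO 1": {"narrow/wide": "+", "non-spaced/spaced": "+"},
}

def format_table(ratings, oppositions):
    """Return the ratings as lines of text, one opposition per row."""
    sounds = list(ratings)
    label_width = max(len(o) for o in oppositions)
    col_width = max(len(s) for s in sounds)
    header = " " * label_width + "".join(f"  {s:^{col_width}}" for s in sounds)
    rows = [header]
    for opp in oppositions:
        cells = "".join(f"  {ratings[s].get(opp, '?'):^{col_width}}" for s in sounds)
        rows.append(f"{opp:<{label_width}}{cells}")
    return rows

table = format_table(RATINGS, OPPOSITIONS)
print("\n".join(table))
```

A full analysis would have a row for each of the 20 or so oppositions, and could also use ± and ∅ as cell values.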

Tina Turner’s most well-known single, “What’s Love Got to Do with It”, could practically be a demo tape for the DX7. It uses four distinctive DX7 presets: CALLIOPE, FLUTE 1, E. PIANO 1, and HARMONICA. The track peaked in the US at #1 in September 1984 and at #3 in the UK in June 1984. I like using this track as an example because it uses so much DX7 and it was so immensely popular, so it’s a good example of how the DX7 saturated popular music of this time.

The E. PIANO 1 preset is used the most in this track, constantly supporting Turner’s vocals. It’s mixed softly and doesn’t draw attention to itself. E. PIANO 1 is what I call a core timbre—a sound that’s used as the foundation of the track, like an electric guitar or drum set would be in a typical rock song. This is as opposed to a novelty timbre, which is used more sparingly for a coloristic effect. Core vs. novelty is not a distinction based on oppositional qualities or anything to do with the timbre itself. So it’s more of an orchestrational concern (how timbres are used) than a timbral one (what are the details that comprise this sound).

The CALLIOPE (heard in the intro and in verse 2), FLUTE 1 (heard in the prechoruses, and featured earlier in this post), and HARMONICA (featured in the instrumental and in subsequent choruses) sounds are all novelty timbres. All three are mixed very loudly in the track. CALLIOPE and FLUTE 1 both play some of the song’s hooks, and are only heard when they are replacing Turner’s vocals. The HARMONICA sound is used for a lengthy solo section, and afterward improvises some descant lines while Turner is singing. Here are the spectrogram images for all four of these sounds (click to enlarge):

One interesting opposition for this set of sounds is narrow/wide, which captures a property of timbre often colloquially referred to as “brightness.” It refers to the distance between the fundamental and the highest sounding partial. CALLIOPE and FLUTE 1 are both narrow (dark) sounds, and they’re used in similar ways. They’re both novelty timbres that play short hooks that are only sounded while Turner is not singing. Dark sounds tend not to carry so well, and these two also play in approximately the same range as Turner’s singing. This means that hearing FLUTE 1 or CALLIOPE at the same time as Turner’s voice would muddy them. HARMONICA is a wide (bright) sound. This makes the HARMONICA a good candidate for the extended solo that replaces Turner’s vocals for the instrumental. It also allows the sound to better compete with Turner’s voice when it improvises in the last choruses of the song.
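The narrow/wide measurement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spectra below are made up (they are not measurements of DX7 presets), the loudness floor that decides whether a partial counts as "sounding" is hypothetical, and so is the cutoff between narrow and wide, which in practice is a judgment made by looking at the spectrogram.

```python
AUDIBLE_FLOOR = 0.05      # hypothetical: ignore anything quieter than this
WIDE_CUTOFF_HZ = 2000.0   # hypothetical: boundary between narrow and wide

def width_hz(spectrum, floor=AUDIBLE_FLOOR):
    """Distance in hertz from the fundamental to the highest sounding partial.

    `spectrum` maps frequency in hertz to magnitude. The loudest component
    is treated as the fundamental.
    """
    fundamental = max(spectrum, key=spectrum.get)
    sounding = [f for f, m in spectrum.items() if m >= floor]
    return max(sounding) - fundamental

def narrow_or_wide(spectrum, cutoff=WIDE_CUTOFF_HZ):
    """Label a spectrum using the hypothetical cutoff above."""
    return "wide" if width_hz(spectrum) >= cutoff else "narrow"

# Made-up spectra for illustration only, not measurements of DX7 presets.
dark_sound = {440: 1.0, 880: 0.3, 1320: 0.08, 1760: 0.01}
bright_sound = {440: 1.0, 880: 0.6, 1320: 0.5, 2640: 0.3, 3960: 0.2}

print(width_hz(dark_sound), narrow_or_wide(dark_sound))
print(width_hz(bright_sound), narrow_or_wide(bright_sound))
```

Measuring the distance in hertz is the simplest choice here; measuring it in octaves or in number of partials would be equally reasonable alternatives.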

E. PIANO 1 is a core sound, and it is also a wide sound, like the novelty timbre HARMONICA. E. PIANO 1 does not sound as aggressively loud as the HARMONICA, however. This is due of course to the volume of the two sounds, but also to another timbral property: non-spaced/spaced. Most sounds have partials that occur regularly at predictable frequencies. In hertz, the ratio of the fundamental to partial 1, then partial 2, then partial 3, etc. is 1:2, 1:3, 1:4, etc. A sound that follows this general rule would be a non-spaced sound: all the expected partials are present. But not all sounds do this: some sounds skip some of these partials. E. PIANO 1 is one such sound. E. PIANO 1 has the first 5 partials above its fundamental as expected, but after this there is a big gap. The next partials that actually sound are the ones that would be partials 11 and 12 in the ratio pattern explained above. Aurally, a spaced sound is darker than a non-spaced sound, but can still seem somewhat bright if the sound is also wide, as E. PIANO 1 is.
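Here is a minimal Python sketch of the non-spaced/spaced check, using the ratio pattern described above (partial n sits at n + 1 times the fundamental). The 100 Hz fundamental and the exact partial frequencies are assumptions picked to keep the arithmetic easy; only the pattern of present and missing partials follows the E. PIANO 1 description. The function names and the tolerance value are hypothetical.

```python
def partial_numbers(fundamental_hz, partial_freqs_hz, tolerance=0.05):
    """Map measured partial frequencies to partial numbers.

    Partial n is expected at (n + 1) times the fundamental, matching the
    1:2, 1:3, 1:4 ratios. Frequencies that are not within `tolerance` of a
    whole-number multiple of the fundamental are skipped.
    """
    numbers = set()
    for freq in partial_freqs_hz:
        ratio = freq / fundamental_hz
        nearest = round(ratio)
        if nearest >= 2 and abs(ratio - nearest) <= tolerance:
            numbers.add(nearest - 1)
    return sorted(numbers)

def missing_partials(numbers):
    """Partial numbers that are absent below the highest sounding partial."""
    present = set(numbers)
    if not present:
        return []
    return [n for n in range(1, max(present) + 1) if n not in present]

def is_spaced(numbers):
    """A sound is spaced if it skips any expected partial."""
    return bool(missing_partials(numbers))

# Assumed 100 Hz fundamental: partials 1-5 present, then a gap, then 11 and 12.
spaced_example = partial_numbers(100.0, [200, 300, 400, 500, 600, 1200, 1300])
# For contrast, a sound with every partial present up to partial 7.
non_spaced_example = partial_numbers(100.0, [200, 300, 400, 500, 600, 700, 800])

print(spaced_example, missing_partials(spaced_example), is_spaced(spaced_example))
print(non_spaced_example, is_spaced(non_spaced_example))
```

For the first example the missing partials are 6 through 10, which is the "big gap" in question.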

There are some problems inherent in spectrogram analysis; basically, the issue is that what is most visually apparent in the spectrogram is not necessarily what is most aurally apparent. But I think in order for a theory of timbre to catch on, for other music theorists to want to do it, a visual medium is basically required. Visual aids are really useful for the kinds of analysis that music theorists like to do: we like to ponder music deeply and slowly, at our own pace. We like to be able to point to things that we can publish in a paper. This isn’t a bad thing, but it’s a limitation to be aware of.

This analysis above is, as of now, fairly basic. It was only meant to explain how my theory is supposed to work; it doesn’t provide any great insight into “What’s Love Got to Do with It” as a track. But there are many questions that I’ll investigate in my dissertation that an analysis like this might answer.

As I stated, the novelty sounds vs. core sounds designation doesn’t inherently have anything to do with timbral qualities. Nevertheless, there are often commonalities. In songs I’ve analyzed, core sounds almost always 1) are steady (not wavering) in pitch; 2) have partials that conform to the harmonic series; 3) don’t have a synthetic undertone; and 4) are sustained (not clipped). Novelty sounds cannot be generalized in this way, but that is itself distinctive. Are there songs where the core sounds have the opposite qualities from those I listed above? What happens if this relationship is somehow transgressed? It’s my suspicion that breaking this “rule” will sonically represent some kind of Other Thing. I know that the DX7 was used in sci-fi TV soundtracks of the 1980s, such as Doctor Who and The Twilight Zone. I want to look at which sounds are used as core sounds or novelty sounds in those soundtracks, compared to the way sounds are used in pop music.

Another question, one that relates more to my last post, is how the timbral profile of the DX7 compares to other (analog) synthesizers being used in the 1980s, or in other decades. How do these oppositions define that 80s “sound” that is so distinctive and polarizing? I’ve been looking at issues of Keyboard Magazine and New Musical Express from the mid-1980s. Discussions of the DX7 abound, and one theme is recurring: digital FM synthesis (the technology used in the DX7) sounds “cold” compared to the analog synthesis used in other synthesizers. What features of the DX7 (and by extension, FM synthesis) are contributing to this consensus?

I can also expand the instruments I’m investigating. Another major component of the 80s “sound” was the prevalence of drum machines like the LinnDrum and the legendary Roland TR-808. Like the DX7, these drum machines had pre-programmed sounds that users relied on, and which I could easily reproduce. How do the “fake” drum and bass sounds of these machines compare to “real” sounds produced on acoustic instruments and non-synthesizer electric instruments? This would also work toward a precise definition of an 80s sound. For now, I’m leaving all these questions unanswered. In the future, as I make cool conclusions based on this method of analysis, I hope to share tidbits on this blog.