Music & Noise

Discussion

Music has charms to sooth a savage breast, to soften rocks, or bend a knotted oak. I've read, that things inanimate have mov'd, and, as with living souls, have been inform'd, by magic numbers and persuasive sound.

William Congreve, 1697

synthesis and analysis

The distinction between music and noise is mathematical form. Music is ordered sound. Noise is disordered sound.

Music and noise are both mixtures of sound waves of different frequencies. The component frequencies of music are discrete (separable) and rational (their ratios form simple fractions) with a discernible dominant frequency. The component frequencies of noise are continuous (every frequency will be present over some range) and random (described by a probability distribution) with no discernible dominant frequency.

[slide]

Sound is a longitudinal wave, which means the particles of the medium vibrate parallel to the direction of propagation of the wave. A sound wave coming out of a musical instrument, loudspeaker, or someone's mouth pushes the air forward and backward as the sound propagates outward. This has the effect of squeezing and pulling on the air, changing its pressure very slightly. These pressure variations can be detected by the eardrum (a light, flexible membrane) in the middle ear, translated into neural impulses in the inner ear, and sent on to the brain for processing. They can also be detected by the diaphragm of a microphone (a light, flexible membrane), translated into an electrical signal by any one of several electromechanical means, and sent on to a computer for processing. The processing done in the brain is very sophisticated, but the processing done by a computer is relatively simple. The pressure variations of a sound wave are changed into voltage variations in the microphone, which are sampled periodically and very rapidly by a computer and then saved as numbers.

A graph of microphone voltage vs. time (called a waveform) is a convenient way to use a computer to see sound. Before the rise of ubiquitous digital computers, waveforms were often analyzed electronically using an oscilloscope: a cathode ray tube with an electron beam that traced voltage as a function of time on a fluorescent glass screen. Oscilloscopes are basically simplified televisions with one purpose (to draw a time series or parametric graph) and one color (usually bright green). This task is easily mimicked by twenty-first century desktop, laptop, and tablet computers as well as smart phones. Oscilloscope applications on these devices often pay homage to their analog ancestors by using a green color scheme.

The simplest sound to analyze mathematically is the pure tone — one where the pressure variation is described by a single frequency. A pure tone would look like a sine curve when graphed oscilloscope style.

y = A sin 2πƒt

Where …

y =  the instantaneous value of the microphone voltage (V), which is directly proportional to the variation in air pressure (P) due to sound waves impacting the microphone
A =  the amplitude, or maximum value of the waveform
2π =  a constant needed to get the units to work out
ƒ =  the frequency of the pure tone
t =  time, of course
Oscilloscope traces of pure tones
40 Hz pure tone 100 Hz pure tone 315 Hz pure tone
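The sampling step described earlier can be sketched in a few lines. This is only an illustration: the function name `sample_pure_tone`, the 8000 Hz sample rate, and the 0.01 s duration are assumptions chosen for the example, not values taken from the traces above.

```python
import math

def sample_pure_tone(amplitude, frequency, sample_rate, duration):
    """Return samples of y = A sin(2*pi*f*t) taken at a fixed sample rate."""
    count = int(round(sample_rate * duration))
    return [amplitude * math.sin(2 * math.pi * frequency * n / sample_rate)
            for n in range(count)]

# One full cycle of a 100 Hz tone, sampled 8000 times per second (assumed values).
samples = sample_pure_tone(amplitude=1.0, frequency=100.0,
                           sample_rate=8000, duration=0.01)
print(len(samples), max(samples))
```

Plotting `samples` against sample number would give the same sine curve an oscilloscope draws.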

Music in its simplest form is monotonic; that is, composed only of pure tones. Monotonic music is dull and lifeless like a 1990s ringtone (worse than that even); like a 1970s digital watch alarm (now we're talking); like an oscillating circuit attached to a speaker built by a college student in an introductory physics class (so primitive). Real music, however, is polytonic — a mixture of pure tones played together in a manner that sounds harmonious. A sound composed of multiple frequencies like that produced by a musical instrument or the human voice would still be periodic, but would be more complex than just a simple sine curve.

Oscilloscope traces for various instruments
bass voice singing tenor voice singing soprano voice singing
pipe organ trumpet violin
conga drum high hat cymbal woodblock

The human voice and musical instruments produce sounds by vibration. What vibrates determines the type of instrument.

category        vibrating part      examples
idiophone       whole instrument    bell, cymbal, musical saw, wood block, xylophone
membranophone   stretched membrane  drums, kazoo, human voice
chordophone     stretched string    strings (violin, guitar, harp), piano
aerophone       air column          woodwinds (saxophone, flute), brass (trumpet, tuba), organ
electrophone    electric circuit    synthesizer, theremin
Classification of musical instruments

Like many other mechanical systems, musical instruments vibrate naturally at several related frequencies called harmonics. The lowest frequency of vibration, which is also usually the loudest, is called the fundamental. The higher frequency harmonics are called overtones. The human auditory system perceives the fundamental frequency of a musical note as the characteristic pitch of that note. The amplitudes of the overtones relative to the fundamental give the note its quality or timbre — pronounced in English as tæmbər or in quasi-French by English speakers as tɛ̃br with a nasal ɛ̃ for the medial e and silence for the final e. Timbre is one of the features of sound that enables us to distinguish a flute from a violin and a tuba from a timpani.

Recall that when waves meet they don't collide like material objects, they pass through each other like spectres — they interfere. Interfering waves combine by the principle of linear superposition — basically, just add the values of one function to the values of another function everywhere in mathematical space. With the right combination of sine and/or cosine functions, you can make functions with all kinds of shapes (as long as they're functions). Several examples were illustrated earlier in this book. Here they are again …

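square wave

$$ y = \sum_{n=1}^{\infty} \frac{1}{2n-1} \sin\big((2n-1)x\big) $$

sawtooth wave

$$ y = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin(nx) $$

triangle wave

$$ y = \sum_{n=1}^{\infty} \frac{1}{(2n-1)^2} \cos\big((2n-1)x\big) $$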

The act of combining pure tones together to produce a complex waveform is called additive synthesis. Non-electronic instruments do this naturally. Electronic instruments (if they want to be taken seriously) are designed with additive synthesis in mind. The analog synthesizer is a good example of this.
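Additive synthesis can be tried numerically by adding up the first few terms of the square wave series. The sketch below is a minimal illustration; the function name `square_partial_sum` and the number of terms are assumptions. For 0 < x < π, the full series converges to π/4, so the partial sums should settle near that value.

```python
import math

def square_partial_sum(x, terms):
    """Add the first `terms` odd harmonics: sin((2n-1)x) / (2n-1)."""
    return sum(math.sin((2 * n - 1) * x) / (2 * n - 1)
               for n in range(1, terms + 1))

# More harmonics bring the sum closer to the flat top of the square wave.
for terms in (1, 5, 50, 2000):
    print(terms, square_partial_sum(math.pi / 2, terms))
print("pi/4 =", math.pi / 4)
```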

Generate a sinusoidal electric signal with an oscillator, a fairly simple electric circuit composed of a capacitor (C) and an inductor (L), also known as an LC circuit. Using a second oscillator, generate a second signal at a different frequency. The second harmonic would be a wise choice. Add a third, fourth, fifth, sixth, and seventh oscillator for the corresponding higher harmonics. Synchronize them electronically so that when the pitch of the fundamental oscillator changes, the frequencies of the overtone oscillators follow along. Adjust the relative amplitude of each oscillator to produce a sound with the desired timbre. Attach the circuit to an amplifier and a loudspeaker so it can be heard. Attach the fundamental oscillator to a control device that turns it on and off and changes its frequency as needed. Make your control device look like a piano keyboard so musicians will use your device as a musical instrument. Call it a synthesizer since it uses the principles of additive synthesis. Let the Seventies begin.

It is mathematically possible to transform any waveform from a continuous sequence of values on a time-series graph into an infinite series of discrete sine and/or cosine functions. The process is called spectral analysis or Fourier analysis after its inventor, the French mathematician and physicist, Joseph Fourier (1768–1830). Taking a sound apart through spectral analysis is the inverse process of putting a sound together through additive synthesis.

Start with an arbitrary periodic function and assume that it was made from the linear superposition of an infinite number of sine and cosine functions in a harmonic series. What coefficients should each harmonic be multiplied by to give the desired function? The process for answering this question is called a Fourier transform and it is best left to a mathematician to explain.

Any periodic function ƒ(t) with a period of T can be written like this …

$$ f(t) = a_0 + \sum_{n=1}^{\infty} a_n \cos\frac{2\pi n t}{T} + \sum_{n=1}^{\infty} b_n \sin\frac{2\pi n t}{T} $$

Where …

$$ a_0 = \frac{1}{T} \int_0^T f(t)\, dt $$

$$ a_n = \frac{2}{T} \int_0^T f(t) \cos\frac{2\pi n t}{T}\, dt $$

$$ b_n = \frac{2}{T} \int_0^T f(t) \sin\frac{2\pi n t}{T}\, dt $$

… and n = 1, 2, 3, … and that's all I want to say about that.
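For readers who would rather compute than integrate by hand, the coefficients can be approximated numerically. This sketch assumes the harmonics are cos(2πnt/T) and sin(2πnt/T) and uses a simple midpoint sum; the names `fourier_coefficients` and `square` are made up for the example. For a square wave that is +1 on the first half of its period and −1 on the second, the known result is b_n = 4/(πn) for odd n and zero for everything else.

```python
import math

def fourier_coefficients(f, period, n, steps=10000):
    """Approximate a_n and b_n over one period with a midpoint Riemann sum."""
    dt = period / steps
    a = b = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        angle = 2 * math.pi * n * t / period
        a += f(t) * math.cos(angle) * dt
        b += f(t) * math.sin(angle) * dt
    return 2 * a / period, 2 * b / period

def square(t):
    """Square wave with period 1: +1 on the first half, -1 on the second."""
    return 1.0 if (t % 1.0) < 0.5 else -1.0

for n in (1, 2, 3):
    a_n, b_n = fourier_coefficients(square, 1.0, n)
    print(n, round(a_n, 4), round(b_n, 4), "expected b_n:",
          round(4 / (math.pi * n), 4) if n % 2 else 0.0)
```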

Fortunately for us, relatively simple algorithms for solving this problem were discovered soon after Fourier invented the technique, and computer scientists have been coding and sharing fast Fourier transforms since the 1960s. Audio applications with FFT are easily found online for many computing platforms, including smart phones, which means that sound spectral analysis is now literally within everyone's grasp.
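To give a feel for what such an application does internally, here is a minimal sketch of a recursive radix-2 FFT applied to a synthetic signal. It is not the code any particular app uses. The sample rate, the number of samples, and the three partials are all assumptions, chosen so that every partial lands exactly on a frequency bin.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of 2."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

SAMPLE_RATE = 10240  # Hz (assumed, so that each bin is exactly 10 Hz wide)
N = 1024             # number of samples, 0.1 s of sound

# Hypothetical test signal: a fundamental plus two weaker overtones.
partials = {270: 1.0, 540: 0.5, 810: 0.25}
signal = [sum(amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
              for freq, amp in partials.items())
          for i in range(N)]

spectrum = fft(signal)
magnitudes = [abs(c) for c in spectrum[:N // 2]]  # keep the positive frequencies
bin_width = SAMPLE_RATE / N
peaks = sorted(range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True)[:3]
peak_frequencies = sorted(k * bin_width for k in peaks)
print(peak_frequencies)
```

The three strongest bins recover the frequencies that went into the signal, which is spectral analysis undoing additive synthesis.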

The graph above shows the spectral analysis of a female voice. Sounds of all sorts of different frequencies are being produced — some more intensely than others. It appears that this particular voice was strongest at 270 Hz (C♯4 for those of you who know music) and multiples of 270 Hz (540 Hz, 810 Hz, 1080 Hz, 1350 Hz, 1620 Hz, and 1890 Hz). This graph has been normalized, which exaggerated the peaks at the higher frequencies so they could be seen. I like this graph because the spacing between peaks is just so perfect. This kind of structure is what makes a sound musical. Let's look at some more examples.

[slide]

A flute is essentially a tube that is open at both ends. Air is blown across one end and sound comes out the other. A spectral analysis confirms this. The harmonics are all whole number multiples of the fundamental frequency (436 Hz, a slightly flat A4 — a bit lower in frequency than is normally acceptable). Note how the second harmonic is nearly as intense as the fundamental. This strong second harmonic is part of what makes a flute sound like a flute.

[slide]

A recorder is also a tube with two open ends. It produces a sound similar to a flute, but not exactly the same. Again the harmonics are whole number multiples of the fundamental frequency (923 Hz, a very sharp A5, much higher in frequency than is normally acceptable), but for some reason the second harmonic is nearly nonexistent. This nearly missing second harmonic is part of what makes a recorder sound like a recorder and not sound like a flute.

[slide]

A tuning fork is forked; that is to say, it splits from its handle into two branches called tines. Each tine is fixed to the handle at one end, but is free to vibrate at the other. As a result, one would expect to find only those harmonics that were odd multiples of the fundamental in the spectrum of a tuning fork. This is what the spectral analysis shows. The even harmonics are present, but they are extremely weak and are probably due to the sympathetic vibrations of something nearby. This spectrum was produced by striking a large, demonstration-size tuning fork (not the one pictured above) with an excessively heavy blow. Tuning forks should always be tapped lightly and on a resilient surface. Doing so reduces the intensity of the "ping" overtones, which is a desirable thing. An ideal tuning fork would vibrate at just one frequency. The tuning fork used in this experiment was rated at 256 Hz, but the spectral analysis software picked up 259 Hz for the fundamental. Is the tuning fork out of tune or is the software in error?

noise

Music is sound with a discrete structure. Noise is sound with a continuous structure. Music is composed of sounds with a fundamental frequency and overtones. Noise is composed of sounds with frequencies that range continuously in value from as low as you can hear to as high as you can hear (not necessarily at equal intensity however). Music is described mathematically by an infinite sum of sines and cosines multiplied by appropriately valued coefficients (infinite mathematically, but in practice only a handful of overtones really matter). Noise is described by a spectral power distribution, much like the statistical distributions of kinetic molecular theory. Music is ordered. Noise is random.

Noise is what you hear when you tune an analog radio or television to an empty frequency. It's the overall sound of rain falling on leaves, soda bubbling in a glass, air escaping from a tire, or a crowd applauding.

"Noisy" describes some voiceless consonants used in English better than "musical".

sound   IPA*   example   name
/s/     [s]    sin       voiceless coronal sibilant
/sh/    [ʃ]    shin      voiceless palato-alveolar sibilant
/f/     [f]    fin       voiceless labiodental fricative
/th/    [θ]    thin      voiceless dental non-sibilant fricative
/h/     [h]    hello     voiceless glottal transition
Consonants in English with noisy structure
* International Phonetic Alphabet

Noise might not be periodic, but it can still be analyzed with a fast Fourier transform, as long as we only examine a finite amount of it. Since infinite quantities don't exist in the real world, this isn't a problem. Mathematics allows for more possibilities than physicists will ever encounter.

Spectral analysis of …
white noise pink noise concert applause
Frequency on the horizontal axis. Intensity on the vertical axis.

Noise might not be ordered, but that doesn't mean it can't be described. Frequency is to sound as color is to light. We hear different frequencies of sound as different pitches (A, B, C, D, E, F, G; for example). We see different frequencies of light as different colors (red, orange, yellow, green, blue, and violet; for example). The analogy is not perfect. (What analogy is?) The notes of the musical scale repeat themselves for every doubling of frequency (a topic I'll come back to). The frequencies of visible light span so narrow a range, they never get a chance to double. The frequencies of the notes of a musical scale are related by simple numerical fractions. The frequency bands associated with specific color names show no mathematical relationships. This analogy isn't looking very promising.

In much the same way that most sounds we hear are polytonic (composed of many frequencies), most light we see is polychromatic (composed of many frequencies). Sources of light that are hot enough (like the sun) will emit a mixture of frequencies that span the entire visible spectrum. The color of this light is white — a mixture of red, orange, yellow, green, blue, violet, and everything in between. White noise is the audio equivalent of white light. It's a combination of all the frequencies that span the entire audible spectrum. White noise is a mixture of all the musical pitches with names and all the pitches in between that aren't named. This is the informal definition of white noise.

Formally, white noise is a sound with a flat frequency spectrum; that is, it transmits power equally at all frequencies. It is a mathematical ideal with a representation something like this …

p(ƒ) = constant

So that …

$$ P = \int_{f_{min}}^{f_{max}} p(f)\, df $$

Where …

p(ƒ) =  the value of the power spectral density measured in W/Hz (used for theoretical discussions) or the relative power spectral density in V²/Hz (what normally gets used in practice)
ƒ =  any frequency of sound in the range of human hearing
P =  the total power output of the source (theory) or some quantity proportional to that (practice)

Note …

The human auditory system is logarithmic when it comes to perceiving sounds. This is true when it comes to amplitude (which is related to intensity or loudness and was discussed earlier in this book) and frequency (which will be discussed now). Listen to a sequence of two pure tones with a difference of 100 Hz between them, say 100 Hz and 200 Hz. Then listen to another sequence with the same linear difference but starting on a much higher frequency, say 800 Hz and 900 Hz. The perceived increase in frequency of the second pair seems smaller than the first pair even though the absolute size of the increase is the same.

100 Hz / 200 Hz 800 Hz / 900 Hz

A doubling from 100 Hz to 200 Hz sounds similar to a doubling from 800 Hz to 1600 Hz.

100 Hz / 200 Hz 800 Hz / 1600 Hz

Likewise a one eighth increase from 100 Hz to 112.5 Hz sounds the same as a one eighth increase from 800 Hz to 900 Hz.

100 Hz / 112.5 Hz 800 Hz / 900 Hz

We tend to group sounds perceptually into bands that increase by successive powers of 2. A mathematician would call this a logarithmic increase. Musicians call the bands octaves. Audio technicians developed pink noise as a response to this reality.
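The listening examples above can be checked with a base 2 logarithm, which measures the size of an interval in octaves. This is a small sketch; the function name `octaves_between` is an assumption for illustration.

```python
import math

def octaves_between(f1, f2):
    """Size of the interval from f1 to f2, measured in octaves (log base 2)."""
    return math.log2(f2 / f1)

pairs = [(100, 200), (800, 900), (800, 1600), (100, 112.5)]
for low, high in pairs:
    print(low, high, round(octaves_between(low, high), 4))
```

Equal ratios give equal numbers of octaves, while equal differences in hertz do not.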

Formally, pink noise is a sound with a power spectral density proportional to the inverse of frequency, or 1/ƒ. Because of this it is also known as 1/ƒ noise (one-over-f noise).

$$ p(f) = \frac{\text{constant}}{f} $$

Pink noise, like white noise, as a model must be limited to some range of frequencies. If not, our source of sound would be radiating infinite power.

           
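$$ P = \int_0^{\infty} \frac{k}{f}\, df = k \Big[ \log f \Big]_0^{\infty} $$

$$ P = k\, (\log \infty - \log 0) $$

$$ P = k\, \big(\infty - (-\infty)\big) = \infty $$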
 

A pink noise signal transmits energy equally over all octaves (logarithmic intervals). Let's test this essential characteristic. Compute the power radiated in a frequency band spanning the octave ƒ = (a, 2a) …

$$ P = \int_a^{2a} \frac{k}{f}\, df = k \Big[ \log f \Big]_a^{2a} $$

$$ P = k\, (\log 2a - \log a) $$

$$ P = k \log \frac{2a}{a} $$

$$ P = k \log 2 $$
 

Compare that to the power radiated in the next octave higher ƒ = (2a, 4a) and you will see it is the same …

$$ P = \int_{2a}^{4a} \frac{k}{f}\, df = k \Big[ \log f \Big]_{2a}^{4a} $$

$$ P = k\, (\log 4a - \log 2a) $$

$$ P = k \log \frac{4a}{2a} $$

$$ P = k \log 2 $$
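The same comparison can be made numerically for several octaves at once. The sketch below integrates a 1/ƒ density and a flat density with a midpoint sum; the constant `K`, the function name `band_power`, and the choice of octaves are assumptions for illustration. The logarithm here is the natural logarithm.

```python
import math

def band_power(density, f_low, f_high, steps=20000):
    """Integrate a power spectral density over [f_low, f_high] (midpoint rule)."""
    df = (f_high - f_low) / steps
    return sum(density(f_low + (i + 0.5) * df) for i in range(steps)) * df

K = 1.0  # arbitrary constant of proportionality (assumed)

def pink(f):
    return K / f

def white(f):
    return K

octaves = [(100, 200), (200, 400), (400, 800)]
pink_powers = [band_power(pink, lo, hi) for lo, hi in octaves]
white_powers = [band_power(white, lo, hi) for lo, hi in octaves]
print(pink_powers)   # each close to K * log(2)
print(white_powers)  # doubles with every octave
```

Pink noise puts the same power in every octave, while white noise puts twice as much power in each octave as in the one below it.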
 

Following the analogy that frequency is to sound as color is to light, pink noise is pink because it is a little bit white, since all audible frequencies are present, and also a little bit red, since it is mostly composed of low frequencies. (Low frequency visible light is seen as red.) Pink noise sounds richer than white noise, like a waterfall or blowing wind. White noise sounds brighter, but also more harsh than pink noise, like steam escaping a radiator or, as a famous biologist once said, like "innumerable mice eating Rice Crispies". You might also like to listen to some applause for a non-mathematical, real world example of noise. You deserve it.

white noise pink noise concert applause

consonance & dissonance

Sounds that lend themselves to music are those that sound pleasant (or at least tolerable) when played in sequence (called melody) or in unison (called harmony).

high degrees of consonance (shared harmonics)

dissonance

Consonances are sometimes described as being inherently more pleasant to the ear and dissonances as less pleasant.

Musical notes that are consonant share a large number of overtones. In the just intonation scale the octave is the most consonant interval followed by the perfect fifth, perfect fourth, major third and sixth, and major seventh. No other set of intervals shares such a high degree of consonance.

Notes separated by an octave sound similar — like two people with different voices trying to sing the same note.

overtones line up

Helmholtz used the German word oberton which literally means "upper tone". Someone transliterated the word (instead of translating it) and oberton became overtone in English by accident. If Helmholtz had meant overtone he would have said überton.

[slide]

Stuff

This is where the physics ends and the music theory and abstract algebra (group theory and combinatorics) start to take over.

musical scales

The foundation of music is the musical note: a combination of pitch (the musical word for frequency) and duration (the ordinary word for amount of time).

Did I say music was based on notes? That's not true. Real music is based on intervals (the ratio of any two pitches) with high degrees of consonance (shared harmonics).

A scale is a set of pitches (pitch classes, more precisely) arranged in order of increasing frequency from which notes are selected and arranged to create a musical composition. The exact pitches used in any scale are determined by the starting pitch (called the tonic), a set of rules for generating intervals (called a tuning system), and the pattern of intervals selected (sometimes called the mode).

Western music theorists identified eight basic intervals defined by their relative size …

  1. tonic or unison — an interval of one to one, perfect consonance
  2. second
  3. third
  4. fourth
  5. fifth — the third most consonant interval
  6. sixth
  7. seventh
  8. octave — an interval of two to one, half of all overtones match

The intervals of adjacent pitches in a scale often come in two sizes called whole tones and semitones (or half tones). The relationship between these should be obvious — two semitone intervals applied in succession equal a whole tone — but it turns out to be only approximately true in most tuning systems. Combinations of tones and semitones may also be named by size …

Each interval also needs an adjective to describe its quality …

Many, many, many interval patterns are used in musical composition. (That's too many manys.) A convenient way to organize them is by the number of intervals they contain. Here's a list of the interval patterns that will be discussed in this book.

  1. pentatonic
  2. hexatonic
  3. heptatonic
  4. octatonic
  5. dodecatonic

Since notes separated by an octave sound similar, a musical scale can be completely described by a set of intervals with ratios ranging from 1:1 to 2:1. If you need notes above the octave, just double the intervals. If you need notes higher than that, double the intervals again. If you need notes lower than the tonic, take all your intervals and halve them. If you need notes lower than that, halve all the intervals again. You get the idea. This is called octave duplication or circularity of pitch.
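Octave duplication is easy to express as a procedure: keep doubling or halving a ratio until it falls inside a single octave. The sketch below uses exact fractions; the function name `reduce_to_octave` and the choice of the half-open range [1, 2) are assumptions made for the example.

```python
from fractions import Fraction

def reduce_to_octave(ratio):
    """Double or halve a positive frequency ratio until it lies in [1, 2)."""
    ratio = Fraction(ratio)
    if ratio <= 0:
        raise ValueError("frequency ratios must be positive")
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

# A ratio of 3:1 and a ratio of 3:4 both belong to the same pitch class as 3:2.
print(reduce_to_octave(Fraction(3, 1)), reduce_to_octave(Fraction(3, 4)))
```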

Pitches separated by an octave form a pitch class. They are always named using an uppercase letter (A, B, C, D, E, F, G) and sometimes include a modifier symbol called an accidental (♯, ♭, ♮).

symbol  name     description
♯       sharp    raises the pitch by one semitone
♭       flat     lowers the pitch by one semitone
♮       natural  cancels any previously applied accidentals
Accidentals

Adding a sharp to a pitch is the same as adding a flat to the next higher pitch (usually). F♯ is the same as G♭, for example. This is known as enharmonic equivalence. Similarly, adding a double sharp or a double flat to a lettered pitch changes the pitch by one letter (usually). C♯♯ is the same as D, and D♭♭ is the same as C. I keep saying "usually" because there are exceptions around the notes B♯, C♭, E♯, F♭. A piano keyboard is a familiar way to display all the pitch classes.

Enharmonic equivalence is not a given. There are some tuning systems where the distinction between F♯ and G♭ is real. That's part of the reason we have two names for the same pitch today. Thankfully, most of us will never need to know what that distinction is. I have read explanations, but I have never understood them. I think most formally trained Western musicians today would probably say the same thing. Fans of Renaissance and Baroque music would be the exception.

Want to know whether you should say A♯ or B♭? Just look at the pitch classes of the scale used to create your musical composition. If you say any letter twice, you've probably done it wrong. For example …

F, G, A, B♭, C, D, E

is correct, but …

F, G, A, A♯, C, D, E

… is wrong because the letter A appeared twice. This is not a perfect rule, but it works for all the diatonic scales, the most used scales in Western music. (Non-Western music theory is beyond the scope of this book. The same goes for microtonal music, which uses intervals smaller than a semitone.)
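The letter rule is simple enough to check mechanically. This is a minimal sketch; the function name `repeated_letters` is hypothetical, and it assumes every pitch class is written as a letter followed by optional accidentals.

```python
def repeated_letters(pitch_classes):
    """Return the letters that appear more than once in a list of pitch names."""
    letters = [name[0] for name in pitch_classes]
    return sorted({letter for letter in letters if letters.count(letter) > 1})

good = ["F", "G", "A", "B♭", "C", "D", "E"]
bad = ["F", "G", "A", "A♯", "C", "D", "E"]
print(repeated_letters(good))  # no repeats, so the spelling passes
print(repeated_letters(bad))   # the letter A is used twice
```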

name pitch classes type
A minor diatonic A, B, C, D, E, F, G heptatonic
B♭ major diatonic B♭, C, D, E♭, F, G, A heptatonic
C major diatonic C, D, E, F, G, A, B heptatonic
C♯ string of pearls C♯, D♯, E, F♯, G, A, B♭, C octatonic
D♭ whole tone D♭, E♭, F, G, A, B hexatonic
E minor blues E, G, A, C, D pentatonic
F chromatic F, F♯, G, G♯, A, A♯, B, C, C♯, D, D♯, E dodecatonic
A sampling of different scales

just intonation

An interval is the ratio of any two pitches in a scale. Consonance occurs when the overtones of one pitch coincide with the overtones of another. The intervals with the highest degree of consonance are ratios of small whole numbers (like 2, 3, 5) or small powers of these (like 4, 8, 9, 16, 25, 27) or small products of these (like 6, 10, 12, 15, 18, 20). The more prime factors present in a ratio, the less consonant it sounds. A perfect fifth (3/2) is more consonant than a major seventh (15/8 = 3·5/2·2·2) which is much more consonant than a major chromatic semitone (135/128 = 3·3·3·5/2·2·2·2·2·2·2). A tuning system built on these principles is described as having just intonation. The word "just" here implies that the pitches generated are the "right" ones.
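Counting prime factors is a quick way to compare the three intervals named above. The sketch below counts factors with multiplicity, as the expansions in the text do; treating that count as a rough consonance score is an assumption for illustration, and the function names are made up.

```python
from fractions import Fraction

def prime_factors(n):
    """Return the prime factors of a positive integer, with repeats."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def factor_count(ratio):
    """Total number of prime factors in the numerator and denominator."""
    ratio = Fraction(ratio)
    return len(prime_factors(ratio.numerator)) + len(prime_factors(ratio.denominator))

for name, ratio in [("perfect fifth", Fraction(3, 2)),
                    ("major seventh", Fraction(15, 8)),
                    ("major chromatic semitone", Fraction(135, 128))]:
    print(name, ratio, factor_count(ratio))
```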

Just intonation is a simple way to build a musical scale. It's the natural tuning system people use when they sing together unaccompanied by musical instruments (a cappella). There are many types of scales with just intonation. I will discuss a few of the basic ones. I am showing them on a piano keyboard since it is a familiar instrument, but pianos are not normally tuned this way. (Pianos use equal temperament, which will be described next.)

diatonic scales

The scale shown below is a type of just intonation scale starting on C. Since the third interval (5/4 = 1.25) is "large", the scale is major. Music performed in a major key tends to sound bright, happy, or triumphant. This is true whether the scale was constructed using just intonation or some other scheme. This is probably a function of culture more than anything else. (Happiness isn't something physicists go around measuring — although we do discuss brightness from time to time and in a different context.)

Scales like this one are given the name diatonic — from the Greek phrase δια τονικη (dia tonike), which means "across the tones". The implication is that all the pitches you need are contained in this scale. That's not quite right, however. (More on this later.)

This particular scale is also heptatonic, which means it contains seven intervals: three major whole tones (9/8), two minor whole tones (10/9), and two semitones (16/15). The ratios of adjacent pitches in a major scale follow this order …

  1. major whole tone
  2. minor whole tone
  3. semitone
  4. major whole tone
  5. minor whole tone
  6. major whole tone
  7. semitone
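Multiplying the seven steps together, one after another, rebuilds the intervals of the major scale relative to the tonic. This sketch takes the major whole tone as 9/8, the minor whole tone as 10/9, and the semitone as 16/15; the names `MAJOR_STEPS` and `build_scale` are assumptions for the example.

```python
from fractions import Fraction

MAJOR_TONE = Fraction(9, 8)
MINOR_TONE = Fraction(10, 9)
SEMITONE = Fraction(16, 15)

# Step pattern of the major scale, in the order listed above.
MAJOR_STEPS = [MAJOR_TONE, MINOR_TONE, SEMITONE,
               MAJOR_TONE, MINOR_TONE, MAJOR_TONE, SEMITONE]

def build_scale(steps):
    """Accumulate adjacent-pitch ratios into intervals measured from the tonic."""
    ratios = [Fraction(1)]
    for step in steps:
        ratios.append(ratios[-1] * step)
    return ratios

major_scale = build_scale(MAJOR_STEPS)
print([str(r) for r in major_scale])
```

The last entry comes out to exactly 2, so the seven steps span one octave.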

There's no reason I have to start playing a scale on C like the diagrams above show. Why not start playing on A as the diagrams below show? This gets you through an octave with intervals that are still fairly nice. Since the third interval (6/5 = 1.2) of this scale is "small", the scale is said to be minor. Music performed in a minor key tends to sound dark, sad, moody, or introspective. Why this happens is outside the realm of physics. It also contains a particularly harsh interval that is close to a fourth (27/20 ≈ 4/3). Since the fourth is a perfect interval, that must make this an imperfect fourth. It's an interesting example of how being close is not the same thing as being close enough. A perfect fourth sounds pleasant. An imperfect fourth sounds harsh.

This scale is also heptatonic and diatonic. The ratios of adjacent pitches in a minor scale follow this order …

  1. major whole tone
  2. semitone
  3. major whole tone
  4. minor whole tone
  5. semitone
  6. major whole tone
  7. minor whole tone

In general, a diatonic scale is any scale that can be played on the white keys of a piano. Since there are seven intervals in a diatonic scale, there are seven pitches to start out on on the piano's white keys (A, B, C, D, E, F, and G). Starting on C gives you the major diatonic scale. Starting on A gives you a minor diatonic scale. The remaining five scales are used to lesser degrees in Western music past and present, with medieval Christian chant being the example most frequently cited. All seven diatonic scales (or modes as they are called) have names corresponding to regions in ancient Greece. The assignment of Greek place name to diatonic mode is completely arbitrary, however. If there ever was any meaning to this naming convention, it's been lost to time.

  aeolian* locrian ionian** dorian phrygian lydian mixolydian
pitch A B C D E F G
tonic 1/1 1/1 1/1 1/1 1/1 1/1 1/1
second 9/8 16/15 9/8 10/9 16/15 9/8 10/9
third 6/5 6/5 5/4 32/27 6/5 5/4 5/4
fourth 27/20 4/3 4/3 4/3 4/3 45/32 4/3
fifth 3/2 64/45 3/2 40/27 3/2 3/2 3/2
sixth 8/5 8/5 5/3 5/3 8/5 27/16 5/3
seventh 9/5 16/9 15/8 16/9 9/5 15/8 16/9
octave 2/1 2/1 2/1 2/1 2/1 2/1 2/1
Diatonic modes and their intervals (just intonation)
* minor scale ** major scale 32/27 small minor third
27/20 imperfect fourth 45/32 augmented fourth 64/45 diminished fifth
40/27 imperfect fifth 27/16 large major sixth 9/5 large minor seventh
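Every column of the table can be generated from the white-key step pattern by starting at a different key and wrapping around. This is a sketch under that assumption; the names `STEPS_FROM_C`, `MODE_NAMES`, and `mode_intervals` are invented for the example.

```python
from fractions import Fraction

# Ratios between adjacent white keys, starting from C (just intonation).
STEPS_FROM_C = [Fraction(9, 8), Fraction(10, 9), Fraction(16, 15),
                Fraction(9, 8), Fraction(10, 9), Fraction(9, 8), Fraction(16, 15)]

# Modes in white-key order: C, D, E, F, G, A, B.
MODE_NAMES = ["ionian", "dorian", "phrygian", "lydian",
              "mixolydian", "aeolian", "locrian"]

def mode_intervals(start):
    """Intervals above the tonic for the mode starting on white key `start` (C = 0)."""
    steps = STEPS_FROM_C[start:] + STEPS_FROM_C[:start]
    ratios = [Fraction(1)]
    for step in steps:
        ratios.append(ratios[-1] * step)
    return ratios

modes = {name: mode_intervals(i) for i, name in enumerate(MODE_NAMES)}
for name, ratios in modes.items():
    print(name, [str(r) for r in ratios])
```

The unusual intervals in the legend (27/20, 45/32, 64/45, and so on) fall out of the rotation without any special handling.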

chromatic scales

Let's return to the C major scale. I find that one more interesting because of what happens when you invert the intervals. Instead of moving from the tonic to the pitch, let's move from the pitch to the octave.

To go from C to G on this scale is a jump up of a fifth (multiply by 3/2). From G to the next C is a jump of a fourth (multiply by 4/3). This means the fourth is the inversion of the fifth.

fifth × fourth = octave
3/2 × 4/3 = 2/1

and the fifth is the inversion of the fourth, which sounds like a statement of the obvious.

fourth × fifth = octave
4/3 × 3/2 = 2/1

In general, any pair of intervals that together results in a jump of one octave is called an inversion.

interval × inversion = octave

That means a third and a sixth are an inversion.

third × sixth = octave
5/4 × 8/5 = 2/1

and so are a sixth and a third, which should also be obvious except …

sixth × third = octave
5/3 × 6/5 = 2/1

Wait a minute. What just happened? The first time I wrote the third, I wrote 5/4. The second time I wrote 6/5. Last time I checked, those were different numbers. I did something similar with the sixth (8/5 ≠ 5/3). Welcome to the world of just intonation, where intervals in one direction on a scale do not necessarily equal those in the inverse direction.

The tonic, fourth, fifth and octave are not affected by inversion, which is why they are called perfect intervals. The fourth and fifth are given this as a title — as in, "a perfect fourth" or "a perfect fifth". For some reason no one says "a perfect tonic". Maybe because it sounds like the name of a cocktail. No one says "a perfect octave" either, but I don't have a joke for that one.

The second, third, sixth, and seventh are made smaller by inversion. An interval used to raise a pitch above the tonic is always larger than the same interval used to take the new pitch to the octave. As is tradition, larger intervals are said to be major and smaller ones minor.

interval inversion
tonic 1/1 = 1 octave 2/1 = 2
major second 9/8 = 1.125 minor seventh 16/9 = 1.777…
major third 5/4 = 1.25 minor sixth 8/5 = 1.6
perfect fourth 4/3 = 1.333… perfect fifth 3/2 = 1.5
tritone* ? = ? tritone* ? = ?
perfect fifth 3/2 = 1.5 perfect fourth 4/3 = 1.333…
major sixth 5/3 = 1.666… minor third 6/5 = 1.2
major seventh 15/8 = 1.875 minor second 16/15 = 1.066…
octave 2/1 = 2 tonic 1/1 = 1
Intervals and Inversions (just intonation)
* augmented fourth or diminished fifth diatonic semitone
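Since an interval times its inversion equals an octave, the inversion is just 2 divided by the interval. The sketch below checks a few rows of the table; the function name `inversion` is an assumption for the example.

```python
from fractions import Fraction

def inversion(interval):
    """Return the interval that completes the octave: interval * inversion = 2."""
    return Fraction(2) / Fraction(interval)

for name, ratio in [("major second", Fraction(9, 8)),
                    ("major third", Fraction(5, 4)),
                    ("perfect fourth", Fraction(4, 3)),
                    ("major sixth", Fraction(5, 3)),
                    ("major seventh", Fraction(15, 8))]:
    print(name, ratio, "->", inversion(ratio))
```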

Something interesting happens when you merge the perfect, major, and minor intervals together into a scale. The minor intervals fill in the "spaces" between pitches separated by a whole tone. (The exception to this is the gap between the perfect fourth and the perfect fifth.) The resulting scale has twelve intervals, each one a semitone away from its neighbor. Such a scale is said to be chromatic — from χρωματος (chromatos), the Greek word for color. Since color is to vision as pitch is to hearing, the metaphor is entirely appropriate. The chromatic scale is more "colorful" than the diatonic scale.

The intervals of the chromatic scale are too numerous to fit nicely on my piano keyboard illustration. Instead, here's a table showing both the intervals with respect to the tonic and the intervals with respect to the preceding pitch in the just intonation chromatic scale.

interval            to tonic             to predecessor        semitone
tonic               1/1 = 1
minor second¹       16/15 = 1.066…       16/15 = 1.066…        diatonic
major second²       9/8 = 1.125          135/128 = 1.054…      major
minor third         6/5 = 1.2            16/15 = 1.066…        diatonic
major third         5/4 = 1.25           25/24 = 1.041…        minor
perfect fourth      4/3 = 1.333…         16/15 = 1.066…        diatonic
tritone³            64/45 = 1.422…       16/15 = 1.066…        diatonic
perfect fifth       3/2 = 1.5            135/128 = 1.054…      major
minor sixth         8/5 = 1.6            16/15 = 1.066…        diatonic
major sixth         5/3 = 1.666…         25/24 = 1.041…        minor
minor seventh       16/9 = 1.777…        16/15 = 1.066…        diatonic
major seventh       15/8 = 1.875         135/128 = 1.054…      major
octave              2/1 = 2              16/15 = 1.066…        diatonic
Intervals of the chromatic scale (just intonation)
¹ diatonic semitone
² major whole tone
³ augmented fourth or diminished fifth
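The "predecessor" column is just each ratio divided by the one before it. Here's a short Python sketch that recomputes it from the "tonic" column. The helper name `steps` and the lookup table `SEMITONE_NAMES` are my own illustrative choices.

```python
from fractions import Fraction as F

# Chromatic just intonation scale, ratios to the tonic (from the table).
CHROMATIC = [F(1), F(16, 15), F(9, 8), F(6, 5), F(5, 4), F(4, 3), F(64, 45),
             F(3, 2), F(8, 5), F(5, 3), F(16, 9), F(15, 8), F(2)]

# The three semitone sizes that appear in the table.
SEMITONE_NAMES = {F(16, 15): "diatonic", F(135, 128): "major", F(25, 24): "minor"}

def steps(scale):
    """Ratio of each pitch to its predecessor."""
    return [hi / lo for lo, hi in zip(scale, scale[1:])]

for step in steps(CHROMATIC):
    print(step, SEMITONE_NAMES[step])
```

Every one of the twelve steps lands on one of the three named semitones, which is a decent sanity check that the table hangs together.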

A word about semitones. There are several of them. An interval of 16/15 is called a diatonic semitone. It separates the major third from the perfect fourth and the major seventh from the octave in the major diatonic scale. When we merged the perfect and major intervals of the diatonic scale with their inversions (the minor intervals) the diatonic semitone popped up in four new locations: tonic to minor second, major second to minor third, perfect fifth to minor sixth, and major sixth to minor seventh. It's a popular interval. The chromatic scale has two other semitones: a larger or major chromatic semitone equal to 135/128 and a smaller or minor chromatic semitone equal to 25/24.

size interval
25/24 = 1.041… minor chromatic semitone
135/128 = 1.054… major chromatic semitone
16/15 = 1.066… diatonic semitone
10/9 = 1.111… minor whole tone
9/8 = 1.125 major whole tone
Whole tones and semitones (just intonation)

Combinations of chromatic and diatonic semitones make whole tones — majors make majors and minors make minors. In a sense, the chromatic semitones are inversions of the diatonic semitone over the span of a whole tone.

         chromatic semitone  ×  diatonic semitone  =  whole tone
minor    25/24               ×  16/15              =  10/9
major    135/128             ×  16/15              =  9/8
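The two products are quick to confirm with exact arithmetic. This is only a sketch; the variable names are invented for the example.

```python
from fractions import Fraction as F

DIATONIC_SEMITONE = F(16, 15)
CHROMATIC_SEMITONES = {"minor": F(25, 24), "major": F(135, 128)}

# chromatic semitone x diatonic semitone = whole tone
whole_tones = {kind: semitone * DIATONIC_SEMITONE
               for kind, semitone in CHROMATIC_SEMITONES.items()}

for kind, tone in whole_tones.items():
    print(kind, tone)
```

Read the other way, dividing a whole tone by the diatonic semitone gives back the matching chromatic semitone, which is the sense in which the two are "inversions" over a whole tone.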

Which brings us to the tritone — the bad boy of the chromatic scale, stuck between the beauty and perfection of the fourth and fifth. We know something belongs there (the size of the interval demands it) but what? All the other intervals arose naturally. That should really say "naturally" in scare quotes. How natural is this process? You shall be born by the forced union of the perfect fourth and a semitone … But which one?

  semitone  ×  fourth  =  tritone
minor 25/24  ×  4/3  =  25/18
major 135/128  ×  4/3  =  45/32
diatonic 16/15  ×  4/3  =  64/45

Again I ask … Which one? The perfect fifth and the perfect fourth are inversions of one another. The tritone lies between these, which means it would have to be its own inversion. Do any of these satisfy that requirement? If so, that's a good sign it should be our tritone.

           octave  ÷  tritone  =  inversion
minor      2/1     ÷  25/18    =  36/25
major      2/1     ÷  45/32    =  64/45
diatonic   2/1     ÷  64/45    =  45/32

Well that's no good. Two of the tritones are inversions of each other, which doesn't do us any good, and the inversion of the third one looks like it just gave us another tritone. We're going about this the wrong way. Instead of testing fractions hoping one will work out, maybe we should use algebra and just go to the answer directly. What number, x, equals itself when divided into 2?

2/x = x  ⇒  x² = 2  ⇒  x = √2

I didn't expect the square root of two.

Dun, dun, duuuuuunnnnn! Nobody expects the square root of two! Its chief weapon is surprise, fear and surprise; two chief weapons; fear, surprise, and irrationality! Uh, among its chief weapons are: fear, surprise, irrationality, and an almost fanatical inability to be written as a fraction! Uh, I'll come in again...

size               type
25/18 = 1.388…     minor augmented fourth
45/32 = 1.406…     major augmented fourth
√2    = 1.414…     ideal tritone¹
64/45 = 1.422…     minor diminished fifth²
36/25 = 1.44       major diminished fifth
Tritones of all sorts
¹ equal tempered tritone
² just intonation tritone

After playing with these intervals for what seems like days, I think I'll just go with 64/45, which (along with its inversion, 45/32) lies closest to the √2 ideal. It seems to play nicely with the other intervals, or at least it doesn't torture them much.
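For anyone who wants to replay the tritone hunt, here's a rough Python sketch. It builds each candidate from the perfect fourth and a semitone, adds the inversions, and measures how far each one sits from √2. The `cents` helper (1200 times the base 2 logarithm of the ratio) is my addition, not something used above; it's a common logarithmic yardstick on which √2 is exactly 600.

```python
import math
from fractions import Fraction as F

FOURTH = F(4, 3)
OCTAVE = F(2)
SEMITONES = {"minor": F(25, 24), "major": F(135, 128), "diatonic": F(16, 15)}

def cents(ratio):
    """Interval size in cents (1200 per octave), a logarithmic measure."""
    return 1200 * math.log2(float(ratio))

# Each candidate tritone (fourth x semitone) together with its inversion.
candidates = set()
for semitone in SEMITONES.values():
    tritone = FOURTH * semitone
    candidates.add(tritone)
    candidates.add(OCTAVE / tritone)

for ratio in sorted(candidates):
    offset = cents(ratio) - 600  # sqrt(2) is half an octave: exactly 600 cents
    print(f"{str(ratio):>6}  {float(ratio):.4f}  {offset:+.1f} cents")
```

Because 45/32 and 64/45 are inversions of one another, they sit the same logarithmic distance from √2, one below and one above. None of the four candidates is its own inversion, which is the whole problem.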

pentatonic scales

Truth is ever to be found in simplicity, & not in ye multiplicity & confusion of things.

Isaac Newton, ca. 1680

If the chromatic scale is all about multiplicity, the pentatonic scale is all about simplicity.

The semitone is one of the least consonant intervals. The chromatic scale is nothing but semitones it seems. The chromatic scale is the mother of all scales as a result. Semitones are the price you pay for that amount of versatility. Diatonic scales are considered simple, but they all have two semitones in them. That's two opportunities for a little bit of dissonance to sneak in. What does the risk averse musician do? If you fear semitones or just don't like them, you could eliminate them. Music composed without semitones would be always mellow and never harsh. Since one of the purposes of music seems to be to "sooth a savage breast", let's do it. Let's just get rid of all the semitones.

Return to the C major diatonic scale again. (My favorite starting point.) Take the diatonic scale and drop the fourth and seventh pitch classes, the ones adjacent to the semitone intervals. This gives you a scale with five pitches — a pentatonic scale.

A pentatonic scale is like a diatonic scale in that it is made up of large and small intervals. In the diatonic scale, the large intervals are whole tones and the small intervals are semitones. In the pentatonic scale, the large intervals are thirds (minor thirds for this example) and the small intervals are whole tones. The ratios of adjacent pitches on this pentatonic scale follow this order …

  1. major whole tone
  2. minor whole tone
  3. minor third
  4. minor whole tone
  5. minor third
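That list of steps can be reproduced by starting from the just intonation C major scale, dropping F and B, and dividing neighbors. A minimal sketch, with names (`DIATONIC`, `pentatonic`, `steps`) chosen only for this example:

```python
from fractions import Fraction as F

# C major diatonic scale in just intonation (ratios to the tonic).
DIATONIC = {"C": F(1), "D": F(9, 8), "E": F(5, 4), "F": F(4, 3),
            "G": F(3, 2), "A": F(5, 3), "B": F(15, 8)}

# Drop the fourth (F) and the seventh (B), then close the scale at the octave.
pentatonic = [r for note, r in DIATONIC.items() if note not in ("F", "B")] + [F(2)]

# Ratio of each pitch to the one before it.
steps = [hi / lo for lo, hi in zip(pentatonic, pentatonic[1:])]

NAMES = {F(9, 8): "major whole tone", F(10, 9): "minor whole tone", F(6, 5): "minor third"}
for step in steps:
    print(step, NAMES[step])
```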

This kind of pentatonic scale was built using the white keys of the piano. Starting only on the white keys, there are five allowed pentatonic modes. The modes are identified by the location of the missing intervals (the ones that were semitones in the diatonic scale).

           minor        major        egyptian   minor    major
           pentatonic   pentatonic              blues    blues
pitch      A            C            D          E        G
tonic      1/1          1/1          1/1        1/1      1/1
second     omit         9/8          10/9       omit     10/9
third      6/5          5/4          omit       6/5      omit
fourth     27/20        omit         4/3        4/3      4/3
fifth      3/2          3/2          40/27      omit     3/2
sixth      omit         5/3          omit       8/5      5/3
seventh    9/5          omit         16/9       9/5      omit
octave     2/1          2/1          2/1        2/1      2/1
Pentatonic modes on the white keys (just intonation)
27/20 imperfect fourth, 40/27 imperfect fifth, 9/5 large minor seventh
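Each column of the white-key table is the same cycle of five steps started from a different pitch. Here's a sketch that rebuilds a column by rotating the step pattern and multiplying as it goes. The function name `mode` and the index convention (0 = C, 1 = D, 2 = E, 3 = G, 4 = A) are assumptions made for the example, and `accumulate(..., initial=...)` needs Python 3.8 or later.

```python
from fractions import Fraction as F
from itertools import accumulate
from operator import mul

# Step pattern of the white-key pentatonic scale starting on C.
STEPS = [F(9, 8), F(10, 9), F(6, 5), F(10, 9), F(6, 5)]
PITCHES = ["C", "D", "E", "G", "A"]

def mode(start):
    """Ratios to the tonic for the mode that starts on PITCHES[start]."""
    rotated = STEPS[start:] + STEPS[:start]
    # Running product of the steps, beginning from the tonic (1/1).
    return list(accumulate(rotated, mul, initial=F(1)))

for i, pitch in enumerate(PITCHES):
    print(pitch, [str(r) for r in mode(i)])
```

The odd-looking 27/20 and 40/27 fall out on their own when the rotation starts on A or D, so they are consequences of the just intonation step sizes rather than separate choices.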

The black keys can also be used to make a pentatonic scale.

This gives a slightly different ratio of adjacent pitches and one minor third that's a little bit flatter than normal (32/27 = 1.185 … is a little bit smaller than 6/5 = 1.2). It still has the same overall composition of three whole tones and two minor thirds.

  1. major whole tone
  2. small minor third
  3. major whole tone
  4. minor whole tone
  5. minor third

Starting on the black keys, there are five pentatonic modes. Again, the modes are identified by the locations of the missing intervals.

           minor    major    minor        major        egyptian
           blues    blues    pentatonic   pentatonic
pitch      A♯       C♯       D♯           F♯           G♯
tonic      1/1      1/1      1/1          1/1          1/1
second     omit     9/8      omit         9/8          10/9
third      6/5      omit     32/27        5/4          omit
fourth     27/20    4/3      4/3          omit         4/3
fifth      omit     3/2      40/27        3/2          3/2
sixth      8/5      5/3      omit         27/16        omit
seventh    9/5      omit     16/9         omit         16/9
octave     2/1      2/1      2/1          2/1          2/1
Pentatonic modes on the black keys (just intonation)
32/27 small minor third, 27/20 imperfect fourth, 40/27 imperfect fifth, 27/16 large major sixth, 9/5 large minor seventh

On a piano: the black keys are pentatonic, the white keys are diatonic, and all the keys are chromatic.

equal temperament

The chromatic circle is a geometric way to display all the pitch classes. This way is sometimes better for showing the enharmonic relationships. The space between adjacent pitch classes is a semitone.

pythagoras' circle of fifths

Pythagoras of Samos, Greece (582–496 BCE) was the first to try to describe music with a mathematical system, called the circle of fifths (or cycle of fifths). Start with the tonic. Multiply by a perfect fifth (3/2). Do it again. This puts you into the next octave. Bring it down an octave (multiply by 1/2) so you can keep building your scale. Well …

3/2 × 1/2 = 3/4

is the inverse of 4/3, an interval with a great deal of consonance. When you completely build the scale, the ratio 4/3 turns out to be the fourth interval in the series of eight that make up an octave. Thus the name fourth. The fifth and the fourth are inversions of one another in an octave. They are the only intervals that work out this way. That makes them special, in my mind, but the adjective that was ascribed to them was perfect. Thus the intervals 4/3 and 3/2 are called the perfect fourth and perfect fifth, respectively.

So here's the plan again: start with the tonic, bring it up a perfect fifth, take it down a perfect fourth, and repeat until the ratio equals an octave (2/1).

[slide]

We'll start on C since that's the middle of the modern piano. Behold!

F  : C  =  (3/2)^−1 × (1/2)^0  =  2/3            =  0.6666666666666…
C  : C  =  (3/2)^0  × (1/2)^0  =  1/1            =  1   ← start here
G  : C  =  (3/2)^1  × (1/2)^0  =  3/2            =  1.5
D  : C  =  (3/2)^2  × (1/2)^1  =  9/8            =  1.125
A  : C  =  (3/2)^3  × (1/2)^1  =  27/16          =  1.6875
E  : C  =  (3/2)^4  × (1/2)^2  =  81/64          =  1.265625
B  : C  =  (3/2)^5  × (1/2)^2  =  243/128        =  1.8984375
F♯ : C  =  (3/2)^6  × (1/2)^3  =  729/512        =  1.423828125
C♯ : C  =  (3/2)^7  × (1/2)^4  =  2187/2048      =  1.06787109375
G♯ : C  =  (3/2)^8  × (1/2)^4  =  6561/4096      =  1.601806640625
D♯ : C  =  (3/2)^9  × (1/2)^5  =  19683/16384    =  1.2013549804688…
A♯ : C  =  (3/2)^10 × (1/2)^5  =  59049/32768    =  1.8020324707031…
E♯ : C  =  (3/2)^11 × (1/2)^6  =  177147/131072  =  1.3515243530273…
B♯ : C  =  (3/2)^12 × (1/2)^6  =  531441/262144  =  2.0272865295410…
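The whole column of ratios can be generated by stacking fifths and folding each result back into a single octave. Here's a minimal Python sketch. One difference from the hand calculation, and it's my choice rather than anything stated above: the helper `fold_into_octave` (a made-up name) always lands in the range from 1 up to, but not including, 2. That means F comes out as 4/3 instead of 2/3, and B♯ comes out just above 1 instead of just above 2.

```python
from fractions import Fraction as F

FIFTH = F(3, 2)

def fold_into_octave(ratio):
    """Halve or double a ratio until it lies in the range [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

# One fifth below C through twelve fifths above C.
NAMES = ["F", "C", "G", "D", "A", "E", "B", "F♯", "C♯", "G♯", "D♯", "A♯", "E♯", "B♯"]
scale = {name: fold_into_octave(FIFTH ** n) for n, name in enumerate(NAMES, start=-1)}

# Print the pitches in ascending order within the octave.
for name, ratio in sorted(scale.items(), key=lambda item: item[1]):
    print(f"{name:3s} {str(ratio):>14s}  {float(ratio):.10f}")
```

Sorting by ratio puts E♯ right next to F and B♯ right next to C, which makes the near misses easy to spot.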

Uh oh. For those of you familiar with the piano, you will note that the errors occur at notes that do not exist on the keyboard. There is no separate key for B♯ (it should land on C) or E♯ (it should land on F).

interval pythagorean ratio interval name
C :  C 1 tonic
C♯ :  C 1.06787109375 minor second
D :  C 1.125 major second
D♯ :  C 1.2013549804688… minor third
E :  C 1.265625 major third
F :  C 1.3333333333333… perfect fourth
E♯ :  C 1.3515243530273… bigger than a perfect fourth
F♯ :  C 1.423828125 tritone*
G :  C 1.5 perfect fifth
G♯ :  C 1.601806640625 minor sixth
A :  C 1.6875 major sixth
A♯ :  C 1.8020324707031… minor seventh
B :  C 1.8984375 major seventh
B♯ :  C 2.0272865295410… bigger than an octave
The Pythagorean intervals sorted and named
* augmented fourth or diminished fifth

It should really be called the spiral of fifths since it never closes up. One complete lap around the circle should equal a whole number of octaves. That turns out to be 12 fifths and 7 octaves. But 12 fifths is a bit larger than 7 octaves. This discrepancy is known as the Pythagorean comma and is equal to …

B♯ : C  =  (3/2)^12 × (1/2)^7  =  531441/524288  =  1.0136432647705…

Thus B♯ is a bit higher than C by about one-quarter of a semitone. This is a small difference that would be audible to trained ears were the two notes to be played in succession. Ordinary folks might not perceive the difference at all. Play them together as a part of a chord and your ears would definitely not enjoy it.

ƒ_B♯ − ƒ_C  =  ƒ_beat
(256 Hz)(1.013643264771 − 1)  ≈  3.5 Hz

The dissonance would be audible as a 3.5 Hz beat for C = 256 Hz. No musician would ever want to be this far out of tune and no audience would want to listen to them.
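The comma and the beat frequency both take only a few lines to check. In this sketch the conversion to cents (1200 times the base 2 logarithm, so 100 cents per equal tempered semitone) is my addition, included to test the "one-quarter of a semitone" estimate. The 256 Hz value for C is the one used above.

```python
import math
from fractions import Fraction as F

# Twelve perfect fifths up, seven octaves down.
comma = F(3, 2) ** 12 / F(2) ** 7

# Interval size in cents (100 cents per equal tempered semitone).
comma_cents = 1200 * math.log2(float(comma))

# Beat frequency when B♯ and C sound together, taking C = 256 Hz.
f_c = 256.0
f_beat = f_c * (float(comma) - 1)

print(comma)  # 531441/524288
print(f"{comma_cents:.2f} cents")
print(f"{f_beat:.2f} Hz")
```

The comma works out to a little under 23.5 cents, so "about one-quarter of a semitone" is a fair description.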