Music and Noise

Glenn Elert

Discussion

definitions

The distinction between music and noise is mathematical form. Music is ordered sound. Noise is disordered sound.

Music and noise are both mixtures of sound waves of different frequencies. The component frequencies of music are discrete (separable) and rational (their ratios form simple fractions) with a discernible dominant frequency. The component frequencies of noise are continuous (every frequency will be present over some range) and random (described by a probability distribution) with no discernible dominant frequency.

music

periodic sound waves

Sound is a longitudinal wave, which means the particles of the medium vibrate parallel to the direction of propagation of the wave. A sound wave coming out of a musical instrument, loudspeaker, or someone's mouth pushes the air forward and backward as the sound propagates outward. This has the effect of squeezing and pulling on the air, changing its pressure only slightly. These pressure variations can be detected by the ear drum (a light flexible membrane) in the middle ear, translated into neural impulses in the inner ear, and sent on to the brain for processing. They can also be detected by the diaphragm of a microphone (a light, flexible membrane), translated into an electrical signal by any one of several electromechanical means, and sent on to a computer for processing. The processing done in the brain is very sophisticated, but the processing done by a computer is relatively simple. The pressure variations of a sound wave are changed into voltage variations in the microphone, which are sampled periodically and rapidly by a computer and then saved as numbers.

A graph of microphone voltage vs. time (called a waveform) is a convenient way to use a computer to see sound. Before the rise of ubiquitous digital computers, waveforms were often analyzed electronically using an oscilloscope — a cathode ray tube with an electron beam that traced voltage as a function of time on a fluorescent glass screen. Oscilloscopes are basically simplified televisions with one purpose (to draw a time series or parametric graph) and one color (usually bright green). This task is easily mimicked by 21st century desktop, laptop, and tablet computers as well as smart phones. Oscilloscope applications on these devices often pay homage to their analog ancestors by using a green color scheme.

The simplest sound to analyze mathematically is the pure tone — one where the pressure variation is described by a single frequency. A pure tone would look like a sine curve when graphed oscilloscope style.

y = A sin 2πft

where…

y =	the instantaneous value of the microphone voltage (V), which is directly proportional to the variation in air pressure (∆P) due to sound waves impacting the microphone
A =	the amplitude, or maximum value of the waveform
2π =	a constant needed to get the units to work out
f =	the frequency of the pure tone
t =	time, of course
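
The sampling described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular oscilloscope application. The function name `pure_tone` and the values chosen for the sample rate and duration are my own assumptions.

```python
import math

def pure_tone(amplitude, frequency, sample_rate, duration):
    """Return samples of y = A sin(2*pi*f*t) taken at regular time steps."""
    n_samples = int(round(sample_rate * duration))
    return [amplitude * math.sin(2 * math.pi * frequency * n / sample_rate)
            for n in range(n_samples)]

# One period of a 100 Hz tone sampled 8000 times per second (assumed values)
samples = pure_tone(amplitude=1.0, frequency=100.0, sample_rate=8000, duration=0.01)
print(len(samples))   # number of samples in one period
print(max(samples))   # peak value, close to the amplitude A
```

Plotting the list of samples against time would reproduce the sine curve of an oscilloscope trace.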

Oscilloscope traces of pure tones
40 Hz pure tone	100 Hz pure tone	315 Hz pure tone

harmonics

Music in its simplest form is monotonic; that is, composed only of pure tones. Monotonic music is dull and lifeless like a 1990s ringtone (worse than that even); like a 1970s digital watch alarm (now we're talking); like an oscillating circuit attached to a speaker built by a college student in an introductory physics class (so primitive). Real music, however, is polytonic — a mixture of pure tones played together in a manner that sounds harmonious. A sound composed of multiple frequencies like that produced by a musical instrument or the human voice would still be periodic, but would be more complex than just a simple sine curve.

Oscilloscope traces for various instruments
Bass voice singing	Tenor voice singing	Soprano voice singing
Pipe organ	Trumpet	Violin
Conga drum	High hat cymbal	Woodblock

The human voice and musical instruments produce sounds by vibration. What vibrates determines the type of instrument.

Classification of musical instruments
category	vibrating part	examples
idiophone	whole instrument	bell, cymbal, musical saw, wood block, xylophone
membranophone	stretched membrane	drums, kazoo, human voice
chordophone	stretched string	strings (violin, guitar, harp), piano
aerophone	air column	woodwinds (saxophone, flute), brass (trumpet, tuba), organ
electrophone	electric circuit	synthesizer, theremin

Like many other mechanical systems, musical instruments vibrate naturally at several related frequencies called harmonics. The lowest frequency of vibration, which is also usually the loudest, is called the fundamental. The higher frequency harmonics are called overtones. The human auditory system perceives the fundamental frequency of a musical note as the characteristic pitch of that note. The amplitudes of the overtones relative to the fundamental give the note its quality or timbre — pronounced in English as tæmbər or in quasi-French by English speakers as tɛ̃br with a nasal ɛ̃ for the medial e and silence for the final e. Timbre is one of the features of sound that enables us to distinguish a flute from a violin and a tuba from a timpani.

synthesis and analysis

Recall that when waves meet they don't collide like material objects, they pass through each other like spectres — they interfere. Interfering waves combine by the principle of linear superposition — basically, just add the values of one function to the values of another function everywhere in mathematical space. With the right combination of sine and/or cosine functions, you can make functions with all kinds of shapes (as long as they're functions). Several examples were illustrated earlier in this book. Here they are again…

square wave

$$ y = \sum_{n=1}^{\infty} \frac{1}{2n-1} \sin\big((2n-1)x\big) $$

sawtooth wave

$$ y = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin nx $$

triangle wave

$$ y = \sum_{n=1}^{\infty} \frac{1}{(2n-1)^2} \cos\big((2n-1)x\big) $$

The act of combining pure tones together to produce a complex waveform is called additive synthesis. Non-electronic instruments do this naturally. Electronic instruments (if they want to be taken seriously) are designed with additive synthesis in mind. The analog synthesizer is a good example of this.
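
Additive synthesis is easy to try numerically. The sketch below adds up the first few terms of the square wave series and evaluates the sum at x = π/2, where the series is known to converge to π/4. The function name `square_wave_partial_sum` is my own choice, and the number of terms used is arbitrary.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Sum the first n_terms of sin((2n - 1)x) / (2n - 1)."""
    return sum(math.sin((2 * n - 1) * x) / (2 * n - 1)
               for n in range(1, n_terms + 1))

x = math.pi / 2
for n_terms in (1, 3, 10, 1000):
    # The partial sums oscillate around pi/4 and slowly settle toward it
    print(n_terms, square_wave_partial_sum(x, n_terms))
print("pi/4 =", math.pi / 4)
```

Each added harmonic nudges the waveform closer to the flat top of a square wave, which is the same thing an additive synthesizer does with oscillators instead of arithmetic.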

Generate a sinusoidal electric signal with an oscillator — a fairly simple electric circuit composed of a capacitor (C) and an inductor (L), also known as an LC circuit. Using a second oscillator generate a second signal at a different frequency. The second harmonic would be a wise choice. Add a third, fourth, fifth, sixth and seventh oscillator for the corresponding higher harmonics. Synchronize them electronically so that when the pitch of the fundamental oscillator changes, the frequencies of the overtone oscillators follow along. Adjust the relative amplitude of each oscillator to produce a sound with the desired timbre. Attach the circuit to an amplifier and a loudspeaker so it can be heard. Attach the fundamental oscillator to a control device that turns it on and off and changes its frequency as needed. Make your control device look like a piano keyboard so musicians will use your device as a musical instrument. Call it a synthesizer since it uses the principles of additive synthesis. Let the Seventies begin.

It is mathematically possible to transform any waveform from a continuous sequence of values on a time-series graph into an infinite series of discrete sine and cosine functions. The process is called spectral analysis or Fourier analysis after its inventor, the French mathematician and physicist, Joseph Fourier (1768–1830). Taking a sound apart through spectral analysis is the inverse process of putting a sound together through additive synthesis.

Start with an arbitrary periodic function and assume that it was made from the linear superposition of an infinite number of sine and cosine functions in a harmonic series. What coefficients should each harmonic be multiplied by to give the desired function? The process for answering this question is called a Fourier transform and it is best left to a mathematician to explain.

Any periodic function f(t) with a period of T can be written like this…

$$ f(t) = a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{2\pi n t}{T}\right) $$

where…

$$ a_0 = \frac{1}{T} \int_0^T f(t)\,dt $$

$$ a_n = \frac{2}{T} \int_0^T f(t) \cos\left(\frac{2\pi n t}{T}\right) dt $$

$$ b_n = \frac{2}{T} \int_0^T f(t) \sin\left(\frac{2\pi n t}{T}\right) dt $$

and n = 1, 2, 3, … and that's all I want to say about that.
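
For readers who would rather see numbers than integrals, here is a rough numerical sketch of the three coefficient formulas. It replaces each integral with a midpoint sum over one period, which is my simplification and not how production software does it. The test signal (a square wave that switches between +1 and −1) and all of the names are assumptions made for illustration.

```python
import math

def fourier_coefficients(f, period, n_max, steps=10000):
    """Approximate a0, a_n, b_n with a midpoint sum over one period."""
    dt = period / steps
    times = [(i + 0.5) * dt for i in range(steps)]
    a0 = sum(f(t) for t in times) * dt / period
    a, b = [], []
    for n in range(1, n_max + 1):
        w = 2 * math.pi * n / period
        a.append(2 / period * sum(f(t) * math.cos(w * t) for t in times) * dt)
        b.append(2 / period * sum(f(t) * math.sin(w * t) for t in times) * dt)
    return a0, a, b

def square(t, period=1.0):
    """Hypothetical test signal: +1 for the first half period, -1 for the second."""
    return 1.0 if (t % period) < period / 2 else -1.0

a0, a, b = fourier_coefficients(square, period=1.0, n_max=5)
for n, bn in enumerate(b, start=1):
    # Odd harmonics should come out near 4/(pi n); even ones near zero
    print(n, round(bn, 4))
```

Only the odd sine terms survive, with strengths that fall off as 1/n. That is the same pattern as the square wave series given earlier in this section.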

Fortunately for us, relatively simple algorithms for solving this problem were discovered soon after Fourier invented the technique, and computer scientists have been coding and sharing fast Fourier transforms since the 1960s. Audio applications with FFT are easily found online for many computing platforms — including smart phones, which means that sound spectral analysis is now literally within everyone's grasp.

Relative strength vs. frequency oscilloscope trace showing seven evenly spaced peaks

The graph above shows the spectral analysis of a female voice. Sounds of all sorts of different frequencies are being produced — some more intensely than others. It appears that this particular voice was strongest at 270 Hz (C♯₄ for those of you who know music) and multiples of 270 Hz (540 Hz, 810 Hz, 1080 Hz, 1350 Hz, 1620 Hz, and 1890 Hz). This graph has been normalized, which exaggerated the peaks at the higher frequencies so they could be seen. I like this graph because the spacing between peaks is just so perfect. This kind of structure is what makes a sound musical. Let's look at some more examples.
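
A spectrum with evenly spaced peaks can be imitated with a synthetic signal. The sketch below builds a tone from a 270 Hz fundamental and two overtones, then runs a plain discrete Fourier transform on it. A real application would use a fast Fourier transform, which gives the same result with far less work. The sample rate, the number of samples, and the harmonic amplitudes are all made-up values chosen so the peaks land exactly on frequency bins.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns |X[k]| for k = 0 .. N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(abs(total))
    return mags

sample_rate = 8100   # Hz, assumed so that 270 Hz falls exactly on a bin
n_samples = 300      # bin spacing = 8100 / 300 = 27 Hz
strengths = {270: 1.0, 540: 0.5, 810: 0.25}   # made-up harmonic amplitudes

signal = [sum(amp * math.sin(2 * math.pi * freq * i / sample_rate)
              for freq, amp in strengths.items())
          for i in range(n_samples)]

mags = dft_magnitudes(signal)
bin_width = sample_rate / n_samples
# Pick the three strongest bins and convert bin numbers to frequencies
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:3]
peak_frequencies = [k * bin_width for k in sorted(strongest)]
print(peak_frequencies)
```

The strongest bins sit at the fundamental and its whole number multiples, which is the signature of a musical sound.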

A flute is essentially a tube that is open at both ends. Air is blown across one end and sound comes out the other. A spectral analysis confirms this. The harmonics are all whole number multiples of the fundamental frequency (436 Hz, a slightly flat A₄ — a bit lower in frequency than is normally acceptable). Note how the second harmonic is nearly as intense as the fundamental. This strong second harmonic is part of what makes a flute sound like a flute.

A recorder is also a tube with two open ends. It produces a sound similar to a flute, but not exactly the same. Again the harmonics are whole number multiples of the fundamental frequency (923 Hz, a very sharp A₅ — much higher in frequency than is normally acceptable), but for some reason the second harmonic is nearly nonexistent. This nearly missing second harmonic is part of what makes a recorder sound like a recorder and not sound like a flute.

A tuning fork is forked; that is to say, it splits from its handle into two branches called tines. Each tine is fixed to the handle at one end, but is free to vibrate at the other. As a result, one would expect to find only those harmonics that were odd multiples of the fundamental in the spectrum of a tuning fork. This is what the spectral analysis shows. The even harmonics are present, but they are weak and are probably due to the sympathetic vibrations of something nearby. This spectrum was produced by striking a large, demonstration-size tuning fork (not the one pictured above) with an excessively heavy blow. Tuning forks should always be tapped lightly and on a resilient surface. Doing so reduces the intensity of the "ping" overtones, which is a desirable thing. An ideal tuning fork would vibrate at just one frequency. The tuning fork used in this experiment was rated at 256 Hz, but the spectral analysis software picked up 259 Hz for the fundamental. Is the tuning fork out of tune or is the software in error?

noise

the basics

Music is sound with a discrete structure. Noise is sound with a continuous structure. Music is composed of sounds with a fundamental frequency and overtones. Noise is composed of sounds with frequencies that range continuously in value from as low as you can hear to as high as you can hear — not necessarily at equal intensity, however. Music is described mathematically by an infinite sum of sines and cosines multiplied by appropriately valued coefficients — infinite mathematically, but in practice only a handful of overtones really matter. Noise is described by a spectral power distribution (or power spectrum), much like the statistical distributions of kinetic molecular theory. Music is ordered. Noise is random.

Noise is what you hear when you tune an analog radio or television to an empty frequency. It's the overall sound of rain falling on leaves, soda bubbling in a glass, air escaping from a tire, or a crowd applauding.

"Noisy" describes some voiceless consonants used in English better than "musical".

English consonants with noisy structure
sound	ipa*	example	name
/s/	[s]	sin	voiceless alveolar sibilant
/sh/	[ʃ]	shin	voiceless palato-alveolar sibilant
/f/	[f]	fin	voiceless labiodental fricative
/th/	[θ]	thin	voiceless dental fricative
/h/	[h]	hello	voiceless glottal fricative
/hw/	[ʍ]	wheat	voiceless labiovelar approximant^†

Noise might not be periodic, but it can still be analyzed with a fast Fourier transform — as long as we only examine a finite amount of it. Since infinite quantities don't exist in the real world, this isn't a problem. Mathematics allows for more possibilities than physicists will ever encounter.

Spectral analysis of noise. Frequency on the horizontal axis. Intensity on the vertical axis.
White noise	Pink noise	Crowd cheer

white noise

Noise might not be ordered, but that doesn't mean it can't be described. Frequency is to sound as color is to light. We hear different frequencies of sound as different pitches (A, B, C, D, E, F, G; for example). We see different frequencies of light as different colors (red, orange, yellow, green, blue, and violet; for example). The analogy is not perfect. (What analogy is?) The notes of the musical scale repeat themselves for every doubling of frequency (a topic I'll come back to). The frequencies of visible light span a range that is so narrow, they never get a chance to double. The frequencies of the notes of a musical scale are related by simple numerical fractions. The frequency bands associated with specific color names show no mathematical relationships. (This analogy isn't looking very promising.)

In much the same way that most sounds we hear are polytonic (composed of many frequencies), most light we see is polychromatic (composed of many frequencies). Sources of light that are hot enough (like the Sun) will emit a mixture of frequencies that span the entire visible spectrum. The color of this light is white — a mixture of red, orange, yellow, green, blue, violet, and everything in between. White noise is the audio equivalent of white light. It's a combination of all the frequencies that span the entire audible spectrum. White noise is a mixture of all the musical pitches with names and all the pitches in between that aren't named. This is the informal definition of white noise.

Formally, white noise is a sound with a flat power spectrum; that is, it transmits power equally at all frequencies. It is a mathematical ideal with a representation something like this…

p(f) = constant

So that…

$$ P = \int_{f_{\min}}^{f_{\max}} p(f)\,df = p\,\Delta f $$

where…

p(f) =	the value of the power spectral density measured in W/Hz (used for theoretical discussions) or the relative power spectral density in V²/Hz (what normally gets used in practice)
f =	any frequency of sound in the range of human hearing
P =	the total power output of the source (theory) or some quantity proportional to that (practice)

Note…

Spectral analysis applications process the voltage of the signal coming from a microphone, not the power of the sound detected by the microphone. Electric power is proportional to the square of voltage at constant resistance, so V²/Hz are measured instead of W/Hz. Both quantities have the same mathematical structure, which is the important thing in audio analysis.
Frequencies are limited to a definite range. This keeps the integral from running away to infinity. A rectangle of any height that is infinitely wide would contain infinite area. Since we can't let this happen, we have to assign limits to the spectrum. This is something that makes the white noise distribution unrealistic. Real statistical distributions tend to be curves that taper off nicely to zero. They don't start and end abruptly.
White noise is a simple mathematical model that is so simple, it's unrealistic. It is a good way to introduce the topic of power spectral density, however. It is also a convenient reference sound for testing audio equipment.

pink noise

The human auditory system is logarithmic when it comes to perceiving sounds. This is true when it comes to amplitude (which is related to intensity or loudness and was discussed earlier in this book) and frequency (which will be discussed now). Listen to a sequence of two pure tones with a difference of 100 Hz between them, say 100 Hz and 200 Hz. Then listen to another sequence with the same linear difference but starting on a much higher frequency, say 800 Hz and 900 Hz. The perceived increase in frequency of the second pair seems smaller than the first pair even though the absolute size of the increase is the same.

100 Hz, 200 Hz

800 Hz, 900 Hz

A doubling from 100 Hz to 200 Hz sounds similar to a doubling from 800 Hz to 1600 Hz.

100 Hz, 200 Hz

800 Hz, 1600 Hz

Likewise a one eighth increase from 100 Hz to 112.5 Hz sounds the same as a one eighth increase from 800 Hz to 900 Hz.

100 Hz, 112.5 Hz

800 Hz, 900 Hz

We tend to group sounds perceptually into bands that increase by successive powers of 2. A mathematician would call this a logarithmic increase. Musicians call the bands octaves. Audio technicians developed pink noise as a response to this reality.
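
The listening examples above can be checked with a logarithm. The sketch below measures the size of an interval in octaves by taking the base 2 logarithm of the frequency ratio. The function name `interval_in_octaves` is my own.

```python
import math

def interval_in_octaves(f_low, f_high):
    """Size of the interval between two frequencies, measured in octaves."""
    return math.log2(f_high / f_low)

for pair in [(100, 200), (800, 900), (800, 1600), (100, 112.5)]:
    print(pair, round(interval_in_octaves(*pair), 3))
```

Pairs with the same frequency ratio give the same number of octaves no matter where they start, while pairs with the same frequency difference do not.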

Formally, pink noise is a sound with a power spectral density that is inversely proportional to frequency over the range of audible frequencies. Because of this it is also known as 1/f noise (one-over-f noise). We can write this as a proportionality statement…

$$ p(f) \propto \frac{1}{f} $$

or as an equation with a constant of proportionality (k)…

$$ p(f) = \frac{k}{f} $$

Following the analogy that sound is to frequency as light is to color, pink noise is pink because it's a little bit white and a little bit red — white since all frequencies are present, red since the lower frequencies are the louder ones. (Low frequency light appears red.)

Like white noise, a pink noise source cannot span an infinite range of frequencies. If it did, it would be radiating infinite power.

$$
\begin{aligned}
P &= \int_0^{\infty} \frac{k}{f}\,df = k\,\Big[\log f\Big]_0^{\infty} \\
  &= k\,(\log \infty - \log 0) \\
  &= k\,\big(\infty - (-\infty)\big) = \infty
\end{aligned}
$$

A pink noise signal transmits energy equally over all octaves (logarithmic intervals). Let's test this essential characteristic. Compute the power radiated in a frequency band spanning the octave f = (a, 2a).

$$
\begin{aligned}
P &= \int_a^{2a} \frac{k}{f}\,df = k\,\Big[\log f\Big]_a^{2a} \\
  &= k\,(\log 2a - \log a) \\
  &= k \log \frac{2a}{a} \\
  &= k \log 2
\end{aligned}
$$

Compare that to the power radiated in the next octave higher f = (2a, 4a).

$$
\begin{aligned}
P &= \int_{2a}^{4a} \frac{k}{f}\,df = k\,\Big[\log f\Big]_{2a}^{4a} \\
  &= k\,(\log 4a - \log 2a) \\
  &= k \log \frac{4a}{2a} \\
  &= k \log 2
\end{aligned}
$$
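
The same result can be checked numerically for any starting frequency. The sketch below integrates p(f) = k/f over several octaves with a simple midpoint sum and compares the answers to k log 2. I am assuming a natural logarithm, k = 1, and arbitrary starting frequencies. The function name `band_power` is my own.

```python
import math

def band_power(f_low, f_high, k=1.0, steps=10000):
    """Midpoint-rule integral of p(f) = k / f from f_low to f_high."""
    df = (f_high - f_low) / steps
    return sum(k / (f_low + (i + 0.5) * df) for i in range(steps)) * df

for a in (20, 440, 5000):
    # Power in the octave (a, 2a) and in the next octave up (2a, 4a)
    print(a, band_power(a, 2 * a), band_power(2 * a, 4 * a))
print("k log 2 =", math.log(2))
```

Every octave carries the same power even though the higher octaves are much wider in hertz. The falling power spectral density exactly compensates for the growing bandwidth.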

audio samples

Listen to the audio samples below to hear what white and pink noise sound like. Pink noise sounds richer than white noise — like a waterfall or blowing wind. White noise sounds brighter, but also more harsh than pink noise — like steam escaping a radiator or, as a famous biologist once said, like "innumerable mice eating Rice Krispies". A cheering crowd has its own unique power spectrum. It's neither white nor pink.

white noise

pink noise

crowd cheer

consonance & dissonance

FIX THIS PART

Sounds that lend themselves to music are those that sound pleasant (or at least tolerable) when played in sequence (called melody) or in unison (called harmony).

high degrees of consonance (shared harmonics)

dissonance

Consonances are sometimes described as being inherently more pleasant to the ear and dissonances as less pleasant.

Musical notes that are consonant share a large number of overtones. In the just intonation scale the octave is the most consonant interval followed by the perfect fifth, perfect fourth, major third and sixth, and major seventh. No other set of intervals shares such a high degree of consonance.
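
One rough way to put a number on "shared overtones" is to list the harmonics of two notes and count the ones that land on the same frequency. The sketch below does this for ideal harmonic series only, which is my simplifying assumption. The function name `shared_fraction` and the cutoff of 60 harmonics are arbitrary choices.

```python
from fractions import Fraction

def shared_fraction(interval, n_harmonics=60):
    """Fraction of the lower note's first n harmonics that coincide with
    a harmonic of the upper note (in units of the lower fundamental)."""
    ratio = Fraction(interval)
    lower = {Fraction(m) for m in range(1, n_harmonics + 1)}
    upper = {n * ratio for n in range(1, n_harmonics + 1)}
    return Fraction(len(lower & upper), n_harmonics)

intervals = {
    "octave": Fraction(2, 1),
    "perfect fifth": Fraction(3, 2),
    "perfect fourth": Fraction(4, 3),
    "major third": Fraction(5, 4),
    "major sixth": Fraction(5, 3),
    "major seventh": Fraction(15, 8),
}
for name, ratio in intervals.items():
    print(name, shared_fraction(ratio))
```

With this measure the intervals fall in the same order as the list above, with the major third and major sixth tied.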

Notes separated by an octave sound similar — like two people with different voices trying to sing the same note.

[paraphrase] The octave is the most perfect consonance, so perfect that it gives the impression of duplicating the original tone, a phenomenon for which no convincing explanation has ever been found. [paraphrase]

overtones line up

Helmholtz used the German word oberton which literally means "upper tone". Someone transliterated the word instead of translating it and oberton became overtone in English by accident. If Helmholtz had meant overtone he would have said überton.

Stuff

pure tones
complex tone
partials
- fundamental
- overtones
  - harmonic
  - inharmonic
  - evil 7th overtone
  - who cares after the 8th overtone

This is where the physics ends and the music theory and abstract algebra (group theory and combinatorics) start to take over.

musical scales

intervals

The foundation of music is the musical note: a combination of pitch (the musical word for frequency) and duration (the ordinary word for amount of time).

Did I say music was based on notes? That's not true. Real music is based on intervals (the ratio of any two pitches) with high degrees of consonance (shared harmonics).

A scale is a set of pitches (pitch classes, more precisely) arranged in order of increasing frequency from which notes are selected and arranged to create a musical composition. The exact pitches used in any scale are determined by the starting pitch (called the tonic), a set of rules for generating intervals (called a tuning system), and the pattern of intervals selected (sometimes called the mode).

Western music theorists identified eight basic intervals defined by their relative size…

tonic or unison — an interval of one to one, perfect consonance
second
third
fourth
fifth — the third most consonant interval
sixth
seventh
octave — an interval of two to one, half of all overtones match

The intervals of adjacent pitches in a scale often come in two sizes called whole tones and semitones (or half tones). The relationship between these should be obvious — two semitone intervals applied in succession equal a whole tone — but it turns out to be only approximately true in most tuning systems. Combinations of tones and semitones may also be named by size…

semitone — typically the smallest interval
whole tone — two semitones
ditone — two whole tones (a term that's rarely used)
tritone — three whole tones, half an octave
6 whole tones or 12 semitones equal one octave.

Each interval also needs an adjective to describe its quality…

perfect — an interval that is an inversion of another basic interval
major — the larger of two nearly equal intervals
minor — the smaller of two nearly equal intervals
augmented — a semitone higher than a major or perfect interval
diminished — a semitone lower than a minor or perfect interval

Many, many, many interval patterns are used in musical composition. (That's too many manys.) A convenient way to organize them is by the number of intervals they contain. Here's a list of the interval patterns that will be discussed in this book in numerical order (but not in the order in which they will be discussed).

pentatonic
- blues
- egyptian
hexatonic
- whole tone
heptatonic
- diatonic
octatonic
- string of pearls
dodecatonic
- chromatic

pitches

Since notes separated by an octave sound similar, a musical scale can be completely described by a set of intervals with ratios ranging from 1:1 to 2:1. If you need notes above the octave, just double the intervals. If you need notes higher than that, double the intervals again. If you need notes lower than the tonic, take all your intervals and halve them. If you need notes lower than that, halve all the intervals again. You get the idea. This is called octave duplication or circularity of pitch.

Pitches separated by an octave form a pitch class. They are named using an uppercase letter (A, B, C, D, E, F, G) and sometimes include a modifier symbol called an accidental (♯, ♭, ♮).

Accidentals
symbol	name	description
♯	sharp	raises the pitch by one semitone
♭	flat	lowers the pitch by one semitone
♮	natural	cancels any previously applied accidentals

Adding a sharp to a pitch is the same as adding a flat to the next higher pitch (usually). F♯ is the same as G♭, for example. This is known as enharmonic equivalence. Similarly, adding a double sharp or a double flat to a lettered pitch changes the pitch by one letter (usually). C♯♯ is the same as D, and D♭♭ is the same as C. I keep saying "usually" because there are exceptions around the notes B♯, C♭, E♯, F♭. A piano keyboard is a familiar way to display all the pitch classes in a scale.

Enharmonic equivalence is not a given. There are some tuning systems where the distinction between F♯ and G♭ is real. That's part of the reason we have two names for the same pitch today. Thankfully, most of us will never need to know what that distinction is. I have read explanations, but I have never understood them. I think most formally trained Western musicians today would probably say the same thing. Fans of Renaissance and Baroque music would be the exception.

Want to know whether you should say A♯ or B♭? Just look at the pitch classes of the scale used to create your musical composition. If you say any letter twice, you've probably done it wrong. For example…

F, G, A, B♭, C, D, E

is correct, but…

F, G, A, A♯, C, D, E

is wrong because the letter A appeared twice. This is not a perfect rule, but it works for all the diatonic scales — the most used scales in Western music. (Non-Western music theory is beyond the scope of this book. The same goes for microtonal music — music that uses intervals smaller than a semitone.)

A sampling of different scales
name	pitch classes	type
A minor diatonic	A, B, C, D, E, F, G	heptatonic
B♭ major diatonic	B♭, C, D, E♭, F, G, A	heptatonic
C major diatonic	C, D, E, F, G, A, B	heptatonic
C♯ string of pearls	C♯, D♯, E, F♯, G, A, B♭, C	octatonic
D♭ whole tone	D♭, E♭, F, G, A, B	hexatonic
E minor blues	E, G, A, C, D	pentatonic
F chromatic	F, F♯, G, G♯, A, A♯, B, C, C♯, D, D♯, E	dodecatonic

just intonation

definition

An interval is the ratio of any two pitches in a scale. Consonance occurs when the overtones of one pitch coincide with the overtones of another. The intervals with the highest degree of consonance are ratios of small whole numbers (like 2, 3, 5) or small powers of these (like 4, 8, 9, 16, 25, 27) or small products of these (like 6, 10, 12, 15, 18, 20). The more prime factors present in a ratio, the less consonant it sounds. A perfect fifth with two prime factors (³₂) is more consonant than a major seventh with five (¹⁵₈ = (3·5)/(2·2·2)) which is much more consonant than a major chromatic semitone with eleven (¹³⁵₁₂₈ = (3·3·3·5)/(2·2·2·2·2·2·2)). A tuning system built on these principles is described as having just intonation. The word "just" here implies that the pitches generated are the "right" ones.
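
Counting prime factors by hand gets tedious, so here is a small sketch that does it for any interval written as a fraction. It uses plain trial division, and the function names `prime_factors` and `factor_count` are my own.

```python
from fractions import Fraction

def prime_factors(n):
    """Return the prime factors of a positive integer, with repeats."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def factor_count(interval):
    """Total number of prime factors in the numerator and denominator."""
    ratio = Fraction(interval)
    return len(prime_factors(ratio.numerator)) + len(prime_factors(ratio.denominator))

for ratio in (Fraction(3, 2), Fraction(15, 8), Fraction(135, 128)):
    print(ratio, factor_count(ratio))
```

The counts for the perfect fifth, major seventh, and major chromatic semitone come out as two, five, and eleven, matching the values quoted in the text.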

Just intonation is a simple way to build a musical scale. It's the natural tuning system people use when they sing together unaccompanied by musical instruments (a cappella). There are many types of scales with just intonation. I will discuss a few of the basic ones. I am showing them on a piano keyboard since it is a familiar instrument, but pianos are not normally tuned this way. (Pianos use equal temperament, which will be described next.)

diatonic scales

The scale shown below is a type of just intonation scale starting on C. Since the third interval (⁵₄ = 1.25) is "large", the scale is major. Music performed in a major key tends to sound bright, happy, or triumphant. This is true whether the scale was constructed using just intonation or some other scheme. This is probably a function of culture more than anything else. (Happiness isn't something physicists go around measuring — although we do discuss brightness from time to time and in a different context.)

Scales like this one are given the name diatonic — from the Greek phrase δια τονικη (dia tonike), which means "across the tones". The implication is that all the pitches you need are contained in this scale. That's not quite right, however. (More on this later.)

This particular scale is also heptatonic, which means it contains seven intervals: three major whole tones (⁹₈), two minor whole tones (¹⁰₉), and two semitones (¹⁶₁₅). The ratios of adjacent pitches in a major scale follow this order…

major whole tone
minor whole tone
semitone
major whole tone
minor whole tone
major whole tone
semitone
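
Multiplying these adjacent ratios together one after another gives the ratio of each pitch to the tonic. The sketch below does this with exact fractions. The constant and function names are my own, and the step sizes are the ones given in the text.

```python
from fractions import Fraction

MAJOR_TONE = Fraction(9, 8)
MINOR_TONE = Fraction(10, 9)
SEMITONE = Fraction(16, 15)

# Order of adjacent intervals in the just intonation major scale
MAJOR_STEPS = [MAJOR_TONE, MINOR_TONE, SEMITONE, MAJOR_TONE,
               MINOR_TONE, MAJOR_TONE, SEMITONE]

def scale_from_steps(steps):
    """Running product of the steps, starting from the tonic (1/1)."""
    ratios = [Fraction(1)]
    for step in steps:
        ratios.append(ratios[-1] * step)
    return ratios

major_scale = scale_from_steps(MAJOR_STEPS)
print([str(r) for r in major_scale])
```

The seven steps multiply out to exactly 2, so the scale closes on the octave.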

diatonic modes

There's no reason I have to start playing a scale on C like the diagram above shows. Why not start playing on A as the diagram below shows? This gets you through an octave with intervals that are still fairly nice. Since the third interval (⁶₅ = 1.2) of this scale is "small", the scale is said to be minor. Music performed in a minor key tends to sound dark, sad, moody, or introspective. Why this happens is outside the realm of physics. It also contains a particularly harsh interval that is close to a fourth (²⁷₂₀ ≈ ⁴₃). Since the fourth is a perfect interval, that must make this an imperfect fourth. It's an interesting example of how being close is not the same thing as being close enough. A perfect fourth sounds pleasant. An imperfect fourth sounds harsh.

This scale is also heptatonic and diatonic. The ratios of adjacent pitches in a minor scale follow this order…

major whole tone
semitone
major whole tone
minor whole tone
semitone
major whole tone
minor whole tone

In general, a diatonic scale is any scale that can be played on the white keys of a piano. Since there are seven intervals in a diatonic scale, there are seven pitches on a piano to start out on (A, B, C, D, E, F, and G). Starting on C gives you the major diatonic scale. Starting on A gives you a minor diatonic scale. The remaining five scales are used to lesser degrees in Western music past and present — medieval Christian chant being the example most frequently cited. All seven diatonic scales (or modes as they are called) have names corresponding to regions in ancient Greece. The assignment of Greek place name to diatonic mode is completely arbitrary, however. If there ever was any meaning to this naming convention, it's been lost to time.

Diatonic modes and their intervals with respect to the previous note (just intonation)
	ionian*	dorian	phrygian	lydian	mixolydian	aeolian^†	locrian
pitch	C	D	E	F	G	A	B
C	¹₁
D	⁹₈	¹₁
E	¹⁰₉	¹⁰₉	¹₁
F	¹⁶₁₅	¹⁶₁₅	¹⁶₁₅	¹₁
G	⁹₈	⁹₈	⁹₈	⁹₈	¹₁
A	¹⁰₉	¹⁰₉	¹⁰₉	¹⁰₉	¹⁰₉	¹₁
B	⁹₈	⁹₈	⁹₈	⁹₈	⁹₈	⁹₈	¹₁
C	¹⁶₁₅	¹⁶₁₅	¹⁶₁₅	¹⁶₁₅	¹⁶₁₅	¹⁶₁₅	¹⁶₁₅
D		⁹₈	⁹₈	⁹₈	⁹₈	⁹₈	⁹₈
E			¹⁰₉	¹⁰₉	¹⁰₉	¹⁰₉	¹⁰₉
F				¹⁶₁₅	¹⁶₁₅	¹⁶₁₅	¹⁶₁₅
G					⁹₈	⁹₈	⁹₈
A						¹⁰₉	¹⁰₉
B							⁹₈

Diatonic modes and their intervals with respect to the tonic (just intonation)
	ionian*	dorian	phrygian	lydian	mixolydian	aeolian^†	locrian
pitch	C	D	E	F	G	A	B
tonic	¹₁	¹₁	¹₁	¹₁	¹₁	¹₁	¹₁
second	⁹₈	¹⁰₉	¹⁶₁₅	⁹₈	¹⁰₉	⁹₈	¹⁶₁₅
third	⁵₄	³²₂₇	⁶₅	⁵₄	⁵₄	⁶₅	⁶₅
fourth	⁴₃	⁴₃	⁴₃	⁴⁵₃₂	⁴₃	²⁷₂₀	⁴₃
fifth	³₂	⁴⁰₂₇	³₂	³₂	³₂	³₂	⁶⁴₄₅
sixth	⁵₃	⁵₃	⁸₅	²⁷₁₆	⁵₃	⁸₅	⁸₅
seventh	¹⁵₈	¹⁶₉	⁹₅	¹⁵₈	¹⁶₉	⁹₅	¹⁶₉
octave	²₁	²₁	²₁	²₁	²₁	²₁	²₁
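
Both tables can be generated from the same seven steps. The sketch below rotates the list of adjacent intervals to start on a different pitch and then multiplies the steps together to get the intervals with respect to the tonic. The names `STEPS`, `MODES`, and `mode_ratios` are my own, and the step sizes are taken from the first table.

```python
from fractions import Fraction

# Adjacent intervals going up the white keys from C to the next C
STEPS = [Fraction(9, 8), Fraction(10, 9), Fraction(16, 15), Fraction(9, 8),
         Fraction(10, 9), Fraction(9, 8), Fraction(16, 15)]
MODES = ["ionian", "dorian", "phrygian", "lydian",
         "mixolydian", "aeolian", "locrian"]

def mode_ratios(start):
    """Intervals with respect to the tonic for the mode starting at STEPS[start]."""
    rotated = STEPS[start:] + STEPS[:start]
    ratios = [Fraction(1)]
    for step in rotated:
        ratios.append(ratios[-1] * step)
    return ratios

table = {name: mode_ratios(i) for i, name in enumerate(MODES)}
for name, ratios in table.items():
    print(name, [str(r) for r in ratios])
```

The awkward intervals in the second table, like the ²⁷₂₀ fourth of the aeolian mode and the ⁴⁰₂₇ fifth of the dorian mode, fall out of the arithmetic without any special handling.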

inversion

Let's return to the C major scale. I find that one more interesting because of what happens when you invert the intervals. Instead of moving from the tonic to the pitch, let's move from the pitch to the octave.

To go from C to G on this scale is a jump up of a fifth (multiply by ³₂). From G to the next C is a jump of a fourth (multiply by ⁴₃). This means the fourth is the inversion of the fifth.

fifth	×	fourth	=	octave
³₂	×	⁴₃	=	²₁

and the fifth is the inversion of the fourth, which sounds like a statement of the obvious.

fourth	×	fifth	=	octave
⁴₃	×	³₂	=	²₁

In general, any pair of intervals that together results in a jump of one octave is called an inversion.

interval × inversion = octave

That means a third and a sixth are an inversion.

third	×	sixth	=	octave
⁵₄	×	⁸₅	=	²₁

and so are a sixth and a third, which should also be obvious except…

sixth	×	third	=	octave
⁵₃	×	⁶₅	=	²₁

Wait a minute. What just happened? The first time I wrote the third, I wrote ⁵₄. The second time I wrote ⁶₅. Last time I checked, those were different numbers. I did something similar with the sixth (⁸₅ ≠ ⁵₃). Welcome to the world of just intonation, where intervals in one direction on a scale do not necessarily equal those in the inverse direction.

The tonic, fourth, fifth and octave are not affected by inversion, which is why they are called perfect intervals. The fourth and fifth are given this as a title — as in, "a perfect fourth" or "a perfect fifth". For some reason no one says "a perfect tonic". Maybe because it sounds like the name of a cocktail. No one says "a perfect octave" either, but I don't have a joke for that one.

The second, third, sixth, and seventh are made smaller by inversion. An interval used to raise a pitch above the tonic is always larger than the same interval used to take the new pitch to the octave. As is tradition, larger intervals are said to be major and smaller ones minor.

Intervals and inversions (just intonation)
interval			inversion
tonic	¹₁	= 1	octave	²₁	= 2
major second	⁹₈	= 1.125	minor seventh	¹⁶₉	= 1.777…
major third	⁵₄	= 1.25	minor sixth	⁸₅	= 1.6
perfect fourth	⁴₃	= 1.333…	perfect fifth	³₂	= 1.5
tritone*	⁴⁵₃₂	= 1.40625	tritone**	⁶⁴₄₅	= 1.422…
perfect fifth	³₂	= 1.5	perfect fourth	⁴₃	= 1.333…
major sixth	⁵₃	= 1.666…	minor third	⁶₅	= 1.2
major seventh	¹⁵₈	= 1.875	minor second^†	¹⁶₁₅	= 1.066…
octave	²₁	= 2	tonic	¹₁	= 1
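
The rule interval × inversion = octave is easy to check with exact fractions. This is a small Python sketch under that rule alone. The function name inversion and the dictionary major are illustrative, not anything standard.

```python
from fractions import Fraction as F

OCTAVE = F(2)

def inversion(interval):
    """Return the interval that completes `interval` to exactly one octave."""
    return OCTAVE / interval

# Major intervals of the just intonation C major scale.
major = {"second": F(9, 8), "third": F(5, 4), "sixth": F(5, 3), "seventh": F(15, 8)}

for name, ratio in major.items():
    print(f"major {name}: {ratio} -> inversion {inversion(ratio)}")
```

Inverting twice returns the original interval, and the perfect fourth and perfect fifth simply trade places.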

chromatic scales

Something interesting happens when you merge the perfect, major, and minor intervals together into a scale. The minor intervals fill in the "spaces" between pitches separated by a whole tone. (The exception to this is the gap between the perfect fourth and the perfect fifth.) The resulting scale has twelve intervals, each one a semitone away from its neighbor. Such a scale is said to be chromatic — from χρωματος (chromatos), the Greek word for color. Since color is to vision as pitch is to hearing, the metaphor is entirely appropriate. The chromatic scale is more "colorful" than the diatonic scale.

The intervals of the chromatic scale are too numerous to fit nicely on my piano keyboard illustration. Instead, here's a table showing both the intervals with respect to the tonic and the intervals with respect to the preceding pitch in the just intonation chromatic scale.

Intervals of the chromatic scale (just intonation)
interval to	… tonic		… predecessor		semitone
tonic	¹₁	= 1
minor second¹	¹⁶₁₅	= 1.066…	¹⁶₁₅	= 1.066…	diatonic
major second²	⁹₈	= 1.125	¹³⁵₁₂₈	= 1.054…	major
minor third	⁶₅	= 1.2	¹⁶₁₅	= 1.066…	diatonic
major third	⁵₄	= 1.25	²⁵₂₄	= 1.041…	minor
perfect fourth	⁴₃	= 1.333…	¹⁶₁₅	= 1.066…	diatonic
tritone³	⁶⁴₄₅	= 1.422…	¹⁶₁₅	= 1.066…	diatonic
perfect fifth	³₂	= 1.5	¹³⁵₁₂₈	= 1.054…	major
minor sixth	⁸₅	= 1.6	¹⁶₁₅	= 1.066…	diatonic
major sixth	⁵₃	= 1.666…	²⁵₂₄	= 1.041…	minor
minor seventh	¹⁶₉	= 1.777…	¹⁶₁₅	= 1.066…	diatonic
major seventh	¹⁵₈	= 1.875	¹³⁵₁₂₈	= 1.054…	major
octave	²₁	= 2	¹⁶₁₅	= 1.066…	diatonic

The chromatic circle is a geometric way to display all the pitch classes. This way is sometimes better for showing the enharmonic relationships. The space between adjacent pitch classes is a semitone.

semitones, whole tones, tritones

And now for a word about semitones. There are several of them in scales built on just intonation.

An interval of ¹⁶₁₅ is called a diatonic semitone. It separates the major third from the perfect fourth and the major seventh from the octave in the major diatonic scale. When we merged the perfect and major intervals of the diatonic scale with their inversions (the minor intervals), the diatonic semitone popped up in four new locations: tonic to minor second, major second to minor third, perfect fifth to minor sixth, and major sixth to minor seventh. It's a popular interval. The chromatic scale has two other semitones: a larger or major chromatic semitone equal to ¹³⁵₁₂₈ and a smaller or minor chromatic semitone equal to ²⁵₂₄.

Semitones and whole tones (just intonation)
size		interval
²⁵₂₄	= 1.041…	minor chromatic semitone
¹³⁵₁₂₈	= 1.054…	major chromatic semitone
¹⁶₁₅	= 1.066…	diatonic semitone
¹⁰₉	= 1.111…	minor whole tone
⁹₈	= 1.125	major whole tone

Combinations of chromatic and diatonic semitones make whole tones — majors make majors and minors make minors. In a sense, the chromatic semitones are inversions of the diatonic semitone over the span of a whole tone.

	chromatic semitone	×	diatonic semitone	=	whole tone
minor	²⁵₂₄	×	¹⁶₁₅	=	¹⁰₉
major	¹³⁵₁₂₈	×	¹⁶₁₅	=	⁹₈

Which brings us to the tritone — the bad boy of the chromatic scale, stuck between the perfection of the fourth and fifth. We know something belongs there (the size of the interval demands it) but what? All the other intervals arose naturally. That should really say "naturally" in scare quotes. How natural is this process? You shall be born by the forced union of the perfect fourth and a semitone… But which one?

	semitone	×	fourth	=	tritone
minor	²⁵₂₄	×	⁴₃	=	²⁵₁₈
major	¹³⁵₁₂₈	×	⁴₃	=	⁴⁵₃₂
diatonic	¹⁶₁₅	×	⁴₃	=	⁶⁴₄₅

Again I ask… Which one? The perfect fifth and the perfect fourth are inversions of one another. The tritone lies between these, which means it would have to be its own inversion. Do any of these satisfy that requirement? If so, that's a good sign that it would be our tritone.

	octave	÷	tritone	=	inversion
minor	²₁	÷	²⁵₁₈	=	³⁶₂₅
major	²₁	÷	⁴⁵₃₂	=	⁶⁴₄₅
diatonic	²₁	÷	⁶⁴₄₅	=	⁴⁵₃₂
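
The two tables above amount to a short search, which can be sketched in Python. This assumes only the three just semitones and the perfect fourth already introduced; the variable names are illustrative.

```python
from fractions import Fraction as F

FOURTH, OCTAVE = F(4, 3), F(2)
semitones = {"minor": F(25, 24), "major": F(135, 128), "diatonic": F(16, 15)}

for name, semitone in semitones.items():
    tritone = semitone * FOURTH      # a fourth raised by one semitone
    inverse = OCTAVE / tritone       # what is left to reach the octave
    print(f"{name:9} tritone {tritone}  inversion {inverse}  "
          f"self-inverse: {tritone == inverse}")
```

None of the candidates comes back as its own inversion, and no fraction could, since a self-inverse ratio would have to square to exactly 2.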

Well that's no good. Two of the tritones are inversions of each other, and the inversion of the third one looks like it just gave us another tritone. We're going about this the wrong way. Instead of testing fractions hoping one will work out, maybe we should use algebra and just go to the answer directly. What number, x, equals itself when divided into 2?

2/x = x	⇒	x² = 2	⇒	x = √2

I didn't expect the square root of two.

Dun, dun, duuuuuunnnnn! Nobody expects the square root of two! Its chief weapon is surprise, fear and surprise; two chief weapons; fear, surprise, and irrationality! Uh, among its chief weapons are: fear, surprise, irrationality, and an almost fanatical inability to be written as a fraction! Uh, I'll come in again...

Tritones of all sorts
size		type
²⁵₁₈	= 1.388…	minor augmented fourth
⁴⁵₃₂	= 1.406…	major augmented fourth²
√2	= 1.414…	ideal tritone¹
⁶⁴₄₅	= 1.422…	minor diminished fifth²
³⁶₂₅	= 1.44	major diminished fifth

The fractions ⁴⁵₃₂ and ⁶⁴₄₅ are equally close to the ideal value of √2. They are both appropriate as just tritones.
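
"Equally close" here means equally close as ratios, and the easiest way to see that is on a logarithmic scale. The sketch below measures each tritone in cents (1200 cents per octave), a unit this page has not introduced, so treat it as my assumption. The function name cents is likewise illustrative.

```python
import math
from fractions import Fraction as F

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))

IDEAL = cents(math.sqrt(2))  # half an octave

for tritone in (F(25, 18), F(45, 32), F(64, 45), F(36, 25)):
    offset = cents(tritone) - IDEAL
    print(f"{tritone}: {offset:+.2f} cents from the ideal tritone")
```

Because ⁴⁵₃₂ × ⁶⁴₄₅ = 2, the two middle candidates sit the same distance below and above √2 on this scale.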

equal temperament

FIX THIS PART

Equal temperament is less consonant than just intonation (chords are full of beats), but it makes transposition much easier.
This slight roughening of chords is the price paid for greater musical versatility.

Intervals of the chromatic scale (equal temperament)
interval to	… tonic		… predecessor		semitone
tonic	2^(0/12)	= 1			equal tempered
minor second¹	2^(1/12)	= 1.059…	2^(1/12)	= 1.059…	equal tempered
major second²	2^(2/12)	= 1.122…	2^(1/12)	= 1.059…	equal tempered
minor third	2^(3/12)	= 1.189…	2^(1/12)	= 1.059…	equal tempered
major third	2^(4/12)	= 1.259…	2^(1/12)	= 1.059…	equal tempered
perfect fourth	2^(5/12)	= 1.334…	2^(1/12)	= 1.059…	equal tempered
tritone³	2^(6/12)	= 1.414…	2^(1/12)	= 1.059…	equal tempered
perfect fifth	2^(7/12)	= 1.498…	2^(1/12)	= 1.059…	equal tempered
minor sixth	2^(8/12)	= 1.587…	2^(1/12)	= 1.059…	equal tempered
major sixth	2^(9/12)	= 1.681…	2^(1/12)	= 1.059…	equal tempered
minor seventh	2^(10/12)	= 1.781…	2^(1/12)	= 1.059…	equal tempered
major seventh	2^(11/12)	= 1.887…	2^(1/12)	= 1.059…	equal tempered
octave	2^(12/12)	= 2	2^(1/12)	= 1.059…	equal tempered
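
To see how far each equal tempered interval drifts from its just counterpart, here is a rough Python sketch. It assumes the just chromatic ratios given earlier on this page and reports the difference in cents (1200 per octave), a unit I am bringing in for convenience. The names JUST and equal_tempered are illustrative.

```python
import math
from fractions import Fraction as F

# Just intonation chromatic scale, intervals with respect to the tonic.
JUST = [F(1), F(16, 15), F(9, 8), F(6, 5), F(5, 4), F(4, 3), F(64, 45),
        F(3, 2), F(8, 5), F(5, 3), F(16, 9), F(15, 8), F(2)]

def equal_tempered(n):
    """Ratio of the interval n semitones above the tonic in 12-tone equal temperament."""
    return 2 ** (n / 12)

for n, just in enumerate(JUST):
    et = equal_tempered(n)
    error = 1200 * math.log2(et / float(just))  # difference in cents
    print(f"{n:2d}  equal {et:.3f}  just {float(just):.3f}  {error:+6.2f} cents")
```

Only the tonic and the octave agree exactly. Every other equal tempered interval is a compromise, although the fourth and fifth come very close.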

standard pitch

FIX THIS PART

A pitch standard is created when a musical note (typically A₄) is assigned to a frequency in hertz (Hz) by some sort of organization. Pitch standards are human creations and are not tied to any natural phenomena (like standard gravity) or physical constant (like the speed of light in a vacuum). A pitch standard is an agreement among musicians at a given time and place that a particular pitch will be used as a reference for tuning. This is separate from the choice of tuning system within an octave (temperament), which was discussed earlier on this page.

musical standards

The story of the pitch wars.

Standards emerge.

Some pitch standards for the note A₄ (a.k.a. la)
year	f (Hz)	organization
1859	435	French ministry of state "Article 2. This pitch, giving the la [A₄] adopted for tuning instruments, is fixed at eight hundred seventy vibrations per second [single vibrations, that is, 435 complete cycles]; it will take the title of diapason normal."
1891	435	International Pitch (Vienna)
1891	435	Piano Manufacturers' Association of New York and Vicinity
1975	440	International Organization for Standardization (ISO 16:1975)

scientific standards

Hermann Helmholtz liked mathematically special numbers.

Frequencies (Hz) of musical pitches, equal temperament, ISO 16:1975, A₄ = 440 Hz
pitch class			octave
pitch class			0	1	2	3	4	5	6	7	8
00	C	(do)	16.35	32.70	65.41	130.81	261.63	523.25	1046.50	2093.00	4186.01
01	C♯/D♭		17.32	34.65	69.30	138.59	277.18	554.37	1108.73	2217.46	4434.92
02	D	(re)	18.35	36.71	73.42	146.83	293.66	587.33	1174.66	2349.32	4698.64
03	D♯/E♭		19.45	38.89	77.78	155.56	311.13	622.25	1244.51	2489.02	4978.03
04	E	(mi)	20.60	41.20	82.41	164.81	329.63	659.26	1318.51	2637.02	5274.04
05	F	(fa)	21.83	43.65	87.31	174.61	349.23	698.46	1396.91	2793.83	5587.65
06	F♯/G♭		23.12	46.25	92.50	185.00	369.99	739.99	1479.98	2959.96	5919.91
07	G	(sol)	24.50	49.00	98.00	196.00	392.00	783.99	1567.98	3135.96	6271.93
08	G♯/A♭		25.96	51.91	103.83	207.65	415.30	830.61	1661.22	3322.44	6644.88
09	A	(la)	27.50	55.00	110.00	220.00	440.00	880.00	1760.00	3520.00	7040.00
10	A♯/B♭		29.14	58.27	116.54	233.08	466.16	932.33	1864.66	3729.31	7458.62
11	B	(ti)	30.87	61.74	123.47	246.94	493.88	987.77	1975.53	3951.07	7902.13
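
Every entry in this table comes from one formula: count the semitones between the pitch and A₄, then multiply 440 Hz by 2 raised to that count over 12. A minimal Python sketch, assuming the pitch class numbers (0 for C through 11 for B) and octave numbers used in the table. The function name frequency and the list NAMES are mine.

```python
A4 = 440.0  # Hz, reference pitch (ISO 16:1975)

def frequency(pitch_class, octave):
    """Equal tempered frequency in Hz; pitch_class runs from 0 (C) to 11 (B)."""
    # A is pitch class 9, and the reference A lives in octave 4.
    semitones_from_a4 = 12 * (octave - 4) + (pitch_class - 9)
    return A4 * 2 ** (semitones_from_a4 / 12)

NAMES = ["C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
         "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B"]

for pc, name in enumerate(NAMES):
    row = "\t".join(f"{frequency(pc, octave):.2f}" for octave in range(9))
    print(f"{pc:02d}\t{name}\t{row}")
```

Changing the single constant A4 to 435.0 would regenerate the whole table for the older standard.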

pythagoras' circle of fifths

Pythagoras of Samos was the first to try to describe music with a mathematical system called the circle of fifths (or cycle of fifths). Start with the tonic. Multiply by a perfect fifth (³₂). Do it again. This puts you into the next octave. Bring it down an octave (multiply by ¹₂) so you can keep building your scale. Well…

³₂ × ¹₂ = ³₄

is the inverse of ⁴₃, an interval with a great deal of consonance. When you completely build the scale, the ratio ⁴₃ turns out to be the fourth interval in the series of eight that make up an octave. Thus the name fourth. The fifth and the fourth are inversions of one another in an octave. They are the only intervals that work out this way. That makes them special, in my mind, but the adjective that was ascribed to them was perfect. Thus the intervals ⁴₃ and ³₂ are called the perfect fourth and perfect fifth, respectively.

So here's the plan again: start with the tonic (¹₁), bring it up a perfect fifth (³₂) over and over again, take it down an octave (¹₂) whenever the product exceeds an octave (²₁), and repeat until the ratio equals an octave (²₁).

We'll start on C since that's the middle of the modern piano. Behold!

Pythagorean intervals in sequence
interval	fractional value				decimal value
F:C	(³₂)⁻¹	(¹₂)⁻¹	=	⁴₃	= 1.3333333333333…
C:C	(³₂)⁰	(¹₂)⁰	=	¹₁	= 1 ← start here
G:C	(³₂)¹	(¹₂)⁰	=	³₂	= 1.5
D:C	(³₂)²	(¹₂)¹	=	⁹₈	= 1.125
A:C	(³₂)³	(¹₂)¹	=	²⁷₁₆	= 1.6875
E:C	(³₂)⁴	(¹₂)²	=	⁸¹₆₄	= 1.265625
B:C	(³₂)⁵	(¹₂)²	=	²⁴³₁₂₈	= 1.8984375
F♯:C	(³₂)⁶	(¹₂)³	=	⁷²⁹₅₁₂	= 1.423828125
C♯:C	(³₂)⁷	(¹₂)⁴	=	²¹⁸⁷₂₀₄₈	= 1.06787109375
G♯:C	(³₂)⁸	(¹₂)⁴	=	⁶⁵⁶¹₄₀₉₆	= 1.601806640625
D♯:C	(³₂)⁹	(¹₂)⁵	=	¹⁹⁶⁸³₁₆₃₈₄	= 1.2013549804688…
A♯:C	(³₂)¹⁰	(¹₂)⁵	=	⁵⁹⁰⁴⁹₃₂₇₆₈	= 1.8020324707031…
E♯:C	(³₂)¹¹	(¹₂)⁶	=	¹⁷⁷¹⁴⁷₁₃₁₀₇₂	= 1.3515243530273…
B♯:C	(³₂)¹²	(¹₂)⁶	=	⁵³¹⁴⁴¹₂₆₂₁₄₄	= 2.0272865295410…
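
The whole table can be generated mechanically. This is a minimal Python sketch of the plan described above, assuming exact arithmetic with fractions. The helper reduce_to_octave is a name I made up; it moves any ratio into the range from 1 up to (but not including) 2 and reports how many octaves it had to drop.

```python
from fractions import Fraction as F

FIFTH, OCTAVE = F(3, 2), F(2)

def reduce_to_octave(ratio):
    """Shift a ratio by whole octaves into [1, 2); return (ratio, octaves_down)."""
    octaves_down = 0
    while ratio >= OCTAVE:
        ratio /= OCTAVE
        octaves_down += 1
    while ratio < 1:
        ratio *= OCTAVE
        octaves_down -= 1
    return ratio, octaves_down

# One fifth below the tonic (F) through eleven fifths above it (E♯).
for n in range(-1, 12):
    ratio, down = reduce_to_octave(FIFTH ** n)
    print(f"(3/2)^{n:<2} (1/2)^{down:<2} = {ratio} = {float(ratio):.13g}")

# The twelfth fifth, taken down only six octaves as in the last row of the table,
# overshoots the octave instead of landing on it.
twelfth = FIFTH ** 12 / OCTAVE ** 6
print(twelfth, float(twelfth), twelfth > OCTAVE)
```

Since a power of 3 can never equal a power of 2, no number of stacked fifths will ever land exactly on an octave.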

Uh oh. Those of you familiar with the piano will note that the errors occur at notes that do not exist on the keyboard. There is no such thing as a B♯ (C♭) or an E♯ (F♭).

Pythagorean intervals sorted and named
interval	decimal value	interval name
C:C	1	tonic
C♯:C	1.06787109375	minor second
D:C	1.125	major second
D♯:C	1.2013549804688…	minor third
E:C	1.265625	major third
F:C	1.3333333333333…	perfect fourth
E♯:C	1.3515243530273…	bigger than a perfect fourth
F♯:C	1.423828125	tritone*
G:C	1.5	perfect fifth
G♯:C	1.601806640625	minor sixth
A:C	1.6875	major sixth
A♯:C	1.8020324707031…	minor seventh
B:C	1.8984375	major seventh
B♯:C	2.0272865295410…	bigger than an octave

It should really be called the spiral of fifths since it never closes up. One complete lap around the circle should equal a whole number of octaves. That turns out to be 12 fifths and 7 octaves. But 12 fifths is a bit larger than seven octaves. This discrepancy is known as the pythagorean comma and is equal to…

B♯:C = (³₂)¹²(¹₂)⁷ = ⁵³¹⁴⁴¹₅₂₄₂₈₈ = 1.0136432647705…

Thus B♯ is a bit higher than C by about one-quarter of a semitone. This is a small difference that would be audible to trained ears were the two notes to be played in succession. Ordinary folks might not perceive the difference at all. Play them together as a part of a chord and your ears would not enjoy it.

f_beat = f_B♯ − f_C
f_beat = (256 Hz)(1.013643264771 − 1)
f_beat = 3.5 Hz

The dissonance would be audible as a 3.5 Hz beat for C = 256 Hz. No musician would ever want to be this far out of tune and no audience would want to listen to them.
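
The numbers in this section can be reproduced in a few lines of Python. The sketch assumes C = 256 Hz, as in the calculation above, and expresses the comma in cents (1200 per octave, so 100 per equal tempered semitone), which is my choice of unit for checking the "one-quarter of a semitone" claim. The variable names are illustrative.

```python
import math
from fractions import Fraction as F

# Twelve fifths up, seven octaves down.
comma = F(3, 2) ** 12 / F(2) ** 7
comma_cents = 1200 * math.log2(float(comma))

f_c = 256.0                        # Hz, the value of C used in the text
f_beat = f_c * (float(comma) - 1)  # beat frequency between B♯ and C

print(comma, f"{comma_cents:.2f} cents", f"{f_beat:.2f} Hz")
```

The comma comes out a little under a quarter of an equal tempered semitone, and the beat frequency rounds to the 3.5 Hz quoted above.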

pentatonic scales

Truth is ever to be found in simplicity, & not in yᵉ multiplicity & confusion of things.

Isaac Newton, ca. 1680

If the chromatic scale is all about multiplicity, the pentatonic scale is all about simplicity.

The semitone is one of the least consonant intervals. The chromatic scale is nothing but semitones it seems. The chromatic scale is the mother of all scales as a result. Semitones are the price you pay for that amount of versatility. Diatonic scales are considered simple, but they all have two semitones in them. That's two opportunities for a little bit of dissonance to sneak in. What does the risk-averse musician do? If you fear semitones, or just don't like the way they sound, you could eliminate them. Music composed without semitones would be always mellow and never harsh. Since one of the purposes of music seems to be to "soothe a savage breast", let's do it. Let's just get rid of all the semitones.

Return to the C major diatonic scale again. (My favorite starting point.) Take the diatonic scale and drop the fourth and seventh pitch classes, the ones adjacent to the semitone intervals. This gives you a scale with five pitches — a pentatonic scale.

A pentatonic scale is like a diatonic scale in that it is made up of large and small intervals. In the diatonic scale, the large intervals are whole tones and the small intervals are semitones. In the pentatonic scale, the large intervals are thirds (minor thirds for this example) and the small intervals are whole tones. The ratios of adjacent pitches on this pentatonic scale follow this order…
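
The ratios of adjacent pitches can be worked out directly from the just intonation C major scale. Here is a minimal Python sketch, assuming the major scale ratios given earlier on this page. The names MAJOR, pentatonic, and steps are mine, and C' is just my label for the C one octave up.

```python
from fractions import Fraction as F

# C major diatonic scale, intervals from the tonic (just intonation).
MAJOR = {"C": F(1), "D": F(9, 8), "E": F(5, 4), "F": F(4, 3),
         "G": F(3, 2), "A": F(5, 3), "B": F(15, 8), "C'": F(2)}

# Drop the fourth (F) and the seventh (B) to leave five pitch classes.
pentatonic = {k: v for k, v in MAJOR.items() if k not in ("F", "B")}

names = list(pentatonic)
ratios = list(pentatonic.values())
steps = [hi / lo for lo, hi in zip(ratios, ratios[1:])]

for lo, hi, step in zip(names, names[1:], steps):
    print(f"{lo} to {hi}: {step}")
```

The result is three whole tones (one major, two minor) and two minor thirds, with no semitone anywhere.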

This kind of pentatonic scale was built using the white keys of the piano. Starting only on the white keys, there are five allowed pentatonic modes. The modes are identified by the location of the missing intervals (the ones that were semitones in the diatonic scale).

Pentatonic modes on the white keys (just intonation)
	minor pentatonic	major pentatonic	egyptian	minor blues	major blues
pitch	A	C	D	E	G
tonic	¹₁	¹₁	¹₁	¹₁	¹₁
second	omit	⁹₈	¹⁰₉	omit	¹⁰₉
third	⁶₅	⁵₄	omit	⁶₅	omit
fourth	²⁷₂₀	omit	⁴₃	⁴₃	⁴₃
fifth	³₂	³₂	⁴⁰₂₇	omit	³₂
sixth	omit	⁵₃	omit	⁸₅	⁵₃
seventh	⁹₅	omit	¹⁶₉	⁹₅	omit
octave	²₁	²₁	²₁	²₁	²₁

The black keys can also be used to make a pentatonic scale.

This gives a slightly different ratio of adjacent pitches and one minor third that's a little bit flatter than normal (³²₂₇ = 1.185… is a little bit smaller than ⁶₅ = 1.2). It still has the same overall composition of three whole tones and two minor thirds.

Starting on the black keys, there are five pentatonic modes. Again, the modes are identified by the locations of the missing intervals.

Pentatonic modes on the black keys (just intonation)
	minor blues	major blues	minor pentatonic	major pentatonic	egyptian
pitch	A♯	C♯	D♯	F♯	G♯
tonic	¹₁	¹₁	¹₁	¹₁	¹₁
second	omit	⁹₈	omit	⁹₈	¹⁰₉
third	⁶₅	omit	³²₂₇	⁵₄	omit
fourth	²⁷₂₀	⁴₃	⁴₃	omit	⁴₃
fifth	omit	³₂	⁴⁰₂₇	³₂	³₂
sixth	⁸₅	⁵₃	omit	²⁷₁₆	omit
seventh	⁹₅	omit	¹⁶₉	omit	¹⁶₉
octave	²₁	²₁	²₁	²₁	²₁

On a piano: the black keys are pentatonic, the white keys are diatonic, and all the keys are chromatic.

scraps

major
minor
How about an octatonic, "string of pearls" scale?
Whole tone scale