Music has charms to sooth a savage breast, to soften rocks, or bend a knotted oak. I've read, that things inanimate have mov'd, and, as with living souls, have been inform'd, by magic numbers and persuasive sound.
William Congreve, 1697
synthesis and analysis
The distinction between music and noise is mathematical form. Music is ordered sound. Noise is disordered sound.
Music and noise are both mixtures of sound waves of different frequencies. The component frequencies of music are discrete (separable) and rational (their ratios form simple fractions) with a discernible dominant frequency. The component frequencies of noise are continuous (every frequency will be present over some range) and random (described by a probability distribution) with no discernible dominant frequency.
Sound is a longitudinal wave, which means the particles of the medium vibrate parallel to the direction of propagation of the wave. A sound wave coming out of a musical instrument, loudspeaker, or someone's mouth pushes the air forward and backward as the sound propagates outward. This has the effect of squeezing and pulling on the air, changing its pressure very slightly. These pressure variations can be detected by the ear drum (a light, flexible membrane) in the middle ear, translated into neural impulses in the inner ear, and sent on to the brain for processing. They can also be detected by the diaphragm of a microphone (a light, flexible membrane), translated into an electrical signal by any one of several electromechanical means, and sent on to a computer for processing. The processing done in the brain is very sophisticated, but the processing done by a computer is relatively simple. The pressure variations of a sound wave are changed into voltage variations in the microphone, which are sampled periodically and very rapidly by a computer and then saved as numbers.
A graph of microphone voltage vs. time (called a waveform) is a convenient way to use a computer to see sound. Before the rise of ubiquitous digital computers, waveforms were often analyzed electronically using an oscilloscope — a cathode ray tube with an electron beam that traced voltage as a function of time on a fluorescent glass screen. Oscilloscopes are basically simplified televisions with one purpose (to draw a time series or parametric graph) and one color (usually bright green). This task is easily mimicked by twenty-first century desktop, laptop, and tablet computers as well as smart phones. Oscilloscope applications on these devices often pay homage to their analog ancestors by using a green color scheme.
The simplest sound to analyze mathematically is the pure tone — one where the pressure variation is described by a single frequency. A pure tone would look like a sine curve when graphed oscilloscope style.
y = A sin 2πƒt
|y =||the instantaneous value of the microphone voltage (V), which is directly proportional to the variation in air pressure (∆P) due to sound waves impacting the microphone|
|A =||the amplitude, or maximum value of the waveform|
|2π =||a constant needed to get the units to work out|
|ƒ =||the frequency of the pure tone|
|t =||time, of course|
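As a rough sketch of what "sampled periodically and saved as numbers" means for a pure tone, the function below evaluates y = A sin 2πƒt at evenly spaced times. The function name `pure_tone` and the 44,100 samples-per-second rate are assumptions chosen for illustration, not something specified in the text.

```python
import math

def pure_tone(amplitude, frequency, duration, sample_rate=44100):
    """Sample y = A sin(2*pi*f*t) every 1/sample_rate seconds.

    sample_rate is an assumed value (a common audio rate), not from the text.
    """
    n_samples = int(round(duration * sample_rate))
    return [amplitude * math.sin(2 * math.pi * frequency * n / sample_rate)
            for n in range(n_samples)]

# One full cycle of a 100 Hz tone lasts 1/100 of a second.
samples = pure_tone(amplitude=1.0, frequency=100.0, duration=0.01)
```

Plotting `samples` against sample number would give one cycle of the sine curve an oscilloscope would trace.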
|Oscilloscope traces of pure tones|
|40 Hz pure tone||100 Hz pure tone||315 Hz pure tone|
Music in its simplest form is monotonic; that is, composed only of pure tones. Monotonic music is dull and lifeless like a 1990s ringtone (worse than that even); like a 1970s digital watch alarm (now we're talking); like an oscillating circuit attached to a speaker built by a college student in an introductory physics class (so primitive). Real music, however, is polytonic — a mixture of pure tones played together in a manner that sounds harmonious. A sound composed of multiple frequencies like that produced by a musical instrument or the human voice would still be periodic, but would be more complex than just a simple sine curve.
|Oscilloscope traces for various instruments|
|bass voice singing||tenor voice singing||soprano voice singing|
|conga drum||high hat cymbal||woodblock|
The human voice and musical instruments produce sounds by vibration. What vibrates determines the type of instrument.
|bell, cymbal, musical saw, wood block, xylophone|
|drums, kazoo, human voice|
|strings (violin, guitar, harp), piano|
|woodwinds (saxophone, flute), brass (trumpet, tuba), organ|
Like many other mechanical systems, musical instruments vibrate naturally at several related frequencies called harmonics. The lowest frequency of vibration, which is also usually the loudest, is called the fundamental. The higher frequency harmonics are called overtones. The human auditory system perceives the fundamental frequency of a musical note as the characteristic pitch of that note. The amplitudes of the overtones relative to the fundamental give the note its quality or timbre — pronounced in English as tæmbər or in quasi-French by English speakers as tɛ̃br with a nasal ɛ̃ for the medial e and silence for the final e. Timbre is one of the features of sound that enables us to distinguish a flute from a violin and a tuba from a timpani.
Recall that when waves meet they don't collide like material objects, they pass through each other like spectres — they interfere. Interfering waves combine by the principle of linear superposition — basically, just add the values of one function to the values of another function everywhere in mathematical space. With the right combination of sine and/or cosine functions, you can make functions with all kinds of shapes (as long as they're functions). Several examples were illustrated earlier in this book. Here they are again…
|square wave||sawtooth wave||triangle wave|
The act of combining pure tones together to produce a complex waveform is called additive synthesis. Non-electronic instruments do this naturally. Electronic instruments (if they want to be taken seriously) are designed with additive synthesis in mind. The analog synthesizer is a good example of this.
Generate a sinusoidal electric signal with an oscillator — a fairly simple electric circuit composed of a capacitor (C) and an inductor (L), also known as an LC circuit. Using a second oscillator, generate a second signal at a different frequency. The second harmonic would be a wise choice. Add a third, fourth, fifth, sixth, and seventh oscillator for the corresponding higher harmonics. Synchronize them electronically so that when the pitch of the fundamental oscillator changes, the frequencies of the overtone oscillators follow along. Adjust the relative amplitude of each oscillator to produce a sound with the desired timbre. Attach the circuit to an amplifier and a loudspeaker so it can be heard. Attach the fundamental oscillator to a control device that turns it on and off and changes its frequency as needed. Make your control device look like a piano keyboard so musicians will use your device as a musical instrument. Call it a synthesizer since it uses the principles of additive synthesis. Let the Seventies begin.
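The recipe above can be imitated in software. This is only a sketch under stated assumptions: the name `additive_synthesis`, the 200 Hz fundamental, the 8,000 samples-per-second rate, and the 1/n amplitude pattern are all made up for illustration. Each "oscillator" is a sine function locked to a whole-number multiple of the fundamental.

```python
import math

def additive_synthesis(fundamental, amplitudes, t):
    """Sum harmonics 1, 2, 3, ... of `fundamental` at time t (in seconds).

    amplitudes[0] scales the fundamental, amplitudes[1] the second
    harmonic, and so on.
    """
    return sum(a * math.sin(2 * math.pi * (n + 1) * fundamental * t)
               for n, a in enumerate(amplitudes))

# Hypothetical timbre: seven oscillators with amplitudes falling off as 1/n.
timbre = [1 / n for n in range(1, 8)]
sample_rate = 8000  # assumed samples per second
waveform = [additive_synthesis(200.0, timbre, k / sample_rate)
            for k in range(sample_rate // 10)]
```

Changing the numbers in `timbre` changes the shape of the waveform while the period (here 40 samples) stays fixed by the fundamental.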
It is mathematically possible to transform any waveform from a continuous sequence of values on a time-series graph into an infinite series of discrete sine and/or cosine functions. The process is called spectral analysis or Fourier analysis after its inventor, the French mathematician and physicist, Joseph Fourier (1768–1830). Taking a sound apart through spectral analysis is the inverse process of putting a sound together through additive synthesis.
Start with an arbitrary periodic function and assume that it was made from the linear superposition of an infinite number of sine and cosine functions in a harmonic series. What coefficients should each harmonic be multiplied by to give the desired function? The process for answering this question is called a Fourier transform and it is best left to a mathematician to explain.
Any periodic function ƒ(t) with a period of T can be written like this…
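$$f(t) = a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{2\pi n t}{T}\right)$$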
…and n = 1, 2, 3,… and that's all I want to say about that.
Fortunately for us, relatively simple algorithms for solving this problem were discovered soon after Fourier invented the technique, and computer scientists have been coding and sharing fast Fourier transforms since the 1960s. Audio applications with FFT are easily found online for many computing platforms — including smart phones, which means that sound spectral analysis is now literally within everyone's grasp.
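To make the idea concrete, here is a minimal sketch of spectral analysis on a sampled signal. It uses the slow, direct discrete Fourier transform rather than a true fast Fourier transform (an FFT gives the same numbers with far less arithmetic). The function name `dft_magnitudes`, the test signal, and the sample rate are assumptions for illustration.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Direct O(N^2) discrete Fourier transform.

    Returns the magnitude of each frequency bin from 0 up to N/2.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        magnitudes.append(abs(total))
    return magnitudes

sample_rate = 500  # assumed samples per second
n = 500            # one second of signal, so bin k corresponds to k Hz
signal = [math.sin(2 * math.pi * 50 * i / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 100 * i / sample_rate)
          for i in range(n)]

spectrum = dft_magnitudes(signal)
# The two strongest bins, loudest first.
peaks = sorted(range(len(spectrum)), key=lambda k: spectrum[k], reverse=True)[:2]
```

The analysis recovers the two ingredients that went into the signal: a 50 Hz fundamental and a second harmonic at 100 Hz with half the amplitude.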
The graph above shows the spectral analysis of a female voice. Sounds of all sorts of different frequencies are being produced — some more intensely than others. It appears that this particular voice was strongest at 270 Hz (C♯4 for those of you who know music) and multiples of 270 Hz (540 Hz, 810 Hz, 1080 Hz, 1350 Hz, 1620 Hz, and 1890 Hz). This graph has been normalized, which exaggerated the peaks at the higher frequencies so they could be seen. I like this graph because the spacing between peaks is just so perfect. This kind of structure is what makes a sound musical. Let's look at some more examples.
A flute is essentially a tube that is open at both ends. Air is blown across one end and sound comes out the other. A spectral analysis confirms this. The harmonics are all whole number multiples of the fundamental frequency (436 Hz, a slightly flat A4 — a bit lower in frequency than is normally acceptable). Note how the second harmonic is nearly as intense as the fundamental. This strong second harmonic is part of what makes a flute sound like a flute.
A recorder is also a tube with two open ends. It produces a sound similar to a flute, but not exactly the same. Again the harmonics are whole number multiples of the fundamental frequency (923 Hz, a very sharp A5 — much higher in frequency than is normally acceptable), but for some reason the second harmonic is nearly nonexistent. This nearly missing second harmonic is part of what makes a recorder sound like a recorder and not sound like a flute.
A tuning fork is forked; that is to say, it splits from its handle into two branches called tines. Each tine is fixed to the handle at one end, but is free to vibrate at the other. As a result, one would expect to find only those harmonics that were odd multiples of the fundamental in the spectrum of a tuning fork. This is what the spectral analysis shows. The even harmonics are present, but they are extremely weak and are probably due to the sympathetic vibrations of something nearby. This spectrum was produced by striking a large, demonstration-size tuning fork (not the one pictured above) with an excessively heavy blow. Tuning forks should always be tapped lightly and on a resilient surface. Doing so reduces the intensity of the "ping" overtones, which is a desirable thing. An ideal tuning fork would vibrate at just one frequency. The tuning fork used in this experiment was rated at 256 Hz, but the spectral analysis software picked up 259 Hz for the fundamental. Is the tuning fork out of tune or is the software in error?
Music is sound with a discrete structure. Noise is sound with a continuous structure. Music is composed of sounds with a fundamental frequency and overtones. Noise is composed of sounds with frequencies that range continuously in value from as low as you can hear to as high as you can hear (not necessarily at equal intensity however). Music is described mathematically by an infinite sum of sines and cosines multiplied by appropriately valued coefficients (infinite mathematically, but in practice only a handful of overtones really matter). Noise is described by a spectral power distribution, much like the statistical distributions of kinetic molecular theory. Music is ordered. Noise is random.
Noise is what you hear when you tune an analog radio or television to an empty frequency. It's the overall sound of rain falling on leaves, soda bubbling in a glass, air escaping from a tire, or a crowd applauding.
"Noisy" describes some voiceless consonants used in English better than "musical".
|/s/||[s]||sin||voiceless coronal sibilant|
|/sh/||[ʃ]||shin||voiceless palato-alveolar sibilant|
|/f/||[f]||fin||voiceless labiodental fricative|
|/th/||[θ]||thin||voiceless dental non-sibilant fricative|
|/h/||[h]||hello||voiceless glottal transition|
Noise might not be periodic, but it can still be analyzed with a fast Fourier transform — as long as we only examine a finite amount of it. Since infinite quantities don't exist in the real world, this isn't a problem. Mathematics allows for more possibilities than physicists will ever encounter.
|Spectral analysis of…|
|white noise||pink noise||concert applause|
|Frequency on the horizontal axis. Intensity on the vertical axis.|
Noise might not be ordered, but that doesn't mean it can't be described. Frequency is to sound as color is to light. We hear different frequencies of sound as different pitches (A, B, C, D, E, F, G; for example). We see different frequencies of light as different colors (red, orange, yellow, green, blue, and violet; for example). The analogy is not perfect. (What analogy is?) The notes of the musical scale repeat themselves for every doubling of frequency (a topic I'll come back to). The frequencies of visible light span so narrow a range, they never get a chance to double. The frequencies of the notes of a musical scale are related by simple numerical fractions. The frequency bands associated with specific color names show no mathematical relationships. This analogy isn't looking very promising.
In much the same way that most sounds we hear are polytonic (composed of many frequencies), most light we see is polychromatic (composed of many frequencies). Sources of light that are hot enough (like the sun) will emit a mixture of frequencies that span the entire visible spectrum. The color of this light is white — a mixture of red, orange, yellow, green, blue, violet, and everything in between. White noise is the audio equivalent of white light. It's a combination of all the frequencies that span the entire audible spectrum. White noise is a mixture of all the musical pitches with names and all the pitches in between that aren't named. This is the informal definition of white noise.
Formally, white noise is a sound with a flat frequency spectrum; that is, it transmits power equally at all frequencies. It is a mathematical ideal with a representation something like this…
p(ƒ) = constant
P = ∫ p(ƒ) dƒ
|p(ƒ) =||the value of the power spectral density measured in W/Hz (used for theoretical discussions) or the relative power spectral density in V²/Hz (what normally gets used in practice)|
|ƒ =||any frequency of sound in the range of human hearing|
|P =||the total power output of the source (theory) or some quantity proportional to that (practice)|
- Spectral analysis applications process the voltage of the signal coming from a microphone, not the power of the sound detected by the microphone. Electric power is proportional to the square of voltage at constant resistance, so V²/Hz are measured instead of W/Hz. Both quantities have the same mathematical structure, which is the important thing in audio analysis.
- Frequencies are limited to a definite range. This keeps the integral from running away to infinity. A rectangle of any height that is infinitely wide would contain infinite area. Since we can't let this happen, we have to assign limits to the spectrum. This is something that makes the white noise distribution unrealistic. Real statistical distributions tend to be curves that taper off nicely to zero. They don't start and end abruptly.
- White noise is a simple mathematical model that is so simple, it's unrealistic. It is a good way to introduce the topic of power spectral density, however. It is also a convenient reference sound for testing audio equipment.
The human auditory system is logarithmic when it comes to perceiving sounds. This is true when it comes to amplitude (which is related to intensity or loudness and was discussed earlier in this book) and frequency (which will be discussed now). Listen to a sequence of two pure tones with a difference of 100 Hz between them, say 100 Hz and 200 Hz. Then listen to another sequence with the same linear difference but starting on a much higher frequency, say 800 Hz and 900 Hz. The perceived increase in frequency of the second pair seems smaller than the first pair even though the absolute size of the increase is the same.
|100 Hz / 200 Hz||800 Hz / 900 Hz|
A doubling from 100 Hz to 200 Hz sounds similar to a doubling from 800 Hz to 1600 Hz.
|100 Hz / 200 Hz||800 Hz / 1600 Hz|
Likewise a one eighth increase from 100 Hz to 112.5 Hz sounds the same as a one eighth increase from 800 Hz to 900 Hz.
|100 Hz / 112.5 Hz||800 Hz / 900 Hz|
We tend to group sounds perceptually into bands that increase by successive powers of 2. A mathematician would call this a logarithmic increase. Musicians call the bands octaves. Audio technicians developed pink noise as a response to this reality.
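The listening comparisons above boil down to taking a logarithm of a frequency ratio. As a small sketch (the function name `octaves_between` is made up for illustration), the size of an interval in octaves is the base 2 logarithm of the ratio of the two frequencies.

```python
import math

def octaves_between(f_low, f_high):
    """Size of the interval f_high / f_low measured in octaves (powers of 2)."""
    return math.log2(f_high / f_low)

same_difference = (octaves_between(100, 200), octaves_between(800, 900))
same_doubling = (octaves_between(100, 200), octaves_between(800, 1600))
same_eighth = (octaves_between(100, 112.5), octaves_between(800, 900))
```

The pair with the same 100 Hz difference gives very different numbers (a full octave versus about a sixth of an octave), while the pairs with the same ratio give identical numbers, which agrees with what the ear reports.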
Formally, pink noise is a sound with a power spectral density proportional to the inverse of frequency, or 1/ƒ. Because of this it is also known as 1/ƒ noise (one-over-f noise).
Pink noise, like white noise, as a model must be limited to some range of frequencies. If not, our source of sound would be radiating infinite power.
$$P = \int_0^\infty \frac{k}{f}\,df = k\,\Big[\log f\Big]_0^\infty$$
$$P = k\,(\log \infty - \log 0)$$
$$P = k\,(\infty - (-\infty)) = \infty$$
A pink noise signal transmits energy equally over all octaves (logarithmic intervals). Let's test this essential characteristic. Compute the power radiated in a frequency band spanning the octave ƒ = (a, 2a)…
$$P = \int_a^{2a} \frac{k}{f}\,df = k\,\Big[\log f\Big]_a^{2a}$$
$$P = k\,(\log 2a - \log a)$$
$$P = k \log 2$$
Compare that to the power radiated in the next octave higher ƒ = (2a, 4a) and you will see it is the same…
$$P = \int_{2a}^{4a} \frac{k}{f}\,df = k\,\Big[\log f\Big]_{2a}^{4a}$$
$$P = k\,(\log 4a - \log 2a)$$
$$P = k \log 2$$
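The equal-power-per-octave result can also be checked numerically without doing any calculus. The sketch below adds up thin slices of a 1/ƒ spectrum with the midpoint rule. The function name `band_power`, the choice k = 1, and the 100 Hz starting frequency are assumptions for illustration.

```python
import math

def band_power(f_low, f_high, k=1.0, steps=100_000):
    """Midpoint-rule estimate of the integral of k/f from f_low to f_high."""
    width = (f_high - f_low) / steps
    return sum(k / (f_low + (i + 0.5) * width) for i in range(steps)) * width

octave_1 = band_power(100.0, 200.0)  # the octave (a, 2a) with a = 100 Hz
octave_2 = band_power(200.0, 400.0)  # the next octave up, (2a, 4a)
```

Both estimates come out very close to log 2 (about 0.693 when k = 1), even though the second band is twice as wide in hertz as the first.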
Following the analogy that sound is to frequency as light is to color, pink noise is pink because it is a little bit white, since all audible frequencies are present, and also a little bit red, since it is mostly composed of low frequencies. (Low frequency visible light is seen as red.) Pink noise sounds richer than white noise — like a waterfall or blowing wind. White noise sounds brighter, but also more harsh than pink noise — like steam escaping a radiator or, as a famous biologist once said, like "innumerable mice eating Rice Crispies". You might also like to listen to some applause for a non-mathematical, real world example of noise. You deserve it.
consonance & dissonance
Sounds that lend themselves to music are those that sound pleasant (or at least tolerable) when played in sequence (called melody) or in unison (called harmony).
high degrees of consonance (shared harmonics)
Consonances are sometimes described as being inherently more pleasant to the ear and dissonances as less pleasant.
Musical notes that are consonant share a large number of overtones. In the just intonation scale the octave is the most consonant interval followed by the perfect fifth, perfect fourth, major third and sixth, and major seventh. No other set of intervals shares such a high degree of consonance.
Notes separated by an octave sound similar — like two people with different voices trying to sing the same note.
overtones line up
Helmholtz used the German word oberton which literally means "upper tone". Someone transliterated the word (instead of translating it) and oberton became overtone in English by accident. If Helmholtz had meant overtone he would have said überton.
- pure tones
- complex tone
- evil 7th overtone
- who cares after the 8th overtone
This is where the physics ends and the music theory and abstract algebra (group theory and combinatorics) start to take over.
The foundation of music is the musical note: a combination of pitch (the musical word for frequency) and duration (the ordinary word for amount of time).
Did I say music was based on notes? That's not true. Real music is based on intervals (the ratio of any two pitches) with high degrees of consonance (shared harmonics).
A scale is a set of pitches (pitch classes, more precisely) arranged in order of increasing frequency from which notes are selected and arranged to create a musical composition. The exact pitches used in any scale are determined by the starting pitch (called the tonic), a set of rules for generating intervals (called a tuning system), and the pattern of intervals selected (sometimes called the mode).
Western music theorists identified eight basic intervals defined by their relative size…
- tonic or unison — an interval of one to one, perfect consonance
- fifth — the third most consonant interval
- octave — an interval of two to one, half of all overtones match
The intervals of adjacent pitches in a scale often come in two sizes called whole tones and semitones (or half tones). The relationship between these should be obvious — two semitone intervals applied in succession equal a whole tone — but it turns out to be only approximately true in most tuning systems. Combinations of tones and semitones may also be named by size…
- semitone — typically the smallest interval
- whole tone — two semitones
- ditone — two whole tones (a term that's rarely used)
- tritone — three whole tones, half an octave
- 6 whole tones or 12 semitones equal one octave.
Each interval also needs an adjective to describe its quality…
- perfect — an interval that is an inversion of another
- major — the larger of two nearly equal intervals
- minor — the smaller of two nearly equal intervals
- augmented — a semitone higher than a major or perfect interval
- diminished — a semitone lower than a minor or perfect interval
Many, many, many interval patterns are used in musical composition. (That's too many manys.) A convenient way to organize them is by the number of intervals they contain. Here's a list of the interval patterns that will be discussed in this book.
- whole tone
- string of pearls
Since notes separated by an octave sound similar, a musical scale can be completely described by a set of intervals with ratios ranging from 1:1 to 2:1. If you need notes above the octave, just double the intervals. If you need notes higher than that, double the intervals again. If you need notes lower than the tonic, take all your intervals and halve them. If you need notes lower than that, halve all the intervals again. You get the idea. This is called octave duplication or circularity of pitch.
Pitches separated by an octave form a pitch class. They are always named using an uppercase letter (A, B, C, D, E, F, G) and sometimes include a modifier symbol called an accidental (♯, ♭, ♮).
|♯||sharp||raises the pitch by one semitone|
|♭||flat||lowers the pitch by one semitone|
|♮||natural||cancels any previously applied accidentals|
Adding a sharp to a pitch is the same as adding a flat to the next higher pitch (usually). F♯ is the same as G♭, for example. This is known as enharmonic equivalence. Similarly, adding a double sharp or a double flat to a lettered pitch changes the pitch by one letter (usually). C♯♯ is the same as D, and D♭♭ is the same as C. I keep saying "usually" because there are exceptions around the notes B♯, C♭, E♯, F♭. A piano keyboard is a familiar way to display all the pitch classes.
Enharmonic equivalence is not a given. There are some tuning systems where the distinction between F♯ and G♭ is real. That's part of the reason we have two names for the same pitch today. Thankfully, most of us will never need to know what that distinction is. I have read explanations, but I have never understood them. I think most formally trained Western musicians today would probably say the same thing. Fans of Renaissance and Baroque music would be the exception.
Want to know whether you should say A♯ or B♭? Just look at the pitch classes of the scale used to create your musical composition. If you say any letter twice, you've probably done it wrong. For example…
F, G, A, B♭, C, D, E
is correct, but…
F, G, A, A♯, C, D, E
…is wrong because the letter A appeared twice. This is not a perfect rule, but it works for all the diatonic scales — the most used scales in Western music. (Non-Western music theory is beyond the scope of this book. The same goes for microtonal music — music that uses intervals smaller than a semitone.)
|A minor diatonic||A, B, C, D, E, F, G||heptatonic|
|B♭ major diatonic||B♭, C, D, E♭, F, G, A||heptatonic|
|C major diatonic||C, D, E, F, G, A, B||heptatonic|
|C♯ string of pearls||C♯, D♯, E, F♯, G, A, B♭, C||octatonic|
|D♭ whole tone||D♭, E♭, F, G, A, B||hexatonic|
|E minor blues||E, G, A, C, D||pentatonic|
|F chromatic||F, F♯, G, G♯, A, A♯, B, C, C♯, D, D♯, E||dodecatonic|
An interval is the ratio of any two pitches in a scale. Consonance occurs when the overtones of one pitch coincide with the overtones of another. The intervals with the highest degree of consonance are ratios of small whole numbers (like 2, 3, 5) or small powers of these (like 4, 8, 9, 16, 25, 27) or small products of these (like 6, 10, 12, 15, 18, 20). The more prime factors present in a ratio, the less consonant it sounds. A perfect fifth (3/2) is more consonant than a major seventh (15/8 = 3·5/2·2·2) which is much more consonant than a major chromatic semitone (135/128 = 3·3·3·5/2·2·2·2·2·2·2). A tuning system built on these principles is described as having just intonation. The word "just" here implies that the pitches generated are the "right" ones.
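The prime factor rule of thumb is easy to try out. This sketch counts the prime factors (with repetition) in the numerator and denominator of a ratio. The helper names `prime_factors` and `factor_count` are made up for illustration, and the count is only a crude stand-in for perceived consonance, not a measurement of it.

```python
from fractions import Fraction

def prime_factors(n):
    """Return the prime factors of a positive integer, with repetition."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def factor_count(ratio):
    """Total number of prime factors in the numerator and denominator."""
    r = Fraction(ratio)
    return len(prime_factors(r.numerator)) + len(prime_factors(r.denominator))

perfect_fifth = factor_count(Fraction(3, 2))               # 3 over 2
major_seventh = factor_count(Fraction(15, 8))              # 3·5 over 2·2·2
chromatic_semitone = factor_count(Fraction(135, 128))      # 3·3·3·5 over seven 2s
```

The counts rise from 2 to 5 to 11, in the same order as the three intervals were ranked from most to least consonant.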
Just intonation is a simple way to build a musical scale. It's the natural tuning system people use when they sing together unaccompanied by musical instruments (a cappella). There are many types of scales with just intonation. I will discuss a few of the basic ones. I am showing them on a piano keyboard since it is a familiar instrument, but pianos are not normally tuned this way. (Pianos use equal temperament, which will be described next.)
The scale shown below is a type of just intonation scale starting on C. Since the third interval (5/4 = 1.25) is "large", the scale is major. Music performed in a major key tends to sound bright, happy, or triumphant. This is true whether the scale was constructed using just intonation or some other scheme. This is probably a function of culture more than anything else. (Happiness isn't something physicists go around measuring — although we do discuss brightness from time to time and in a different context.)
Scales like this one are given the name diatonic — from the Greek phrase δια τονικη (dia tonike), which means "across the tones". The implication is that all the pitches you need are contained in this scale. That's not quite right, however. (More on this later.)
This particular scale is also heptatonic, which means it contains seven intervals: three major whole tones (9/8), two minor whole tones (10/9), and two diatonic semitones (16/15). The ratios of adjacent pitches in a major scale follow this order…
- major whole tone
- minor whole tone
- diatonic semitone
- major whole tone
- minor whole tone
- major whole tone
- diatonic semitone
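Here is a sketch of how the intervals from the tonic follow from the steps between adjacent pitches. It assumes the usual just major pattern (major tone, minor tone, semitone, major tone, minor tone, major tone, semitone) with a major whole tone of 9/8, a minor whole tone of 10/9, and a diatonic semitone of 16/15. The names `build_scale` and `major_steps` are made up for illustration.

```python
from fractions import Fraction

MAJOR_TONE = Fraction(9, 8)
MINOR_TONE = Fraction(10, 9)
SEMITONE = Fraction(16, 15)  # diatonic semitone

def build_scale(steps):
    """Multiply step ratios together to get each interval from the tonic."""
    intervals = [Fraction(1)]
    for step in steps:
        intervals.append(intervals[-1] * step)
    return intervals

major_steps = [MAJOR_TONE, MINOR_TONE, SEMITONE,
               MAJOR_TONE, MINOR_TONE, MAJOR_TONE, SEMITONE]
major_scale = build_scale(major_steps)
```

The result runs 1, 9/8, 5/4, 4/3, 3/2, 5/3, 15/8, 2, so seven steps land exactly on the octave.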
There's no reason I have to start playing a scale on C like the diagrams above show. Why not start playing on A as the diagrams below show? This gets you through an octave with intervals that are still fairly nice. Since the third interval (6/5 = 1.2) of this scale is "small", the scale is said to be minor. Music performed in a minor key tends to sound dark, sad, moody, or introspective. Why this happens is outside the realm of physics. It also contains a particularly harsh interval that is close to a fourth (27/20 ≈ 4/3). Since the fourth is a perfect interval, that must make this an imperfect fourth. It's an interesting example of how being close is not the same thing as being close enough. A perfect fourth sounds pleasant. An imperfect fourth sounds harsh.
This scale is also heptatonic and diatonic. The ratios of adjacent pitches in a minor scale follow this order…
- major whole tone
- diatonic semitone
- major whole tone
- minor whole tone
- diatonic semitone
- major whole tone
- minor whole tone
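Starting on A instead of C amounts to rotating the major pattern so that it begins on its sixth step. The sketch below assumes the just major pattern (9/8, 10/9, 16/15, 9/8, 10/9, 9/8, 16/15). The short variable names are made up for illustration.

```python
from fractions import Fraction
from itertools import accumulate
from operator import mul

M, m, s = Fraction(9, 8), Fraction(10, 9), Fraction(16, 15)

major_steps = [M, m, s, M, m, M, s]               # starting on C
minor_steps = major_steps[5:] + major_steps[:5]   # same pattern, starting on A

# Running product of the steps gives each interval from the new tonic.
minor_scale = list(accumulate(minor_steps, mul, initial=Fraction(1)))
```

The third entry is the small third (6/5) that makes the scale minor and the fourth entry is the imperfect fourth (27/20), which misses a perfect fourth (4/3) by a factor of 81/80.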
In general, a diatonic scale is any scale that can be played on the white keys of a piano. Since there are seven intervals in a diatonic scale, there are seven pitches to start out on a piano (A, B, C, D, E, F, and G). Starting on C gives you the major diatonic scale. Starting on A gives you a minor diatonic scale. The remaining five scales are used to lesser degrees in Western music past and present — medieval Christian chant being the example most frequently cited. All seven diatonic scales (or modes as they are called) have names corresponding to regions in ancient Greece. The assignment of Greek place name to diatonic mode is completely arbitrary, however. If there ever was any meaning to this naming convention, it's been lost to time.
|*||minor scale||27/20||imperfect fourth||40/27||imperfect fifth|
|†||major scale||45/32||augmented fourth||27/16||large major sixth|
|32/27||small minor third||64/45||diminished fifth||9/5||large minor seventh|
Let's return to the C major scale. I find that one more interesting because of what happens when you invert the intervals. Instead of moving from the tonic to the pitch, let's move from the pitch to the octave.
To go from C to G on this scale is a jump up of a fifth (multiply by 3/2). From G to the next C is a jump of a fourth (multiply by 4/3). This means the fourth is the inversion of the fifth…
3/2 × 4/3 = 2
…and the fifth is the inversion of the fourth, which sounds like a statement of the obvious.
4/3 × 3/2 = 2
In general, any pair of intervals that together results in a jump of one octave is called an inversion.
interval × inversion = octave
That means a third and a sixth are an inversion…
5/4 × 8/5 = 2
…and so are a sixth and a third, which should also be obvious except…
5/3 × 6/5 = 2
Wait a minute. What just happened? The first time I wrote the third, I wrote 5/4. The second time I wrote 6/5. Last time I checked, those were different numbers. I did something similar with the sixth (8/5 ≠ 5/3). Welcome to the world of just intonation, where intervals in one direction on a scale do not necessarily equal those in the inverse direction.
The tonic, fourth, fifth and octave are not affected by inversion, which is why they are called perfect intervals. The fourth and fifth are given this as a title — as in, "a perfect fourth" or "a perfect fifth". For some reason no one says "a perfect tonic". Maybe because it sounds like the name of a cocktail. No one says "a perfect octave" either, but I don't have a joke for that one.
The second, third, sixth, and seventh are made smaller by inversion. An interval used to raise a pitch above the tonic is always larger than the same interval used to take the new pitch to the octave. As is tradition, larger intervals are said to be major and smaller ones minor.
|tonic||1/1||= 1||octave||2/1||= 2|
|major second||9/8||= 1.125||minor seventh||16/9||= 1.777…|
|major third||5/4||= 1.25||minor sixth||8/5||= 1.6|
|perfect fourth||4/3||= 1.333…||perfect fifth||3/2||= 1.5|
|tritone*||?||= ?||tritone*||?||= ?|
|perfect fifth||3/2||= 1.5||perfect fourth||4/3||= 1.333…|
|major sixth||5/3||= 1.666…||minor third||6/5||= 1.2|
|major seventh||15/8||= 1.875||minor second†||16/15||= 1.066…|
|octave||2/1||= 2||tonic||1/1||= 1|
|*||augmented fourth or diminished fifth|
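Since interval × inversion = octave, every entry in the right half of the table is just 2 divided by the entry on the left. A short sketch (the function name `inversion` is made up for illustration):

```python
from fractions import Fraction

def inversion(interval):
    """The interval that completes the octave: interval * inversion == 2."""
    return Fraction(2) / Fraction(interval)

pairs = {name: (ratio, inversion(ratio)) for name, ratio in [
    ("major second", Fraction(9, 8)),
    ("major third", Fraction(5, 4)),
    ("perfect fourth", Fraction(4, 3)),
    ("major sixth", Fraction(5, 3)),
    ("major seventh", Fraction(15, 8)),
]}
```

Inverting the major intervals gives 16/9, 8/5, 6/5, and 16/15 (the minor seventh, sixth, third, and second), while the perfect fourth simply trades places with the perfect fifth.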
Something interesting happens when you merge the perfect, major, and minor intervals together into a scale. The minor intervals fill in the "spaces" between pitches separated by a whole tone. (The exception to this is the gap between the perfect fourth and the perfect fifth.) The resulting scale has twelve intervals, each one a semitone away from its neighbor. Such a scale is said to be chromatic — from χρωματος (chromatos), the Greek word for color. Since color is to vision as pitch is to hearing, the metaphor is entirely appropriate. The chromatic scale is more "colorful" than the diatonic scale.
The intervals of the chromatic scale are too numerous to fit nicely on my piano keyboard illustration. Instead, here's a table showing both the intervals with respect to the tonic and the intervals with respect to the preceding pitch in the just intonation chromatic scale.
|interval to||… tonic||… predecessor||semitone|
|minor second¹||16/15||= 1.066…||16/15||= 1.066…||diatonic|
|major second²||9/8||= 1.125||135/128||= 1.054…||major|
|minor third||6/5||= 1.2||16/15||= 1.066…||diatonic|
|major third||5/4||= 1.25||25/24||= 1.041…||minor|
|perfect fourth||4/3||= 1.333…||16/15||= 1.066…||diatonic|
|tritone³||64/45||= 1.422…||16/15||= 1.066…||diatonic|
|perfect fifth||3/2||= 1.5||135/128||= 1.054…||major|
|minor sixth||8/5||= 1.6||16/15||= 1.066…||diatonic|
|major sixth||5/3||= 1.666…||25/24||= 1.041…||minor|
|minor seventh||16/9||= 1.777…||16/15||= 1.066…||diatonic|
|major seventh||15/8||= 1.875||135/128||= 1.054…||major|
|octave||2/1||= 2||16/15||= 1.066…||diatonic|
|²||major whole tone|
|³||augmented fourth or diminished fifth|
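The "predecessor" column can be recomputed by dividing each ratio by the one before it. Here is a minimal sketch of that, assuming the ratios in the table (including 64/45 for the tritone); `CHROMATIC`, `SEMITONE_NAMES`, and `steps` are illustrative names of my own.

```python
from fractions import Fraction

# Just-intonation chromatic scale, ratios to the tonic (from the table above).
CHROMATIC = [
    ("tonic", Fraction(1)),
    ("minor second", Fraction(16, 15)),
    ("major second", Fraction(9, 8)),
    ("minor third", Fraction(6, 5)),
    ("major third", Fraction(5, 4)),
    ("perfect fourth", Fraction(4, 3)),
    ("tritone", Fraction(64, 45)),
    ("perfect fifth", Fraction(3, 2)),
    ("minor sixth", Fraction(8, 5)),
    ("major sixth", Fraction(5, 3)),
    ("minor seventh", Fraction(16, 9)),
    ("major seventh", Fraction(15, 8)),
    ("octave", Fraction(2)),
]

SEMITONE_NAMES = {
    Fraction(16, 15): "diatonic",
    Fraction(135, 128): "major",
    Fraction(25, 24): "minor",
}

def steps(scale):
    """Ratio of each pitch to its predecessor, plus the semitone type."""
    out = []
    for (_, low), (name, high) in zip(scale, scale[1:]):
        step = high / low
        out.append((name, step, SEMITONE_NAMES.get(step, "?")))
    return out

for name, step, kind in steps(CHROMATIC):
    print(f"{name:15} {step!s:>8}  {kind}")
```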
A word about semitones. There are several of them. An interval of 16/15 is called a diatonic semitone. It separates the major third from the perfect fourth and the major seventh from the octave in the major diatonic scale. When we merged the perfect and major intervals of the diatonic scale with their inversions (the minor intervals), the diatonic semitone popped up in four new locations: tonic to minor second, major second to minor third, perfect fifth to minor sixth, and major sixth to minor seventh. It's a popular interval. The chromatic scale has two other semitones: a larger or major chromatic semitone equal to 135/128 and a smaller or minor chromatic semitone equal to 25/24.
|25/24||= 1.041…||minor chromatic semitone|
|135/128||= 1.054…||major chromatic semitone|
|16/15||= 1.066…||diatonic semitone|
|10/9||= 1.111…||minor whole tone|
|9/8||= 1.125||major whole tone|
Combinations of chromatic and diatonic semitones make whole tones — majors make majors and minors make minors. In a sense, the chromatic semitones are inversions of the diatonic semitone over the span of a whole tone.
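The "majors make majors and minors make minors" rule is just multiplication of ratios. A quick check, with variable names of my own choosing:

```python
from fractions import Fraction

DIATONIC = Fraction(16, 15)          # diatonic semitone
MAJOR_CHROMATIC = Fraction(135, 128) # major chromatic semitone
MINOR_CHROMATIC = Fraction(25, 24)   # minor chromatic semitone

# Stacking two intervals means multiplying their frequency ratios.
major_whole_tone = DIATONIC * MAJOR_CHROMATIC
minor_whole_tone = DIATONIC * MINOR_CHROMATIC

print(major_whole_tone)  # 9/8
print(minor_whole_tone)  # 10/9
```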
Which brings us to the tritone — the bad boy of the chromatic scale, stuck between the beauty and perfection of the fourth and fifth. We know something belongs there (the size of the interval demands it) but what? All the other intervals arose naturally. That should really say "naturally" in scare quotes. How natural is this process? You shall be born by the forced union of the perfect fourth and a semitone… But which one?
Again I ask… Which one? The perfect fifth and the perfect fourth are inversions of one another. The tritone lies between these, which means it would have to be its own inversion. Do any of these satisfy that requirement? If so, that's a good sign this would be our tritone.
Well that's no good. Two of the tritones are inversions of each other, which doesn't do us any good, and the inversion of the third one looks like it just gave us another tritone. We're going about this the wrong way. Instead of testing fractions hoping one will work out, maybe we should use algebra and just go to the answer directly. What number, x, equals itself when divided into 2?
|2/x||= x||⇒||x = √2|
I didn't expect the square root of two.
Dun, dun, duuuuuunnnnn! Nobody expects the square root of two! Its chief weapon is surprise, fear and surprise; two chief weapons; fear, surprise, and irrationality! Uh, among its chief weapons are: fear, surprise, irrationality, and an almost fanatical inability to be written as a fraction! Uh, I'll come in again...
|25/18||= 1.388…||minor augmented fourth|
|45/32||= 1.406…||major augmented fourth|
|√2||= 1.414…||ideal tritone¹|
|64/45||= 1.422…||minor diminished fifth²|
|36/25||= 1.44||major diminished fifth|
After playing with these intervals for what seems like days, I think I'll just go with 64/45, one of the two values closest to the √2 ideal (its inversion, 45/32, is just as close in pitch). It seems to play nicely with the other intervals, or at least it doesn't torture them much.
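One way to compare the candidates is to measure them in cents, a logarithmic unit with 1200 cents to the octave, so that √2 sits at exactly 600. The unit and the helper `cents` are my additions for this sketch, not something the tables above use.

```python
import math
from fractions import Fraction

# The four rational tritone candidates from the table above.
CANDIDATES = [Fraction(25, 18), Fraction(45, 32), Fraction(64, 45), Fraction(36, 25)]

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))

for r in CANDIDATES:
    inversion = Fraction(2) / r
    offset = cents(r) - 600  # distance from the square root of two
    print(f"{r!s:>6}  inversion {inversion!s:>6}  "
          f"{cents(r):7.2f} cents  {offset:+6.2f} from sqrt(2)")
```

None of the fractions is its own inversion, and 45/32 and 64/45 land the same distance on either side of 600 cents, which is what you would expect from a pair whose product is 2.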
Truth is ever to be found in simplicity, & not in ye multiplicity & confusion of things.
Isaac Newton, ca. 1680
If the chromatic scale is all about multiplicity, the pentatonic scale is all about simplicity.
The semitone is one of the least consonant intervals. The chromatic scale, it seems, is nothing but semitones. As a result, it is the mother of all scales. Semitones are the price you pay for that amount of versatility. Diatonic scales are considered simple, but they all have two semitones in them. That's two opportunities for a little bit of dissonance to sneak in. What does the risk-averse musician do? If you fear semitones or just don't like them, you could eliminate them. Music composed without semitones would be always mellow and never harsh. Since one of the purposes of music seems to be to "sooth a savage breast", let's do it. Let's just get rid of all the semitones.
Return to the C major diatonic scale again. (My favorite starting point.) Take the diatonic scale and drop the fourth and seventh pitch classes, the ones adjacent to the semitone intervals. This gives you a scale with five pitches — a pentatonic scale.
A pentatonic scale is like a diatonic scale in that it is made up of large and small intervals. In the diatonic scale, the large intervals are whole tones and the small intervals are semitones. In the pentatonic scale, the large intervals are thirds (minor thirds for this example) and the small intervals are whole tones. The ratios of adjacent pitches on this pentatonic scale follow this order…
- major whole tone
- minor whole tone
- minor third
- minor whole tone
- minor third
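The list above can be reproduced by dropping F and B from the just-intonation C major scale and dividing neighbors. A sketch under those assumptions, with names (`DIATONIC`, `STEP_NAMES`, `adjacent`) that are mine:

```python
from fractions import Fraction

# C major diatonic scale in just intonation (ratios to the tonic C).
DIATONIC = {
    "C": Fraction(1), "D": Fraction(9, 8), "E": Fraction(5, 4),
    "F": Fraction(4, 3), "G": Fraction(3, 2), "A": Fraction(5, 3),
    "B": Fraction(15, 8),
}

STEP_NAMES = {
    Fraction(9, 8): "major whole tone",
    Fraction(10, 9): "minor whole tone",
    Fraction(6, 5): "minor third",
}

# Drop the fourth (F) and seventh (B), then close the scale at the octave.
pentatonic = [(n, r) for n, r in DIATONIC.items() if n not in ("F", "B")]
pentatonic.append(("C'", Fraction(2)))

# Ratio of each pitch to the one below it.
adjacent = [hi / lo for (_, lo), (_, hi) in zip(pentatonic, pentatonic[1:])]

for step in adjacent:
    print(f"{step!s:>5}  {STEP_NAMES[step]}")
```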
This kind of pentatonic scale was built using the white keys of the piano. Starting only on the white keys, there are five allowed pentatonic modes. The modes are identified by the location of the missing intervals (the ones that were semitones in the diatonic scale).
|27/20||imperfect fourth||40/27||imperfect fifth||9/5||large minor seventh|
The black keys can also be used to make a pentatonic scale.
This gives a slightly different ratio of adjacent pitches and one minor third that's a little bit flatter than normal (32/27 = 1.185… is a little bit smaller than 6/5 = 1.2). It still has the same overall composition of three whole tones and two minor thirds.
- major whole tone
- small minor third
- major whole tone
- minor whole tone
- minor third
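The same division works for the black keys. This sketch assumes the black keys take the minor second, minor third, tritone (64/45), minor sixth, and minor seventh from the chromatic table; the list name `BLACK_KEYS` is hypothetical.

```python
from fractions import Fraction

# Black-key pitches as just-intonation ratios to the tonic,
# closed one octave above the starting pitch.
BLACK_KEYS = [
    Fraction(16, 15),      # minor second
    Fraction(6, 5),        # minor third
    Fraction(64, 45),      # tritone
    Fraction(8, 5),        # minor sixth
    Fraction(16, 9),       # minor seventh
    2 * Fraction(16, 15),  # minor second, one octave up
]

# Ratio of each pitch to the one below it.
adjacent = [hi / lo for lo, hi in zip(BLACK_KEYS, BLACK_KEYS[1:])]
print([str(step) for step in adjacent])

# How much flatter is the small minor third than the regular one?
shortfall = Fraction(6, 5) / Fraction(32, 27)
print(shortfall)  # 81/80
```

The five steps still multiply out to exactly one octave, and the small minor third falls short of 6/5 by a factor of 81/80.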
Starting on the black keys, there are five pentatonic modes. Again, the modes are identified by the locations of the missing intervals.
|32/27||small minor third||27/20||imperfect fourth||40/27||imperfect fifth|
|27/16||large major sixth||9/5||large minor seventh|
On a piano: the black keys are pentatonic, the white keys are diatonic, and all the keys are chromatic.
- Less consonant (full of beats), but easier in terms of transposition.
- roughening of chords played, increases musical versatility
The chromatic circle is a geometric way to display all the pitch classes. This way is sometimes better for showing the enharmonic relationships. The space between adjacent pitch classes is a semitone.
- How about an octatonic, "string of pearls" scale?
- Whole tone scale
pythagoras' circle of fifths
Pythagoras of Samos, Greece (582–496 BCE) was the first to try to describe music with a mathematical system. His method is called the circle of fifths (or cycle of fifths). Start with the tonic. Multiply by a perfect fifth (3⁄2). Do it again. This puts you into the next octave. Bring it down an octave (multiply by 1⁄2) so you can keep building your scale. Well…
3⁄2 × 1⁄2 = 3⁄4
is the inverse of 4⁄3, an interval with a great deal of consonance. When you completely build the scale, the ratio 4⁄3 turns out to be the fourth interval in the series of eight that make up an octave. Thus the name fourth. The fifth and the fourth are inversions of one another in an octave. They are the only intervals that work out this way. That makes them special, in my mind, but the adjective that was ascribed to them was perfect. Thus the intervals 4⁄3 and 3⁄2 are called the perfect fourth and perfect fifth, respectively.
So here's the plan again: start with the tonic, bring it up a perfect fifth, take it down a perfect fourth, and repeat until the ratio equals an octave (2⁄1).
We'll start on C since that's the middle of the modern piano. Behold!
|F: C =||(3/2)⁻¹||(1/2)⁰||=||2/3||= 0.6666666666666…|
|C: C =||(3/2)⁰||(1/2)⁰||=||1/1||= 1 ← start here|
|G: C =||(3/2)¹||(1/2)⁰||=||3/2||= 1.5|
|D: C =||(3/2)²||(1/2)¹||=||9/8||= 1.125|
|A: C =||(3/2)³||(1/2)¹||=||27/16||= 1.6875|
|E: C =||(3/2)⁴||(1/2)²||=||81/64||= 1.265625|
|B: C =||(3/2)⁵||(1/2)²||=||243/128||= 1.8984375|
|F♯: C =||(3/2)⁶||(1/2)³||=||729/512||= 1.423828125|
|C♯: C =||(3/2)⁷||(1/2)⁴||=||2187/2048||= 1.06787109375|
|G♯: C =||(3/2)⁸||(1/2)⁴||=||6561/4096||= 1.601806640625|
|D♯: C =||(3/2)⁹||(1/2)⁵||=||19683/16384||= 1.2013549804688…|
|A♯: C =||(3/2)¹⁰||(1/2)⁵||=||59049/32768||= 1.8020324707031…|
|E♯: C =||(3/2)¹¹||(1/2)⁶||=||177147/131072||= 1.3515243530273…|
|B♯: C =||(3/2)¹²||(1/2)⁶||=||531441/262144||= 2.0272865295410…|
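The procedure is mechanical enough to hand to a computer: multiply by 3/2 and halve whenever the result leaves the octave. The sketch below is my own rendering of it (`fold_into_octave` and `circle_of_fifths` are hypothetical names). Unlike the table, it also folds the final B♯ back down, so it reports B♯ one octave lower than the 2.027… shown above.

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)

# Note names in the order the fifths produce them, starting on C.
NAMES = ["C", "G", "D", "A", "E", "B", "F♯", "C♯", "G♯", "D♯", "A♯", "E♯", "B♯"]

def fold_into_octave(ratio):
    """Halve or double a ratio until it lies in the range [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def circle_of_fifths(steps=12):
    """Stack `steps` perfect fifths on the tonic, folding each into one octave."""
    ratio = Fraction(1)
    ratios = [ratio]
    for _ in range(steps):
        ratio = fold_into_octave(ratio * FIFTH)
        ratios.append(ratio)
    return ratios

for name, ratio in zip(NAMES, circle_of_fifths()):
    print(f"{name:3} {ratio!s:>15} = {float(ratio):.10f}")
```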
Oh oh. For those of you familiar with the piano, you will note that the errors occur at notes that do not exist on the keyboard. There is no such thing as a B♯ (a.k.a. C♭) or an E♯ (a.k.a. F♭).
|interval||pythagorean ratio||interval name|
|C♯||: C||1.06787109375||minor second|
|D||: C||1.125||major second|
|D♯||: C||1.2013549804688…||minor third|
|E||: C||1.265625||major third|
|F||: C||1.3333333333333…||perfect fourth|
|E♯||: C||1.3515243530273…||bigger than a perfect fourth|
|G||: C||1.5||perfect fifth|
|G♯||: C||1.601806640625||minor sixth|
|A||: C||1.6875||major sixth|
|A♯||: C||1.8020324707031…||minor seventh|
|B||: C||1.8984375||major seventh|
|B♯||: C||2.0272865295410…||bigger than an octave|
It should really be called the spiral of fifths since it never closes up. One complete lap around the circle should equal a whole number of octaves. That turns out to be 12 fifths and 7 octaves. But 12 fifths is a bit larger than seven octaves. This discrepancy is known as the pythagorean comma and is equal to…
|B♯: C =||(3/2)¹²||(1/2)⁷||=||531441/524288||= 1.0136432647705…|
Thus B♯ is a bit higher than C by about one-quarter of a semitone. This is a small difference that would be audible to trained ears were the two notes to be played in succession. Ordinary folks might not perceive the difference at all. Play them together as a part of a chord and your ears would definitely not enjoy it.
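To put a number on "about one-quarter of a semitone", the comma can be converted to cents (1200 cents per octave). The cent is a unit I'm bringing in for this sketch, and I'm comparing against the 16/15 diatonic semitone as one reasonable yardstick.

```python
import math
from fractions import Fraction

# Twelve perfect fifths up, seven octaves down.
comma = Fraction(3, 2) ** 12 / Fraction(2) ** 7

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))

comma_cents = cents(comma)
semitone_cents = cents(Fraction(16, 15))  # diatonic semitone

print(comma)                                     # 531441/524288
print(f"{comma_cents:.2f} cents")
print(f"{comma_cents / semitone_cents:.2f} of a diatonic semitone")
```

Against an equal-tempered semitone of exactly 100 cents the fraction comes out slightly larger, so "about a quarter" is a fair summary either way.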
|ƒ(B♯) − ƒ(C)||= ƒ(beat)|
|(256 Hz)(1.013643264771 − 1)||= 3.5 Hz|
The dissonance would be audible as a 3.5 Hz beat for C = 256 Hz. No musician would ever want to be this far out of tune and no audience would want to listen to them.
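The beat calculation is a one-liner once the comma is known. A minimal sketch, assuming the beat frequency is simply the difference between the two frequencies; `beat_frequency` is my own helper name.

```python
from fractions import Fraction

PYTHAGOREAN_COMMA = Fraction(531441, 524288)

def beat_frequency(f1, f2):
    """Beat frequency heard when two nearby tones sound together."""
    return abs(f1 - f2)

c = 256                            # Hz, the C used in the text
b_sharp = c * PYTHAGOREAN_COMMA    # Hz, B♯ reached by twelve fifths
beat = beat_frequency(b_sharp, c)

print(f"{float(beat):.2f} Hz")
```

Because the beat scales with the base frequency, the same mistuning an octave higher (C = 512 Hz) would beat twice as fast.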