The distinction between music and noise is mathematical form. Music is ordered sound. Noise is disordered sound.
Music and noise are both mixtures of sound waves of different frequencies. The component frequencies of music are discrete (that is, separable) and rational (their ratios form simple fractions). Noise is a mixture of sounds with no discernible dominant frequency. The component frequencies of noise are continuous (every frequency is present over some range). The resulting superposition of waves is then effectively not periodic.
Sound is a longitudinal wave, which means the particles of the medium vibrate parallel to the direction of propagation of the wave. A sound wave coming out of a musical instrument, loudspeaker, or someone's mouth pushes the air forward and backward as the sound travels forward. This has the effect of squeezing and pulling on the air, changing its pressure very slightly. These pressure variations can be detected by the eardrum (a light, flexible membrane) in the middle ear, translated into neural impulses in the inner ear, and sent on to the brain for processing. They can also be detected by the diaphragm of a microphone (a light, flexible membrane), translated into an electrical signal by any one of several electromechanical means, and transmitted on to a computer for processing. The processing done in the brain is very sophisticated, but the processing done by a computer is relatively simple. The pressure variations of a sound wave are changed into voltage variations in the microphone, which are sampled periodically and very rapidly by a computer and saved as numbers.
A graph of microphone voltage vs. time is a convenient way to use a computer to see sound. Before the rise at the end of the Twentieth Century of ubiquitous digital computers, sound was often analyzed electronically using an oscilloscope — a cathode ray tube that traced out voltage as a function of time by swiping an electron beam across a fluorescent glass screen. Oscilloscopes are similar to old-fashioned televisions, but they only drew time series or parametric graphs and they only had one color — usually bright green. This function is easily mimicked by Twenty-first Century desktop, laptop, and tablet computers as well as smart phones. Oscilloscope applications often pay homage to their analog ancestors by using a green color scheme. A pure tone would look like a sine curve when graphed oscilloscope style.
[Graphs: 40 Hz pure tone, 100 Hz pure tone, 315 Hz pure tone]
Music in its simplest form would be monotonic; that is, a sequence of single frequencies played one at a time. Monotonic music is dull and lifeless like a 1990s ringtone (worse than that even); like a 1970s digital watch alarm (now we're talking); like an LC circuit built by a college student in an introductory analog circuit class (so primitive). Real music is polytonic — a mixture of different frequencies played together in a manner that sounds harmonious. A sound composed of multiple frequencies like that produced by a musical instrument or the human voice would still be periodic, but would be more complex than just a simple sine curve.
[Graphs: baritone voice counting, tenor voice singing, soprano voice singing, conga drum, high hat cymbal, woodblock]
The human voice and musical instruments produce sounds by vibration. What vibrates determines the type of instrument.
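|classification||what vibrates||examples|
|idiophone||the body of the instrument itself||bell, cymbal, musical saw, wood block, xylophone|
|membranophone||stretched membrane||drums, kazoo, human voice|
|chordophone||stretched string||strings (violin, guitar, harp), piano|
|aerophone||column of air||woodwinds (saxophone, flute), brass (trumpet, tuba), organ|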
Like many other mechanical systems, musical instruments vibrate naturally at several related frequencies called harmonics. (This is true even for purely electronic instruments like the theremin and the synthesizer, as their circuitry was designed intentionally to generate harmonics.) The lowest frequency of vibration, which is also usually the loudest, is called the fundamental. The higher frequency harmonics are called overtones. The human auditory system perceives the fundamental frequency of a musical note as the characteristic pitch of that note. The amplitudes of the overtones relative to the fundamental give the note its quality or timbre (pronounced tæmbər in English or even tɛ̃br with a quasi-French nasal e in the middle and a silent e at the end). Timbre is one of the features of sound that enables us to distinguish a flute from a violin and a tuba from a timpani.
Let's return to graphs of sound. It is possible to resolve a polytonic sound back into its component frequencies mathematically. The process is called spectral analysis or Fourier analysis after the French mathematician and physicist Joseph Fourier (1768–1830). Recall that when waves meet they don't collide like material objects, they pass through each other like spectres — they interfere. Interfering waves combine by the principle of linear superposition — basically, just add the values of one function to the values of another function everywhere in mathematical space. With the right combination of frequencies and amplitudes, you can make functions with all kinds of shapes (as long as they're functions) out of a harmonic series of sine and/or cosine functions. Several examples were illustrated in this book.
Sound waves can be combined this way as well. This process is called additive synthesis and it was the basis behind early analog synthesizers. Generate an electric signal of one frequency with one oscillator. Generate a multiple of the first frequency with a second oscillator. Add a third, fourth, fifth, sixth and seventh oscillator. Adjust the relative amplitude of each oscillator to produce the desired timbre. Synchronize them electronically so that when the pitch of the fundamental oscillator is changed, the frequencies of the overtone oscillators follow along. Attach the oscillators to a keyboard that controls the frequency of the fundamental oscillator and turns the oscillators on and off as needed. Make your keyboard look like a piano so musicians will use your synthesizer as a musical instrument. Let the Seventies begin.
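The oscillator recipe above can be sketched in a few lines of code. This is only an illustration of the idea, not a model of any real synthesizer: the function name `additive_synthesis`, the 8000 Hz sample rate, the 100 Hz fundamental, and the falling 1/n amplitudes are all assumptions picked for the example.

```python
import math

def additive_synthesis(fundamental_hz, amplitudes, sample_rate=8000, duration_s=0.02):
    """Sum sine "oscillators" at whole number multiples of fundamental_hz.

    amplitudes[k] is the relative amplitude of harmonic k + 1, so
    amplitudes[0] belongs to the fundamental itself.
    """
    n_samples = round(sample_rate * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate  # time of this sample in seconds
        value = sum(
            a * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
            for k, a in enumerate(amplitudes)
        )
        samples.append(value)
    return samples

# Seven oscillators: a fundamental plus six overtones (hypothetical amplitudes).
tone = additive_synthesis(100.0, [1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7])
```

Because every oscillator is a whole number multiple of the fundamental, the summed wave repeats once per period of the fundamental (every 80 samples in this example), no matter how the amplitudes are set. Changing the amplitudes changes the shape of the wave, which is to say the timbre, but not the pitch.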
Fourier analysis is the reverse process (since analysis is the reverse of synthesis). Start with an arbitrary function and assume that it was made from the linear superposition of an infinite number of sine and cosine functions in a harmonic series. What coefficients should each harmonic be multiplied by to give the desired function? The process for answering this question is called a Fourier transform and it is not an easy process to describe. Even the version used for sampled signals on a computer, computed efficiently with the fast Fourier transform (FFT), is hard to describe. Fortunately for this discussion, it is quite easy to get hold of code that can be dropped into all sorts of applications. Sound spectral analysis using the FFT has been cheap and easy to find online since the late 1990s.
The graph above shows the spectral analysis of a female voice. Sounds of all sorts of different frequencies are being produced — some more intensely than others. It appears that this particular voice was strongest at 270 Hz (C♯4 for those of you who know music) and multiples of 270 Hz (540 Hz, 810 Hz, 1080 Hz, 1350 Hz, 1620 Hz, and 1890 Hz). This graph has been normalized, which exaggerated the peaks at the higher frequencies so they could be seen. I like this graph because the spacing between peaks is just so perfect. This kind of structure is what makes a sound musical. Let's look at some more examples.
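To make the evenly spaced peaks concrete, here is a minimal sketch of spectral analysis run on a made-up signal. It uses the plain textbook discrete Fourier transform sum rather than the FFT (same answer, far slower), and the test signal is synthetic: a 270 Hz fundamental with two overtones, not a recording of a voice. The names `dft_magnitudes`, `sample_rate`, and `component_amplitudes` and all the numbers are assumptions chosen so that the frequency bins fall exactly 10 Hz apart.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the magnitude of each frequency bin from 0 up to half the sample rate."""
    n_total = len(samples)
    magnitudes = []
    for k in range(n_total // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * n / n_total)
            for n, x in enumerate(samples)
        )
        magnitudes.append(abs(acc) / n_total)
    return magnitudes

sample_rate = 2700   # samples per second (hypothetical)
n_samples = 270      # gives bins spaced sample_rate / n_samples = 10 Hz apart

# Synthetic "musical" sound: a fundamental and two overtones.
component_amplitudes = {270: 1.0, 540: 0.5, 810: 0.25}
signal = [
    sum(a * math.sin(2 * math.pi * f * n / sample_rate)
        for f, a in component_amplitudes.items())
    for n in range(n_samples)
]

mags = dft_magnitudes(signal)
bin_width = sample_rate / n_samples
peaks = [round(k * bin_width) for k, m in enumerate(mags) if m > 0.1]
```

The list `peaks` holds the frequencies whose bins stand clearly above the rest. For a sound built from a harmonic series, they land on whole number multiples of the fundamental.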
A flute is essentially a tube that is open at both ends. Air is blown across one end and sound comes out the other. A spectral analysis confirms this. The harmonics are all whole number multiples of the fundamental frequency (436 Hz, a slightly flat A4 — a bit lower in frequency than is normally acceptable). Note how the second harmonic is nearly as intense as the fundamental. This strong second harmonic is part of what makes a flute sound like a flute.
A recorder is also a tube with two open ends. It produces a sound similar to a flute, but not exactly the same. Again the harmonics are whole number multiples of the fundamental frequency (923 Hz, a very sharp A5, much higher in frequency than is normally acceptable), but for some reason the second harmonic is nearly nonexistent. This nearly missing second harmonic is part of what makes a recorder sound like a recorder and not sound like a flute.
A tuning fork is forked; that is to say, it splits from its handle into two branches called tines. Each tine is fixed to the handle at one end, but is free to vibrate at the other. As a result, one would expect to find only those harmonics that were odd multiples of the fundamental in the spectrum of a tuning fork. This is what the spectral analysis shows. The even harmonics are present, but they are extremely weak and are probably due to the sympathetic vibrations of something nearby. This spectrum was produced by striking a large, demonstration-size tuning fork (not the one pictured above) with an excessively heavy blow. Tuning forks should always be tapped lightly and on a resilient surface. Doing so reduces the intensity of the "ping" overtones, which is a desirable thing. An ideal tuning fork would vibrate at just one frequency. The tuning fork used in this experiment was rated at 256 Hz, but the spectral analysis software picked up 259 Hz for the fundamental. Is the tuning fork out of tune or is the software in error?
Music is sound with a discrete structure. Noise is sound with a continuous structure. Music is composed of sounds with fundamental frequencies and overtones. Noise is composed of sounds with frequencies that range continuously in value from as low as you can hear to as high as you can hear (not all with equal intensity however). Music is described mathematically by an infinite sum of sines and cosines multiplied by appropriately valued coefficients (infinite mathematically, but in practice only a handful of overtones really matter). Noise is described by a spectral power distribution, much like the statistical distributions of kinetic molecular theory. Music is ordered. Noise is random.
[Graphs: spectral analysis of white noise, pink noise, and concert applause. Frequency on the horizontal axis. Intensity on the vertical axis.]
Noise may be disorganized, but that does not mean it cannot be described. Frequency is to sound as color is to light. We hear different frequencies of sound as different pitches (A, B, C, D, E, F, G; for example). We see different frequencies of light as different colors (red, orange, yellow, green, blue, and violet; for example). The analogy is not perfect, however. The notes of the musical scale repeat themselves for every doubling of frequency (a topic I'll get to soon). The frequencies of visible light span so narrow a range, they never get a chance to double. The frequencies of the notes of a musical scale are related by simple numerical fractions. The frequency bands associated with specific color names show no mathematical relationships. This analogy isn't looking very promising.
In much the same way that most sounds we hear are polytonic (composed of many frequencies), most light we see is polychromatic (composed of many frequencies). Sources of light that are hot enough will emit a mixture of frequencies that span the entire visible spectrum. The color of this light is white: a mixture of red, orange, yellow, green, blue, violet, and everything in between. White noise is the audio equivalent of white light. It's a combination of all the frequencies that span the entire audible spectrum. White noise is a mixture of all the musical pitches with names and all the pitches in between those that aren't named. This is the informal definition of white noise.
Formally, white noise is a sound with a flat frequency spectrum. It transmits power equally at all frequencies. Something like this …
p(ƒ) = constant
So that …
|p(ƒ) =||the value of the power spectral density [W/Hz] or a quantity proportional to this, the relative power spectral density [V²/Hz]|
|ƒ =||any frequency of sound in the range of human hearing [Hz]|
|P =||the total power output of the source [W] or a quantity proportional to this, the relative total power output [unitless]|
Kinda tired. Finish this later.
The foundation of music is the musical note: a combination of pitch (the musical word for frequency) and duration (the amount of time).
Did I say music was based on notes? That's not true. Real music is based on intervals (the ratio of two notes) with high degrees of consonance (shared harmonics).
Helmholtz used the German word Oberton, which literally means "upper tone". Someone transliterated the word (instead of translating it) and Oberton became overtone in English by accident. If Helmholtz had meant overtone he would have said Überton.
Intervals are named by their size …
and character …
A musical scale is a set of notes arranged in increasing order of fundamental frequency. Since notes separated by an octave sound similar, a musical scale can be completely described by a set of intervals with ratios ranging from 1:1 to 2:1. If you need notes above the octave, just double the intervals. If you need notes higher than that, double the intervals again. If you need notes lower than the tonic, take all your intervals and halve them. If you need notes lower than that, halve all the intervals again. You get the idea. This is called octave duplication.
The intervals of adjacent notes in a scale often come in two sizes called whole tones and semitones (or half tones). The relationship between these should be obvious — two semitone intervals applied in succession equal a whole tone — but it turns out to be only approximately true in most cases. (An equal tempered scale is the only tuning system where two semitones are exactly equal to one whole tone.) Combinations of tones and semitones are also named by size …
Pythagoras of Samos, Greece (582–496 BCE) was the first to try to describe music with a mathematical system, now called the circle of fifths (or cycle of fifths). Start with the tonic. Multiply by a perfect fifth (3⁄2). Do it again. This puts you into the next octave. Bring it down an octave (multiply by 1⁄2) so you can keep building your scale. Well …
3⁄2 × 1⁄2 = 3⁄4
is the inverse of 4⁄3, an interval with a great deal of consonance. When you completely build the scale, the ratio 4⁄3 turns out to be the fourth interval in the series of eight that make up an octave. Thus the name fourth. The fifth and the fourth are inversions of one another in an octave. They are the only intervals that work out this way. That makes them special, in my mind, but the adjective that was ascribed to them was perfect. Thus the intervals 4⁄3 and 3⁄2 are called the perfect fourth and perfect fifth, respectively.
So here's the plan again: start with the tonic, bring it up a perfect fifth, take it down a perfect fourth, and repeat until the ratio equals an octave (2⁄1).
We'll start on C since that's the middle of the modern piano. Behold!
|F : C =||(3/2)⁻¹||(1/2)⁰||=||2/3||= 0.6666666666666…|
|C : C =||(3/2)⁰||(1/2)⁰||=||1/1||= 1 ← start here|
|G : C =||(3/2)¹||(1/2)⁰||=||3/2||= 1.5|
|D : C =||(3/2)²||(1/2)¹||=||9/8||= 1.125|
|A : C =||(3/2)³||(1/2)¹||=||27/16||= 1.6875|
|E : C =||(3/2)⁴||(1/2)²||=||81/64||= 1.265625|
|B : C =||(3/2)⁵||(1/2)²||=||243/128||= 1.8984375|
|F♯ : C =||(3/2)⁶||(1/2)³||=||729/512||= 1.423828125|
|C♯ : C =||(3/2)⁷||(1/2)⁴||=||2187/2048||= 1.06787109375|
|G♯ : C =||(3/2)⁸||(1/2)⁴||=||6561/4096||= 1.601806640625|
|D♯ : C =||(3/2)⁹||(1/2)⁵||=||19683/16384||= 1.2013549804688…|
|A♯ : C =||(3/2)¹⁰||(1/2)⁵||=||59049/32768||= 1.8020324707031…|
|E♯ : C =||(3/2)¹¹||(1/2)⁶||=||177147/131072||= 1.3515243530273…|
|B♯ : C =||(3/2)¹²||(1/2)⁶||=||531441/262144||= 2.0272865295410…|
Uh oh. For those of you familiar with the piano, you will note that the errors occur at notes that do not have keys of their own. There is no separate key for B♯ (it is played as C) or for E♯ (it is played as F).
|interval||pythagorean ratio||interval name|
|C♯||: C||1.06787109375||minor second|
|D||: C||1.125||major second|
|D♯||: C||1.2013549804688…||minor third|
|E||: C||1.265625||major third|
|F||: C||1.3333333333333…||perfect fourth|
|E♯||: C||1.3515243530273…||bigger than a perfect fourth|
|G||: C||1.5||perfect fifth|
|G♯||: C||1.601806640625||minor sixth|
|A||: C||1.6875||major sixth|
|A♯||: C||1.8020324707031…||minor seventh|
|B||: C||1.8984375||major seventh|
|B♯||: C||2.0272865295410…||bigger than an octave|
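The ratios in the tables above can be regenerated with exact fractions, which is a handy way to check the arithmetic. This is a sketch, not the author's method: the names `STEPS` and `pythagorean` are made up for the example, and the exponents are simply copied from the first table (so F sits below C and B♯ is left just above the octave, as it is there).

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)        # up a perfect fifth
OCTAVE_DOWN = Fraction(1, 2)  # down one octave

# (note name, fifths stacked, octaves dropped), copied from the table.
STEPS = [
    ("F", -1, 0), ("C", 0, 0), ("G", 1, 0), ("D", 2, 1), ("A", 3, 1),
    ("E", 4, 2), ("B", 5, 2), ("F♯", 6, 3), ("C♯", 7, 4), ("G♯", 8, 4),
    ("D♯", 9, 5), ("A♯", 10, 5), ("E♯", 11, 6), ("B♯", 12, 6),
]

pythagorean = {
    name: FIFTH ** fifths * OCTAVE_DOWN ** octaves
    for name, fifths, octaves in STEPS
}

for name, ratio in pythagorean.items():
    print(f"{name:>2} : C = {ratio} = {float(ratio):.13f}")
```

Working with `Fraction` instead of floating point keeps every ratio exact, so the numerators are always powers of 3 and the denominators powers of 2. No power of 3 equals a power of 2, so stacked fifths can never land exactly on an octave.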
It should really be called the spiral of fifths since it never closes up. One complete lap around the circle should equal a whole number of octaves. That turns out to be 12 fifths and 7 octaves. But 12 fifths is a bit larger than seven octaves. This discrepancy is known as the pythagorean comma and is equal to …
|B♯||: C =||(3/2)¹²||(1/2)⁷||=||531441/524288||= 1.013643264771…|
Thus B♯ is a bit higher than C by about one-quarter of a semitone. This is a small difference that would be audible to trained ears were the two notes to be played in succession. (Ordinary folks might not perceive the difference at all.) Play them together as a part of a chord and your ears would definitely not enjoy it.
|ƒ_B♯ − ƒ_C||= ƒ_beat|
|(256 Hz)(1.013643264771 − 1)||= 3.5 Hz|
The dissonance would be audible as a 3.5 Hz beat for C = 256 Hz. No musician would ever want to be this far out of tune and no audience would want to listen to them.
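The size of the pythagorean comma and the beat frequency quoted above can be checked with a short calculation. This is a sketch under the same assumption the text makes (C = 256 Hz). The conversion to cents (1200 per octave, 100 per equal tempered semitone) is an extra unit brought in here only to put a number on "about one-quarter of a semitone".

```python
from fractions import Fraction
import math

# Twelve perfect fifths up, seven octaves down.
comma = Fraction(3, 2) ** 12 / Fraction(2) ** 7

# Size of the comma in cents (100 cents = one equal tempered semitone).
comma_cents = 1200 * math.log2(float(comma))

# Beat frequency between C = 256 Hz and a B♯ one comma above it.
c_hz = 256.0
beat_hz = c_hz * (float(comma) - 1)

print(comma, float(comma), comma_cents, beat_hz)
```

The comma comes out as the exact fraction 531441/524288, a little under a quarter of a 100 cent semitone, and the beat frequency rounds to the 3.5 Hz given in the text.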
Just intonation is a simple way to build a musical scale. It is the way groups of people sing together when they are unaccompanied by musical instruments (a cappella). Scales with just intonation are built on a series of intervals with a high degree of consonance between tones. There are many types of scales with just intonation. I will discuss a few of the basic ones. I am showing them on a piano keyboard since it is a familiar instrument, but pianos are not normally tuned this way. (Pianos use equal temperament, which will be described next.)
The scale shown below is a type of just intonation scale starting on C. Since the third interval is major, the entire scale or key is said to be major. Music performed in a major key tends to sound bright, happy, or triumphant. This is true whether the scale was constructed using just intonation or some other scheme. This is probably a function of culture more than anything else. (Happiness isn't something physicists go around measuring — although we do discuss brightness from time to time and in a different context.)
This particular scale is heptatonic, which means it contains seven intervals: three major whole tones (9/8), two minor whole tones (10/9), and two semitones (16/15). The ratios of adjacent notes in a major scale follow this order: major, minor, semi, major, minor, major, semi. It is also said to be diatonic, from the Greek word διατονικη (diatonike), which means "across the tones". The implication is that all the tones you need are contained in this scale. That's not quite right, however. (More on this later.)
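One way to check the step pattern is to multiply the steps together and confirm they add up to exactly one octave. The sketch below does that with exact fractions, taking the just semitone to be 16/15 (the value that appears in the interval tables later on). The variable names are made up for the example.

```python
from fractions import Fraction

MAJOR_TONE = Fraction(9, 8)
MINOR_TONE = Fraction(10, 9)
SEMITONE = Fraction(16, 15)

# major, minor, semi, major, minor, major, semi
major_steps = [MAJOR_TONE, MINOR_TONE, SEMITONE,
               MAJOR_TONE, MINOR_TONE, MAJOR_TONE, SEMITONE]

# Build the scale by applying each step to the previous note.
major_scale = [Fraction(1)]
for step in major_steps:
    major_scale.append(major_scale[-1] * step)

for name, ratio in zip("CDEFGABC", major_scale):
    print(f"{name} : C = {ratio}")
```

The running products are 1, 9/8, 5/4, 4/3, 3/2, 5/3, 15/8, and 2, so the seven steps land exactly on the octave. With any other value for the semitone they would not.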
Musical intervals are cyclic over an octave, so the sequence repeats as you go up to the next octave or down to the lower octave. There's no reason I have to start playing a scale on C like the diagrams above show. Why not start playing on A as the diagrams below show? This gets you through an octave with intervals that are still fairly nice. Since the third interval of this scale is minor, the entire scale is said to be minor. Music performed in a minor key tends to sound dark, sad, moody, or introspective. Why this happens is outside the realm of physics.
This scale is also heptatonic and diatonic. The ratios of adjacent notes in a minor scale follow this order: major, semi, major, minor, semi, major, minor.
In general, a diatonic scale is any scale that can be played on the white keys of a piano. Since there are seven intervals in a diatonic scale, there are seven notes to start out on on a piano keyboard (A, B, C, D, E, F, and G). Starting on C gives you the major diatonic scale. Starting on A gives you a minor diatonic scale (also called the natural minor scale). The remaining five scales are used to lesser degrees in Western music past and present, medieval Christian chant being the example most frequently cited. All seven diatonic scales (or modes as they are called) have names corresponding to regions in ancient Greece. The assignment of Greek place name to diatonic mode is completely arbitrary, however. If there ever was any meaning to this naming convention, it's been lost to time.
Let's return to the C major scale. I find that one more interesting because of what happens when you invert the intervals. Instead of moving from the tonic to the note, let's move from the note to the octave.
To go from C to G on this scale is a jump up of a fifth (multiply by 3/2). From G to the next C is a jump of a fourth (multiply by 4/3). This means the fourth is the inversion of the fifth …
and the fifth is the inversion of the fourth (which sounds like a statement of the obvious) …
In general, any pair of intervals that together results in a jump of one octave is called an inversion.
That means a third and a sixth are an inversion …
and so are a sixth and a third (which should also be obvious) …
But wait a minute. What just happened? The first time I wrote the third, I wrote 5/4 = 1.25. The second time I wrote 6/5 = 1.2. Last time I checked, those were different numbers. Welcome to the world of just intonation, where intervals in one direction on a scale do not necessarily equal those in the reverse direction.
The tonic, fourth, fifth and octave are not affected by inversion, which is why they are called perfect intervals. The fourth and fifth are given this as a title, as in "a perfect fourth" or "a perfect fifth". For some reason no one says "a perfect tonic". Maybe because it sounds like the name of a cocktail. No one says "a perfect octave" either, but I don't have a joke for that one.
The second, third, sixth, and seventh are made smaller by inversion. An interval used to raise a note above the tonic is always larger than the same interval used to raise the new note to the octave. As is tradition, the larger intervals are said to be major and the smaller ones minor.
|tonic||1/1||= 1||octave||2/1||= 2|
|major second||9/8||= 1.125||minor seventh||16/9||= 1.777…|
|major third||5/4||= 1.25||minor sixth||8/5||= 1.6|
|perfect fourth||4/3||= 1.333…||perfect fifth||3/2||= 1.5|
|perfect fifth||3/2||= 1.5||perfect fourth||4/3||= 1.333…|
|major sixth||5/3||= 1.666…||minor third||6/5||= 1.2|
|major seventh||15/8||= 1.875||minor second||16/15||= 1.066…|
|octave||2/1||= 2||tonic||1/1||= 1|
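The right half of the table above follows from the left half by a single rule: an interval and its inversion multiply to one octave, so the inversion is 2 divided by the interval. A minimal sketch with exact fractions (the dictionary names are made up for the example):

```python
from fractions import Fraction

OCTAVE = Fraction(2)

# Left-hand column of the table: intervals measured up from the tonic.
intervals = {
    "major second": Fraction(9, 8),
    "major third": Fraction(5, 4),
    "perfect fourth": Fraction(4, 3),
    "perfect fifth": Fraction(3, 2),
    "major sixth": Fraction(5, 3),
    "major seventh": Fraction(15, 8),
}

# The inversion is whatever is left over to reach the octave.
inversions = {name: OCTAVE / ratio for name, ratio in intervals.items()}

for name, ratio in intervals.items():
    print(f"{name}: {ratio} inverts to {inversions[name]}")
```

The results match the right-hand column: 16/9, 8/5, 3/2, 4/3, 6/5, and 16/15.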
Something interesting happens when you merge the perfect, major, and minor intervals together into a scale. The minor intervals fill in the "spaces" between notes separated by a whole tone. (The exception to this is the gap between the perfect fourth and the perfect fifth.) The resulting scale has twelve intervals, each one a semitone away from its neighbor. Such a scale is said to be chromatic — from χρωματος (chromatos), the Greek word for color. Since color is to vision as musical note is to hearing, the metaphor is entirely appropriate. The chromatic scale is the most colorful. It's the one with the most intervals.
|name||relation to tonic||relation to predecessor|
|minor second||16/15||= 1.066…|
|major second||9/8||= 1.125|
|minor third||6/5||= 1.2|
|major third||5/4||= 1.25|
|perfect fourth||4/3||= 1.333…|
|perfect fifth||3/2||= 1.5|
|minor sixth||8/5||= 1.6|
|major sixth||5/3||= 1.666…|
|minor seventh||16/9||= 1.777…|
|major seventh||15/8||= 1.875|
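The "relation to predecessor" for each entry can be worked out by dividing each ratio by the one before it. The sketch below does this for the ratios listed above. The tonic (1/1) and octave (2/1) are added at the ends as an assumption so that every note has a neighbor, and the names `chromatic` and `steps` are made up for the example.

```python
from fractions import Fraction

# (name, relation to tonic); tonic and octave added to close the scale.
chromatic = [
    ("tonic", Fraction(1)),
    ("minor second", Fraction(16, 15)),
    ("major second", Fraction(9, 8)),
    ("minor third", Fraction(6, 5)),
    ("major third", Fraction(5, 4)),
    ("perfect fourth", Fraction(4, 3)),
    ("perfect fifth", Fraction(3, 2)),
    ("minor sixth", Fraction(8, 5)),
    ("major sixth", Fraction(5, 3)),
    ("minor seventh", Fraction(16, 9)),
    ("major seventh", Fraction(15, 8)),
    ("octave", Fraction(2)),
]

# Each step is this note's ratio divided by the previous note's ratio.
steps = [
    (name, ratio / chromatic[i][1])
    for i, (name, ratio) in enumerate(chromatic[1:])
]

for name, step in steps:
    print(f"{name}: {step} = {float(step):.5f}")
```

With these ratios the steps come in more than one size: 16/15, 135/128, and 25/24 for the semitones, plus the 9/8 whole tone left between the perfect fourth and the perfect fifth.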
The problem (if that's the right word) is that not all the intervals are the same size. There is a 9:8 major whole tone, a 10:9 minor whole tone, and a 16:15 semitone. The 16:15 diatonic or just semitone appears between the third and fourth and between the seventh and octave. It's called a semitone since it's approximately half the size of the whole tones. But two of these half intervals do not equal a whole. In fact, the pair is a little too big.
|two diatonic semitones||16/15||× 16/15||=||256/225||= 1.13778|
|one major whole tone||9/8||= 1.125|
|one minor whole tone||10/9||= 1.111…|
A compromise value of 25:24 called the classic or small semitone (a minor whole tone 10:9 divided by diatonic semitone 16:15) comes closer, but as the name implies, it's a bit small.
|two small semitones||25/24||× 25/24||=||625/576||= 1.08507|
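The comparisons above are easy to verify with exact fractions. A minimal sketch (the variable names are made up for the example):

```python
from fractions import Fraction

major_tone = Fraction(9, 8)
minor_tone = Fraction(10, 9)
diatonic_semitone = Fraction(16, 15)

# The small semitone is what is left of a minor whole tone
# after a diatonic semitone is taken out of it.
small_semitone = minor_tone / diatonic_semitone

two_diatonic = diatonic_semitone ** 2
two_small = small_semitone ** 2

print(small_semitone, two_diatonic, two_small)
print(two_diatonic > major_tone, two_small < minor_tone)
```

Two diatonic semitones (256/225) overshoot even the larger whole tone, while two small semitones (625/576) fall short of even the smaller one. Neither semitone is exactly half of either whole tone.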
Sharps (♯) and flats (♭) are different as a result of …
If you don't like semitones (and why don't you?) you could eliminate them. Take the diatonic scale and drop the third and seventh notes (the ones at the bottom of the semitone intervals). This gives you a scale with five notes — a pentatonic scale.
[DIAGRAM OF PENTATONIC SCALE]
Ratios in a Pentatonic Scale with Just Intonation (C = 256 Hz Scientific Scale)
Diatonic + Pentatonic = Chromatic
How about an octatonic, "string of pearls" scale?
What about microtonal scales? Nothing in this book.
Ratios of the Chromatic Intervals in a Scale with Equal Temperament
(A = 440 Hz American Standard Scale)
Less consonant (full of beats), but easier in terms of transposition.
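Until the table of ratios is filled in, here is a sketch of how equal tempered ratios are usually computed. It assumes the standard definition (twelve identical semitones per octave, each a factor of 2^(1/12)) and the A = 440 Hz reference named above. The function name `equal_tempered_frequency` is made up for the example.

```python
A4_HZ = 440.0                 # reference pitch
SEMITONE = 2 ** (1 / 12)      # one equal tempered semitone

def equal_tempered_frequency(semitones_from_a4):
    """Frequency of the note this many semitones above (or below) A = 440 Hz."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)

# Ratio of each chromatic interval to the tonic, 0 to 12 semitones.
ratios = [2 ** (n / 12) for n in range(13)]

whole_tone = SEMITONE ** 2    # two semitones are exactly one whole tone
tempered_fifth = ratios[7]    # compare with the just fifth, 3/2
middle_c = equal_tempered_frequency(-9)

print(whole_tone, tempered_fifth, middle_c)
```

Every step is the same size, so a melody can be moved to any starting note without changing its intervals. The price is that the intervals no longer match the simple fractions exactly. The tempered fifth, for example, is slightly smaller than 3/2, which is where the beats come from.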