Why is the sky blue?
A semi-detailed explanation
By Matt McIrvin
This is the original, linear version of my blue-sky explanation
(apart from cosmetic changes, and some recent small changes in
phrasing). The new version serves
up essentially the same content in smaller chunks linked to a
simpler overview, and may be more comprehensible to non-scientists;
choose according to taste. See "About
the blue-sky pages" for some history.
Contents
- The basic idea: dipole scattering
- Retarded potentials
- The potentials of an oscillating dipole
- Radiation fields
- The sky, the sunset, and a Martian postscript
No other question so strongly evokes images
of parents shrugging their shoulders in bewilderment when kids ask
it. (It isn't the champion in the blushing and stammering category,
but I believe it leads the pack in bewilderment.) Popular books on
science often simply explain that air molecules preferentially
scatter blue light from the sun, but stop there. I thought that it
might be interesting to provide a more detailed, but not
tremendously mathematical, explanation of why this is so.
1. The basic idea: dipole scattering
Light is an electromagnetic wave. If you stand in one spot as a
light wave passes by, there will be an oscillating electric field
and an oscillating magnetic field, which are perpendicular to each
other. If the light is in the range of frequencies that we can see,
then the frequency of the vibration affects the color of the light.
The color-vision receptors in our eyes, the cones, are of three
types: "blue" receptors that respond to light over a broad range of
high frequencies, "green" receptors that respond to medium
frequencies, and "red" receptors that respond to low frequencies.
The ranges of sensitivity of the receptors overlap considerably,
but they have their maximum sensitivities at different frequencies.
The perceived color depends (among other things) on the relative
strengths of the signals from these receptors.
Molecules are usually electrically neutral, but they are made of
charged objects: their atoms consist of negatively charged
electrons and positively charged nuclei. If there is an electric
field at the position of an atom, the nucleus will move a short
distance in the direction of the field and the electrons will move
the other way, and the atom will become a "dipole": the positive
and negative charge will be centered around different places. A
molecule made of such atoms will acquire its own electric
field, something like the magnetic field of a bar magnet.
A dipole's electric field falls off more rapidly with distance
than it would if the molecule had a net electric charge. This is
because at large distances, the fields from the positive and the
negative charge tend to cancel each other out, as the difference
between their average positions becomes less important.
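To put numbers on this cancellation, here is a minimal sketch that compares the potential of a single charge with that of a pair of opposite charges, measured along the line through the pair. The function names, the choice of units (Coulomb's constant set to 1), and the sample separation are illustrative assumptions, not anything taken from a real measurement.

```python
# Illustrative units: Coulomb's constant is set to 1.

def point_charge_potential(q, distance):
    """Static potential of a point charge q at the given distance."""
    return q / distance


def dipole_potential_on_axis(q, separation, distance):
    """Potential on the dipole axis from +q and -q, placed
    separation/2 on either side of the origin (+q nearer to us)."""
    near = point_charge_potential(+q, distance - separation / 2)
    far = point_charge_potential(-q, distance + separation / 2)
    return near + far


q, d = 1.0, 0.01  # assumed charge and separation, arbitrary units

for r in (1.0, 2.0, 4.0, 8.0):
    print(r, point_charge_potential(q, r), dipole_potential_on_axis(q, d, r))

# How much each potential shrinks when the distance doubles.
single_ratio = point_charge_potential(q, 1.0) / point_charge_potential(q, 2.0)
dipole_ratio = dipole_potential_on_axis(q, d, 1.0) / dipole_potential_on_axis(q, d, 2.0)
```

Doubling the distance halves the single-charge potential but cuts the dipole potential by very nearly a factor of four. The field is a spatial rate of change of the potential, so it loses one more power of distance in each case.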
However, if the dipole is made to oscillate (that is, if the
positive and negative charge wiggle back and forth, out of phase
with each other), then the molecule can produce electromagnetic
radiation of its own, for reasons I'll explain below. This is how
air molecules scatter light: the oscillating electric field of the
incoming wave makes the molecules develop oscillating dipoles,
which in turn give off radiation.
This scattered radiation destructively interferes with the incoming
wave in the forward direction. The original wave is lessened in intensity,
and new waves move out in all other directions, so that overall
energy is conserved (this requirement is sometimes called the
"optical theorem"). The net effect is that light energy that was
moving in a straight line from the sun ends up traveling in some
other direction.
Since sunlight appears white but the sky is a robin's-egg blue,
it must be that the scattered light excites our blue-sensing cones
more, and our red-sensing cones less, than the original sunlight.
The distribution of frequencies in the scattered light must be
biased toward high frequencies. Why is this?
2. Retarded potentials
Scalar and vector potentials
In the theory of electromagnetic radiation, it is not so
convenient to work with the electric and magnetic fields directly,
except for simple plane waves. It is more convenient to use the
"scalar potential" and "vector potential."
You are probably already familiar with the scalar potential: in
many situations, it is just the same thing as voltage. A
5-volt battery has a scalar potential difference of 5 volts between
its terminals. In static situations (with the usual potential
conventions of electrostatics), the electric field is just the
spatial rate of change of the scalar potential, and it
points "downhill" toward regions of lower electric potential.
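As a small illustration of the "downhill" rule, the sketch below estimates the static electric field along one axis as minus the slope of the potential, using a finite difference. The helper names, the step size, and the units (Coulomb's constant set to 1) are assumptions made for the example.

```python
def electric_field_from_potential(potential, x, step=1e-6):
    """Central-difference estimate of E = -dV/dx along one axis."""
    return -(potential(x + step) - potential(x - step)) / (2 * step)


def coulomb_potential(x, q=1.0):
    """Potential of a positive point charge at the origin (k = 1)."""
    return q / x


# The potential falls as we move away from the charge, so the field
# comes out positive: it points away from the charge, "downhill".
field_at_2 = electric_field_from_potential(coulomb_potential, 2.0)
print(field_at_2)
```

For a point charge the exact answer is q divided by the distance squared, so the estimate at a distance of 2 should land very close to 0.25.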
There is also a "vector potential" that has to do with
magnetism. This is a quantity with a magnitude and a direction: a
vector. In static situations, the magnetic field is related in a
somewhat complicated way to the rates of change of the vector
potential in various directions: essentially, it has to do with the
extent to which the vector potential swirls around a given
point.
If the potentials are changing with time, as in radiation, then
the relation between the potentials and the fields is more
complicated. But in either case, the magnitudes of the electric and
magnetic fields are proportional to the rates of change of the
potentials in space and time.
Potentials, charges, and currents
Now, if the potentials are defined in a certain way (what the
pros will recognize as a "covariant gauge," specifically the Lorenz
gauge), the potential due to a certain charge and current
distribution is related to the charges and currents in an
extremely simple way.
Suppose there is a point charge somewhere in space, which moves
around. Then the scalar potential at some other place is directly
proportional to the charge, and inversely proportional to
the distance to the charge.
But it is not the distance to the place where the charge is
now; it is the distance to the place where the charge
was, at such a time that a signal traveling at the speed
of light from the position of the charge is just now getting to the
place where we're calculating the potential. The news about where
the particle is travels at a finite speed, the speed of light. This
is called a "retarded potential," meaning "delayed," because it
responds to the charge's position with a speed-of-light delay.
If there is more than just a point charge, then the scalar
potential can be calculated by adding up the retarded potential of
each little bit of charge.
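The delay described above can be sketched in a few lines. The code finds the "retarded time" by repeatedly asking how long light would take to arrive from wherever the charge was, then evaluates charge over distance at that earlier position. Everything named here (`retarded_time`, `wiggling_charge`, the amplitude and frequency) is a hypothetical illustration. The sketch also assumes the charge moves much more slowly than light, so it leaves out the velocity-dependent correction that appears in the exact result for a fast point charge.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def retarded_time(t, observer, source_position, iterations=50):
    """Solve t_r = t - |observer - source_position(t_r)| / C
    by fixed-point iteration (converges for speeds below C)."""
    t_r = t
    for _ in range(iterations):
        delay = math.dist(observer, source_position(t_r)) / C
        t_r = t - delay
    return t_r


def retarded_scalar_potential(q, t, observer, source_position):
    """Charge over distance, using the distance to where the charge
    WAS at the retarded time (units with Coulomb's constant = 1)."""
    t_r = retarded_time(t, observer, source_position)
    return q / math.dist(observer, source_position(t_r))


def wiggling_charge(t, amplitude=1e-10, omega=3.0e15):
    """Assumed motion: a tiny oscillation along the z axis."""
    return (0.0, 0.0, amplitude * math.sin(omega * t))


observer = (1.0, 0.0, 0.0)
print(retarded_time(0.0, observer, wiggling_charge))
print(retarded_scalar_potential(1.0, 0.0, observer, wiggling_charge))
```

For a collection of charges, the same function could be called once per charge and the results added, which is the "adding up" step described above.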
The vector potential is related in exactly the same way to the
currents. Each little piece of current creates a retarded
vector potential that is proportional to current and inversely
proportional to distance, and the news about where the current is
travels at the speed of light.
3. The potentials of an oscillating dipole
Now consider a molecule that is a dipole. For simplicity, model
the molecular dipole as a pair of opposite point charges, separated
by a short distance. (Really, the positive charge consists of a
couple of nuclei and the negative charge is a spread-out cloud of
electrons, and the dipole comes from the separation between their
average positions; but idealizing the molecule as a pair
of point charges doesn't change the analysis in any substantive
way, as long as the molecule is small.)
If the dipole is not changing, then at large distances, the
scalar potential due to one end of the dipole and the scalar
potential due to the other end will tend to cancel each other out,
since the distance to the two charges is almost the same. So the
scalar potential will fall off faster with distance than it does
for a single charge.
But the news about the charge only travels at the speed of
light! If we are slightly closer to one end of the dipole than to
the other, then the potential here depends on the charge at the
near end of the dipole at some previous time, and the charge at the
far end of the dipole a short time before that. So if the
charges are moving back and forth at a high speed, the cancellation
between the ends of the dipole will be less complete. For instance,
the scalar potential here could depend on the charge at the near
end at a time when it was positive, but the charge at the
far end at a time when the negative charge had not yet
gotten all the way there.
If the dipole is much smaller than the wavelength of the light
(and air molecules are thousands of times smaller than the
wavelengths of visible light), the part of the potential that fails
to cancel grows in direct proportion to the frequency of the
oscillation. So at large distances, where the scalar potential of
the static dipole would be negligible, the scalar potential due to
an oscillating dipole goes up linearly with the
frequency.
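Here is a rough numerical check of that linear growth, under a simplifying assumption: the two ends of the dipole are treated as fixed points whose charges oscillate sinusoidally with opposite signs, rather than as charges physically sliding back and forth. The function name, the units (speed of light and Coulomb's constant both set to 1), and the default sizes are all chosen purely for illustration.

```python
import cmath


def dipole_potential_amplitude(omega, q=1.0, d=1e-3, r=1e4, c=1.0):
    """Amplitude of the retarded scalar potential on the dipole axis.

    Each end contributes charge/distance, delayed by its own light
    travel time; the delay appears as a phase factor exp(-i*k*distance).
    """
    k = omega / c
    near = q * cmath.exp(-1j * k * (r - d / 2)) / (r - d / 2)
    far = -q * cmath.exp(-1j * k * (r + d / 2)) / (r + d / 2)
    return abs(near + far)


static = dipole_potential_amplitude(0.0)  # no oscillation at all
slow = dipole_potential_amplitude(1.0)
fast = dipole_potential_amplitude(2.0)    # twice the frequency

print(static, slow, fast, fast / slow)
```

With these assumed sizes the dipole is a thousand times smaller than the shortest wavelength used. The oscillating cases come out far larger than the static one, and doubling the frequency very nearly doubles the amplitude.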
How about the vector potential? That's easier to figure out. It
also varies linearly with frequency, because it's proportional to
the current: for a wiggle of a given size, a higher frequency means
the charges move faster, and the faster the charges are moving, the
more current there is.
The potentials that are produced reverse direction as the dipole
reverses direction. If the dipole wiggles back and forth, then
oscillating waves of potentials move out from the dipole at the
speed of light, with a strength proportional to the frequency of
the wiggle. The higher the frequency, the shorter the waves,
because they have less time to get out of the way before the dipole
changes direction.
4. Radiation fields
Now, the electric and magnetic fields are proportional to
various rates of change of the potentials, in space and time. They
get a factor of frequency from the sizes of the potentials; but
they also get another factor of the frequency from the fact that
the shorter a wave is, the faster it varies in space; and the
higher its frequency is, the faster it varies in time. So the
fields are proportional to the square of the
frequency.
But we are not done yet! The important thing is how much
power is transmitted by the wave, and that is proportional
to the product of the electric field and the magnetic field. So the
power density in the wave goes up as the fourth power of
the frequency.
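The chain of factors above is short enough to spell out as arithmetic. This sketch simply multiplies them in the order given; the function name and the two sample wavelengths (450 nm standing in for "blue", 650 nm for "red") are assumptions picked for the example, and the dipole is taken to wiggle equally hard at both frequencies.

```python
def relative_radiated_power(frequency_ratio):
    """Radiated power at one frequency relative to another, following
    the steps in the text for a dipole of fixed wiggle size."""
    potential = frequency_ratio          # potentials grow linearly with frequency
    field = potential * frequency_ratio  # rates of change add one more factor
    return field * field                 # power goes as E times B


# Frequency is inversely proportional to wavelength, so the
# blue-to-red frequency ratio is 650/450.
blue_nm, red_nm = 450.0, 650.0
blue_vs_red = relative_radiated_power(red_nm / blue_nm)
print(blue_vs_red)
```

With these wavelengths the blue light is radiated a bit more than four times as strongly as the red, even though its frequency is less than one and a half times higher.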
Therefore, the spectrum of the light radiated by an oscillating
dipole, and hence of the light scattered by an induced dipole, will
be very strongly weighted toward high
frequencies, or short wavelengths. There are things I have
neglected here, such as the fact that sometimes, there are resonant
frequencies at which the charge oscillates particularly strongly
when driven by an oscillating field. These frequencies are
determined by the quantum mechanics of the molecule. However, for
the molecules in air, resonance is not a major contributor at
visible wavelengths.
A full analysis would also take into account the fact that the
electromagnetic field is quantized; the energy comes in photons.
But that turns out not to affect the fourth-power dependence of the
spectrum on frequency.
This sort of scattering is called Rayleigh scattering, after
Lord Rayleigh, who first worked it out for a very small classical
dipole.
5. The sky, the sunset, and a Martian postscript
If the dipole is a molecular dipole created by an
electromagnetic wave from the sun impinging on an air molecule,
then it is the higher frequencies that will be primarily scattered
in different directions, and removed from the incoming wave. Lower
frequencies will be scattered as well, but not as much; the
scattered power goes like the fourth power of the frequency.
The atmosphere does not absorb much light at visible
wavelengths; the dominant effect is dipole scattering. The
scattered light will be biased toward the high frequencies. If you
look at a part of the sky where the sun is not, then your eyes are
receiving the scattered light. That light excites the blue- and
green-sensing cones in your retinas much more than the red-sensing
cones (the largest amount of power is coming in at the frequencies
covered by the blue cones, but they are less sensitive than the
green ones). The result is the beautiful turquoise color of a clear
sky.
(A previous version of this page claimed that this scattering was
also responsible for the color of the Earth seen from space, but
Phil Plait argues that the dominant effect there is the
greater absorption of long wavelengths by the ocean.)
The scattering is also responsible for the color of the sun at
sunset. Since the emitted radiation removes energy from the
incoming waves by destructive interference (thereby conserving the
overall energy), the higher frequencies are preferentially depleted
from the unscattered beam. At sunset, the sun is shining at a
grazing angle through an unusually thick layer of air, so the
depletion is particularly pronounced, and the sun appears yellow,
orange or red rather than the usual blinding white.
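The reddening of the low sun can be sketched with a simple exponential-attenuation model, which is an addition of this example rather than something derived above: the fraction of the direct beam that survives is taken to fall off exponentially with the amount of air crossed, with a scattering strength that follows the fourth-power law. The reference optical depth (0.1 at 550 nm) and the air-mass values (1 for an overhead sun, 38 near the horizon) are assumed round numbers, and dust, haze, and absorption are ignored.

```python
import math

REFERENCE_WAVELENGTH_NM = 550.0
# Assumed round number for the scattering "thickness" of the whole
# atmosphere looking straight up, at the reference wavelength.
REFERENCE_OPTICAL_DEPTH = 0.1


def rayleigh_optical_depth(wavelength_nm):
    """Vertical optical depth, scaled by the fourth-power law."""
    ratio = REFERENCE_WAVELENGTH_NM / wavelength_nm
    return REFERENCE_OPTICAL_DEPTH * ratio ** 4


def direct_beam_transmission(wavelength_nm, air_mass):
    """Fraction of the unscattered beam left after crossing
    air_mass times the vertical thickness of the atmosphere."""
    return math.exp(-rayleigh_optical_depth(wavelength_nm) * air_mass)


for air_mass in (1.0, 10.0, 38.0):  # overhead, low sun, near horizon (assumed)
    blue = direct_beam_transmission(450.0, air_mass)
    red = direct_beam_transmission(650.0, air_mass)
    print(air_mass, blue, red)
```

Under these assumptions most of both colors survives when the sun is overhead, while near the horizon the blue part of the direct beam is almost entirely gone and a noticeable fraction of the red remains.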
Notice that this argument depends very little on the composition
of the atmosphere. Any clear atmosphere of more or less Earthlike
size and density, lit by a sun whose light appears more or less
white, would result in a blue sky.
The color pictures from Mars Pathfinder are a spectacular
reminder that the sky is not blue on Mars. Instead, it has
colors that have been described as everything from "orange-pink" to
"gray-tan", as was discovered in the 1970s by the Viking landers.
This is because the atmosphere of Mars is very thin and dusty, and
atmospheric light scattering is dominated not by the molecules of
gas (in the case of Mars, mostly carbon dioxide) but by suspended
dust particles. These are larger than the wavelengths of visible
light, and they are reddened by iron oxide, like Martian soil. The
scattering there is not just Rayleigh scattering, so the power
spectrum of the scattered light is different.