\documentclass[12pt]{report}

\usepackage{amssymb}
\usepackage{epsf}

\newcommand{\ii}{\mathord{\mbox{\boldmath $i$}}}

\newcommand{\si}{\mathord{\mbox{\boldmath\scriptsize $i$}}}

\newcommand{\conj}[1]{\overline{#1}}

\newcommand{\abs}[1]{\mathopen{|}#1\mathclose{|}}

\newcommand{\cconst}[1]{\mathord{\mbox{\boldmath $#1$}}}

\newcommand{\scconst}[1]{\mathord{\mbox{\boldmath\scriptsize $#1$}}}

\newcommand{\setsep}{:\;}

\newcommand{\discr}{\mathord{{\cal D}}}

\newcommand{\transf}[1]{\mathord{\cal #1}}

\newcommand{\filter}[1]{\mathord{\cal #1}}

\newcommand{\helix}{\xi}

\newcommand{\click}{\kappa}

\newcommand{\shah}{\mathord{\mbox{III}}}

\newcommand{\constfun}[1]{\mathord{\mbox{\boldmath #1}}}

\newcommand{\sinc}{\mathord{\mbox{sinc}}}

\newcommand{\fromto}[2]{#1\rightarrowtail#2}

\newcommand{\ccivl}[1]{[#1]}

\newcommand{\coivl}[1]{[#1)}

\newcommand{\ocivl}[1]{(#1]}

\newcommand{\ooivl}[1]{(#1)}

\newcommand{\compose}{\circ}

\begin{document}

\title{Digital Sound Modelling\\
  lecture notes for Com Sci 295}
\author{Michael J.\ O'Donnell}
\date{10 March 1994---revised 24 March 1996}

\maketitle

\pagenumbering{roman}

\tableofcontents

\listoffigures

For supplementary material, particularly better versions of graphs,
see the associated Maple files.

\chapter{Physical and Mathematical Foundations of Sound Modelling}
\label{cha:phys-math-foundations}

\pagenumbering{arabic}

In order to have useful discussions about sound, we need a very
simplistic, but practical, understanding of the physics and
mathematics associated with sound.

\section{Physics: What Is Sound?}

For our purposes, sound is any kind of vibration that is detectible by
the ear or devices analogous to the ear. Treatments of sound in
physics books tend to focus attention on the transmission of sound
vibrations through the air. We will focus instead on the vibrating
systems that produce and detect sounds, and just assume that the air
is capable of transmitting vibrations from sound producers to the
detectors in the ear.

\subsection{Vibrating Springs}
\label{sec:springs}

The simplest sort of vibration to understand is that of a spring. To
really simplify things, imagine an environment with no gravity, and
with a mass (a solid chunk of something) moving along a frictionless
track that is fixed so the track cannot move. The track constrains
motion of the mass to a straight line, so we do not need to consider
the three dimensions of space. Finally, imagine a spring attached at
one end to the mass, and at the other end to some fixed point on the
track. Be a bit liberal-minded, and imagine that the spring has length
$0$ when it is not stretched, and that the mass can move freely past
the point where the spring is attached. A picture of our imaginary
system is given in Figure~\ref{fig:ideal-spring-resonator}.
\begin{figure}
\centerline{\hbox{\epsfbox{spring-reson.eps}}}
\caption{\label{fig:ideal-spring-resonator}Ideal spring resonator}
\end{figure}

At any moment in time, the state of the spring system can be described
by two real numbers: the {\em displacement\/} $y$ of the mass to the
right of the point on the track to which the spring is fixed, and the
{\em velocity\/} $v$ of the mass to the right. Displacement to the
left is represented by negative values of $y$, and motion to the left
is represented by negative values of $v$. Now, imagine that we
displace the mass to the right and hold it in a fixed position,
stretching the spring. That is, we establish an initial condition
where $y>0$ and $v=0$. When we release the mass, the spring pulls it
to the left, causing a state where $y>0$ and $v<0$. Eventually the
mass reaches the center of the track at $y=0$, but at this moment
$v<0$ and inertia carries the mass beyond the center, to the left
where $y<0$. Now, the spring pulls the mass to the right, cancelling
out the motion $v<0$ to the left. Eventually the mass stops with
$v=0$, but at this moment the spring is stretched to the left with
$y<0$, so the pull to the right continues and causes the mass to move
right with $v>0$. This motion to the right eventually moves the mass
past the center, so $y>0$. The leftward pull of the spring opposes the
motion until $v=0$. So we return to a condition that is similar to the
initial one: $y>0$ and $v=0$, and the cycle repeats.
Figure~\ref{fig:qualitative-spring-states} shows a schematic
qualitative view of the vibration of the spring.
\begin{figure}
\centerline{\hbox{\epsfbox{qual-spring-states.eps}}}
\caption{\label{fig:qualitative-spring-states}Qualitative states of spring}
\end{figure}

To complete the simplistic physics of a vibrating spring, we need to
convert the qualitative observations above into quantitative
information that we can use in a mathematical analysis. For this
purpose, let $t$ be a real number representing the {\em time\/} that
has passed since some arbitrary starting moment when $t=0$. For any
quantity $q$ that depends on time, $dq/dt$ means the instantaneous
rate of change of $q$ with respect to $t$---when the independent
variable $t$ is understood from context, $dq/dt$ is often abbreviated
$q'$. When no outside force acts on the spring and mass, its behavior
is described by the following two equations:
\begin{eqnarray}
\label{equ:dy-v} y' & = & Av \\
\label{equ:dv-y} v' & = & -By
\end{eqnarray}
$A$ and $B$ are positive real number constants (independent of
time)---their actual values do not matter to
us. Equation~\ref{equ:dy-v} holds because velocity is defined to be
the change of location over time, and displacement is just a measure
of location from a particular origin---the constant $A$ takes care of
any conversion of units between $y'$ and $v$ (normally the units are
the same and $A=1$). Equation~\ref{equ:dv-y} represents the fact that
the force exerted by a spring increases in magnitude proportionally to
the distance that the spring is stretched, and the force acts to pull
the ends of the spring together. The value of $B$ is determined by the
stiffness of the spring. Equation~\ref{equ:dv-y} is an approximation,
because no real spring exerts a force precisely proportional to the
stretching distance---in particular when a spring is stretched too far
it changes radically, becoming stiffer, or becoming softer, or
breaking, depending on its construction. The right practical approach
to understanding vibration is to do as much analysis as possible based
on the simple approximate equations above, and then do the potentially
complicated corrections only when greater accuracy is required.

Vibrating objects that produce sound, and others (such as the hairs in
the cochlea of the ear) that detect sound, can be modelled fairly well
by systems of vibrating springs connected together in various ways.
Other vibrating systems have other physical parameters that measure
the vibrating behavior, but in most cases there are two real
numbers---for example {\em pressure\/} and {\em flow\/} of vibrating
air, {\em potential\/} and {\em current\/} of vibrating electrical
charge---that behave analogously to displacement and velocity in a
vibrating spring.

\section{Mathematics: How Do We Model Sound?}

The key to understanding the mathematical analysis of sound is to {\em
visualize\/} the mathematics using graphs and geometric diagrams. The
right way to visualize the mathematics does {\em not\/} look like the
physical system of vibrating springs or other objects that it is
describing. The value of the mathematics is to give us a {\em
different\/} way of visualizing sound, that is much more convenient
for analytic reasoning than the actual physical configuration of
vibrating objects. Mathematically, the important properties of a
vibrating spring are just Equations \ref{equ:dy-v} and \ref{equ:dv-y}.
We can forget that they arose from the physical properties of a
spring, and just consider the numerical behavior of two real numbers
$x$ and $y$ as functions of $t$, when they satisfy the equations.
\begin{eqnarray}
\label{equ:dy-x} y' & = & Ax \\
\label{equ:dx-y} x' & = & -By
\end{eqnarray}
From now on, lower case Roman variables, such as $x$ and $y$, stand for
real numbers that are functions of a time parameter $t$. Upper case
Roman variables, such as $A$ and $B$, stand for real number constants,
which are the same as unvarying functions of time (but widely used
notations, such as $e=2.71828\ldots$, are left alone). Occasionally, we
will use the form $x(t)$ to denote the value of a function $x$ at a
particular time $t$, but usually we will refer to entire functions
rather than individual values. When an expression $\alpha(t)$ containing
an independent variable, such as $t$, should refer to an entire
function, rather than a single value of the function, we write
$[t]\alpha(t)$.

\subsection{Vibration as the Circular Movement of a Rotor}

To visualize all the possible states of a vibrating system, consider a
plane in which the horizontal axis gives the value of $x$ and the
vertical axis gives the value of $y$---in this way each possible state
of the system is a point in the plane.

First, consider the simple case where $B=A$, so Equations
\ref{equ:dy-x} and \ref{equ:dx-y} specialize to
\begin{eqnarray}
\label{equ:dy-xA} y' & = & Ax \\
\label{equ:dx-yA} x' & = & -Ay
\end{eqnarray}
Figure~\ref{fig:rotor-1}(a) shows an example point $\langle x,y\rangle$
and the corresponding point $\langle x',y'\rangle=\langle
-Ay,Ax\rangle$ as vectors in the plane, when $A=3/8$. Notice that the
angle between these two vectors is always a right angle.
\begin{figure}
\centerline{\hbox{\epsfbox{rotor-1.eps}}}
\caption{\label{fig:rotor-1}State and change vectors for vibrating spring}
\end{figure}
Since $\langle x',y'\rangle$ represents a {\em change\/} in $\langle
x,y\rangle$ it is useful to displace the origin of the vector
representing $\langle x',y'\rangle$ to the end of the vector
representing $\langle x,y\rangle$, as shown in
Figure~\ref{fig:rotor-1}(b). Now, it is easy to see that the state of
the system must trace out a circle in the plane centered about the
origin, because the direction of change is always at right angles to
the state vector. The size of the circle can be any nonnegative real
number---setting the size corresponds to providing an initial
displacement to the mass on the spring. Furthermore, since the
magnitude of the state vector always stays the same, the magnitude of
the change vector is also always the same ($A$ times the magnitude of
the state vector), so the state moves around the circle at a constant
speed. I call such a system with a point moving around a circle at a
constant speed a {\em rotor.} The time required for one full rotation
is the {\em period\/} $P$ of the rotor. The number of full rotations
in a unit of time is the {\em frequency\/} of the rotor: its value is
$1/P$. The magnitude of the state vector is the {\em amplitude\/} of
the rotor. The angle of the state vector with respect to the $x$ axis
($\langle 1,0\rangle$) at time $t=0$ is the {\em phase\/} of the
rotor.

Take 5 minutes to visualize the relationship between the rotor and the
vibrating spring system, as suggested in
Figure~\ref{fig:rotor-spring}. Notice that we have no interest in
actual physical devices that look like rotors---the rotor is purely a
mathematical concept that allows us to analyze the behavior of a
vibrating system. Now forget about springs, and always visualize
vibration in terms of rotors and similar mathematical systems that we
investigate later.
\begin{figure}
\centerline{\hbox{\epsfbox{rotor-spring.eps}}}
\caption{\label{fig:rotor-spring}Spring system vs.\ rotor}
\end{figure}

While the speed of a rotor state around its circular path is constant,
the $x$ and $y$ components of the rotor state oscillate sinusoidally.
Consider a rotor with amplitude $R$ (that is, the circular path has
radius $R$) and frequency $F$ ($F$ full rotations per unit time),
starting at time $t=0$ in state $\langle x,y\rangle=\langle
R,0\rangle$. The values of $x$ and $y$ at any time are given by the
trignometric $\cos$ and $\sin$ functions.
\begin{eqnarray}
\label{equ:x-cos}x & = & R\cos(2\pi Ft) \\
\label{equ:y-sin}y & = & R\sin(2\pi Ft)
\end{eqnarray}
The multiplication by $2\pi$ is required because we measure angles in
{\em radians,} and one full rotation is $2\pi$ radians. Notice that
the maximum (minimum) values for $x$ and $y$ are both $R$ ($-R$), and
each reaches its maximum and minimum when the other is $0$.
Figure~\ref{fig:cos-sin} shows $x$ (solid line) and $y$ (dashed line)
as functions of time $t$ for a rotor with a frequency of $1/4$
rotation per unit time.
\begin{figure}
\centerline{\hbox{\epsfbox{cos-sin.eps}}}
$$x=\cos(2\pi t/4)\mbox{ solid line}$$
$$y=\sin(2\pi t/4)\mbox{ dashed line}$$
\caption{\label{fig:cos-sin}
Rotor state parameters $x$ and $y$ as functions of time $t$}
\end{figure}
Figure~\ref{fig:helix} shows a three-dimensional plot
of $x$, $y$, and $t$. The path of the state is a helix, circling about
the $t$ axis. Think of the helix as the trace of a point running
around the circle from Figure~\ref{fig:rotor-1}.
\begin{figure}
\centerline{\hbox{\epsfbox{helix.eps}}}
$$\begin{array}{cl}
t & \mbox{breadth axis} \\
x=\cos(2\pi t/4) & \mbox{depth axis} \\
y=\sin(2\pi t/4) & \mbox{vertical axis}
\end{array}$$
\caption{\label{fig:helix}
Rotor parameters $x$ and $y$ vs.\ $t$ as a helix in three dimensions}
\end{figure}

When $A\neq B$ in Equations~\ref{equ:dy-x} and~\ref{equ:dx-y}, the
state vector traces out an ellipse, whose aspect ratio is
$\sqrt{A/B}$. The speed of the state vector around the ellipse is not
constant (but the period and frequency are still well
defined). Instead of figuring out a detailed description of an
elliptical rotor, notice that we can always normalize a rotor to have
circular motion, by changing the units in which $x$ and $y$ are
measured. In an elliptical rotor with frequency $F$, starting at time
$t=0$ in state $\langle x,y\rangle=\langle R_x,0\rangle$ and crossing
the $y$ axis in state $\langle x,y\rangle=\langle 0,R_y\rangle$, the
values of $x$ and $y$ at any time are still given by the $\cos$ and
$\sin$ functions, but with different scaling factors for each.
\begin{eqnarray}
x & = & R_x\cos(2\pi Ft) \\
y & = & R_y\sin(2\pi Ft)
\end{eqnarray}
In this case, the maximum (minimum) value for $x$ is $R_x$ ($-R_x$),
and for $y$ it is $R_y$ ($-R_y$). As before, each parameter reaches
its maximum and minimum when the other is $0$.

\subsection{Rotor State as a Complex Number}

It is mathematically convenient to think of the two-dimensional rotor
state vector $\langle x,y\rangle$ as a single complex number $x+\ii
y$, where $\ii$ is the ``imaginary'' number defined to be the
principal square root of $-1$ (if you read engineering books and
articles, you may see this number written as $\mbox{\boldmath $j$}$
instead of $\ii$). Do {\em not\/} look for deep significance in the
names ``real number,'' ``imaginary number,'' ``complex number.'' These
names are just tags made up by mathematicians---``real'' numbers are
no more {\em real\/} than other numbers, ``imaginary'' numbers are no
more {\em imaginary,} and ``complex'' numbers are used to {\em
simplify\/} a lot of the analysis that we need to do. For our
purposes, the complex number $x+\ii y$ is just a particular notation
for the vector $\langle x,y\rangle$, which is particularly convenient
because the familiar operations of addition, multiplication, and
exponentiation on the real numbers extend very naturally to operations
on complex numbers that are just right for analyzing vibration.

\subsubsection{Review of Complex Arithmetic}
\label{sec:complex-arith}

From now on, we use Greek letters $\alpha$, $\beta$, $\gamma$, etc.\
as variables ranging over complex number functions depending on the
time variable $t$. Complex number constants independent of time are
denoted by bold face Greek letters $\cconst{\alpha}$,
$\cconst{\beta}$, $\cconst{\gamma}$, etc. (but widely used notations,
such as $\pi=3.14159\ldots$ are left alone). It is important to be
fluent in the following facts about complex numbers, and to be able to
do complex arithmetic and algebra just as easily as you learned to do
real arithmetic and algebra in calculus class. Make sure that you {\em
visualize\/} each of the facts below in terms of vectors in the plane.

\paragraph{Cartesian form of complex numbers}
\begin{equation}
x_1+\ii y_1=x_2+\ii y_2\mbox{ if and only if }x_1=x_2\mbox{ and }y_1=y_2
\end{equation}

Addition and multiplication extend to complex numbers by using the
commutative, associative, and distributive laws, and the fact that
$\ii\ii=\ii^2=-1$. Addition of complex numbers may be visualized in
terms of the vectors represented by the two numbers: shift the origin
of one vector to the head of the other vector as shown in
Figure~\ref{fig:complex-addition}.
\begin{figure}
\centerline{\hbox{\epsfbox{complex-addition.eps}}}
\caption{\label{fig:complex-addition}Adding two complex numbers}
\end{figure}
The conjugate of a complex number, written $\conj{\alpha}$, is the
reflection of $\alpha$ through the real axis, as shown in
Figure~\ref{fig:complex-conjugate}.
\begin{figure}
\centerline{\hbox{\epsfbox{complex-conjugate.eps}}}
\caption{\label{fig:complex-conjugate}The conjugate of a complex number}
\end{figure}

\begin{eqnarray}
(x_1+\ii y_1) + (x_2+\ii y_2) & = & (x_1+x_2) +\ii(y_1+y_2) \\
(x_1+\ii y_1)(x_2+\ii y_2) & = & (x_1x_2-y_1y_2)+\ii(x_1y_2+x_2y_1) \\
\conj{x_1+\ii y_1} & = & x_1-\ii y_1 \\
\alpha+\beta & = & \beta+\alpha \\
(\alpha+\beta)+\gamma & = & \alpha+(\beta+\gamma) \\
0+\alpha & = & \alpha \\
\alpha\beta & = & \beta\alpha \\
(\alpha\beta)\gamma & = & \alpha(\beta\gamma) \\
1\alpha & = & \alpha \\
0\alpha & = & 0 \\
\alpha(\beta+\gamma) & = & \alpha\beta+\alpha\gamma
\end{eqnarray}

The \emph{real} and \emph{imaginary} parts of a complex number are
defined to select out the two components of the vector.

\begin{eqnarray}
\Re(x+\ii y) & = & x \\
\Im(x+\ii y) & = & y \\
\Re(\alpha+\beta) & = & \Re(\alpha)+\Re(\beta) \\
\Im(\alpha)+\Im(\beta) & = & \Im(\alpha)+\Im(\beta) \\
\Re(\alpha\beta) & = & \Re(\alpha)\Re(\beta)-\Im(\alpha)\Im(\beta) \\
\Im(\alpha\beta) & = & \Re(\alpha)\Im(\beta)+\Im(\alpha)\Re(\beta) \\
\alpha & = & \Re(\alpha)+\ii\Im(\alpha)
\end{eqnarray}
\begin{equation}
\alpha=\beta\mbox{ if and only if }\Re(\alpha)=\Re(\beta)\mbox{ and
}\Im(\alpha)=\Im(\beta)
\end{equation}
$\Re(\alpha)$ and $\Im(\alpha)$ are called the {\em Cartesian\/}
coordinates of the complex number $\alpha$.

\paragraph{Polar form of complex numbers.}
The reason why complex numbers are particularly convenient for
analyzing vibration is that they may be manipulated according to the
{\em magnitude\/} (length of the vector) and {\em argument\/} (angle
of the vector with respect to $1$) as well. A magnitude is just a real
number $\geq 0$, representing the length of a vector. Angles are a bit
trickier.

\subparagraph{Rotational and directional angles.}
There are really two connected but different concepts that are both
called ``angles.''  First, there are {\em rotational angles\/} that
measure an amount of rotation. A rotational angle may be any real
number---positive numbers represent counterclockwise rotation, and
negative numbers represent clockwise rotation. A rotational angle of
$2\pi$ represents a full rotation counterclockwise. Even though the
direction that an object points after a full rotation is the same as
before the rotation, $2\pi$ represents a different rotation than $0$
or $-2\pi$ or $4\pi$---suppose for example that we are measuring
rotation of a wheel that winds up a spring.

The other sorts of angles are {\em directional angles\/} that measure
the direction that a vector is pointing with reference to some
conventional $0$ direction (for complex numbers, $0$ is the
directional angle of the vector represented by $1$). Directional
angles must be in the half-open interval $[0,2\pi)$. Many books and
articles prefer to describe directional angles in the interval
$(-\pi,\pi]$ (so, for example, the angle $3\pi/2$ in our notation
becomes $-\pi/2$). It makes no essential difference which interval is
used, since all arithmetic on directional angles is done on a circle
of circumference $2\pi$, rather than the usual real line. We may
convert rotational angles to directional angles with the function
$\bmod 2\pi$.
\begin{eqnarray}
x\bmod 2\pi & = & x-2\pi\lfloor x/(2\pi)\rfloor
\end{eqnarray}
\begin{equation}
0\leq x\bmod 2\pi<2\pi
\end{equation}
When $x\bmod z=y\bmod z$ we often write $x=y\pmod{z}$ instead. This
form suggests an alternate view of modular arithmetic: $x=y\pmod{z}$
means that $x$ and $y$ are two names for the same thing in the
$\pmod{z}$ universe, even though they may be different numbers in the
usual real number universe. If we apply a rotational angle $x$ to
rotate a vector from a starting position with directional angle $0$,
we get a new vector with directional angle $x\bmod 2\pi$. Notice that
all rotational angles $x+2k\pi$ for integers $k$ correspond to the
same directional angle. Given a directional angle $x$ resulting from a
rotation, there is no way to tell which of the infinitely many
possible rotational angles generated $x$. To avoid becoming confused
by the ambiguity in the word ``angle,'' visualize each angle as either
an amount of rotation or a static direction, instead of a pure
abstract real number.

The angle of a complex number is a directional angle, so it is
restricted to the interval $[0,2\pi)$.
\begin{eqnarray}
|x+\ii y| & = & \sqrt{x^2+y^2} \\
\label{equ:arg-arctan}\arg(x+\ii y) & = & \arctan(y/x)\bmod 2\pi \\
|\alpha| & \geq & 0 \\
\arg(\alpha) & \geq & 0 \\
\arg(\alpha) & < & 2\pi \\
\end{eqnarray}
\begin{equation}
\label{equ:equal-mag-angle}\alpha=\beta\mbox{ if and only if }
|\alpha|=|\beta|\mbox{ and }\arg(\alpha)\arg(\beta)
\end{equation}
$\arg(0)$ is undefined, since it makes no sense to take the angle of a
vector with magnitude $0$. But, we let $\arg(\ii y)=\pi/2$ for $y>0$
and $\arg(\ii y)=3\pi/2$ for $y<0$ in spite of the division by $0$ in
Equation~\ref{equ:arg-arctan}, since $\ii$ and $-\ii$ are clearly at a
right angles to the real axis (notice that
$\lim_{z\rightarrow\infty}\arctan(z)=\pi/2$,
$\lim_{z\rightarrow-\infty}\arctan(z)+2\pi=-\pi/2+2\pi=3\pi/2$).
Figure~\ref{fig:complex-mag-angle} shows the relation between
$\Re(\alpha)$, $\Im(\alpha)$, $|\alpha|$, and $\arg(\alpha)$ when
$\alpha$ is drawn as a vector in a two-dimensional space.
\begin{figure}
\centerline{\hbox{\epsfbox{complex-mag-angle.eps}}}
\caption{\label{fig:complex-mag-angle}
Complex number: Cartesian and polar coordinates}
\end{figure}
$|\alpha|$ and $\arg(\alpha)$ are called the {\em polar\/} coordinates
of the complex number $\alpha$. Addition of complex numbers is easiest
to do by manipulating the real and imaginary parts, but multiplication
and division may be defined very nicely on the magnitude and angle.

\begin{eqnarray}
\label{equ:mult-mag}|\alpha\beta| & = & |\alpha||\beta| \\
\label{equ:mult-add-angle}
\arg(\alpha\beta) & = & \arg(\alpha)+\arg(\beta)\bmod 2\pi \\
|\alpha/\beta| & = & |\alpha|/|\beta| \\
\arg(\alpha/\beta) & = & \arg(\alpha)-\arg(\beta)\bmod 2\pi \\
|-\alpha| & = & |\alpha| \\
\arg(-\alpha) & = & \arg(\alpha)+\pi\bmod 2\pi \\
|x\alpha|=|-x\alpha| & = & x|\alpha|\mbox{ for }x\geq 0 \\
\arg(x\alpha) & = & \arg(\alpha)\mbox{ for }x>0 \\
|\conj{\alpha}| & = & |\alpha| \\
\arg(\conj{\alpha}) & = & -\arg(\alpha)\bmod 2\pi \\
\alpha & = & |\alpha|(\cos(\arg(\alpha))+\ii\sin(\arg(\alpha)))
\end{eqnarray}
Using Equations~\ref{equ:mult-mag} and~\ref{equ:mult-add-angle}, we
see that a complex number $\alpha$ of magnitude $1$ acts as a rotator:
the multiplication $\alpha\beta$ rotates $\beta$ by the angle
$\arg(\alpha)$. In particular, multiplication by $\ii$ rotates a
vector counterclockwise by $\pi/2$ (right angle). So, letting the
single complex number $\rho=x+\ii y$ represent the rotor state
$\langle x,y\rangle$, we may express Equations~\ref{equ:dy-xA}
and~\ref{equ:dx-yA} as a single equation.
\begin{eqnarray}
\label{equ:rotordiff}\rho' & = & \ii A\rho
\end{eqnarray}
The elliptical rotor system of Equations~\ref{equ:dy-x}
and~\ref{equ:dx-y} may also be expressed as a single equation using
the conjugate operation.
\begin{eqnarray}
\rho' & = & \ii((A+B)\rho+(A-B)\conj{\rho})/2
\end{eqnarray}

\paragraph{Complex number to real power.}
Equation~\ref{equ:mult-add-angle} is also the key to understanding
exponentiation of complex numbers. Notice that the angle behaves {\em
logarithmically\/} with respect to addition and multiplication (think
of the analogous equation $\ln(xy)=\ln(x)+\ln(y)$). Since
exponentiation is essentially iterated multiplication, and complex
multiplication is additive on angles, complex exponentiation has a
multiplicative effect on angles. Consider first a complex number
$\alpha$ raised to the power of a real number $x$.
\begin{eqnarray}
|\alpha^x| & = & |\alpha|^x \\
\label{equ:comp-power-rotor}
\arg(\alpha^x) & = & x\arg(\alpha) \\
\alpha^{x+y} & = & \alpha^x\alpha^y \\
\alpha^{xy} & = & {(\alpha^x)}^y \\
{(\alpha\beta)}^x & = & \alpha^x\beta^x \\
\conj{\alpha}^x & = & \conj{\alpha^x} \\
\alpha^1 & = & \alpha \\
\alpha^0 & = & 1
\end{eqnarray}

Equation~\ref{equ:comp-power-rotor} begins to reveal the power of
complex numbers for analyzing vibration. Think of a rotor with
amplitude $R$, frequency $F$, and phase $0$ (starting value $R+\ii 0$
at time $t=0$). The state of the rotor at any time $t$ is
$\rho=R\ii^{4Ft}$ (we multiply $F$ by $4$ because $\arg(\ii)=\pi/2$ is
$1/4$ of $2\pi$ radians, which is a full rotation). For an elliptical
rotor with starting value $R_1$ that crosses the imaginary axis at
$\ii R_2$, the state at time $t$ is
$(R_1+R_2)\ii^{4Ft}/2+(R_1-R_2)\conj{\ii^{4Ft}}/2$. Equivalent
expressions include $(R_1+R_2)\ii^{4Ft}/2+(R_1-R_2)(-\ii)^{4Ft}/2$ and
$(R_1+R_2)\ii^{4Ft}/2+(R_1-R_2)\ii^{-4Ft}/2$. The last of these is the
most popular, and leads to the notion of a $-F$ frequency component in
an elliptical rotor.

\paragraph{Complex exponents.}
The most important fact about complex numbers for the study of
vibration is the rule for raising the real number $e=2.71828\ldots$ to
a complex power.
\begin{eqnarray}
\label{equ:euler}e^{\si y} & = & \cos(y)+\ii\sin(y) \\
\label{equ:e-to-comp}e^{x+\si y} & = & e^x(\cos(y)+\ii\sin(y)) \\
|e^{\alpha}| & = & e^{\Re(\alpha)} \\
\arg(e^{\alpha}) & = & \Im(\alpha)\bmod 2\pi \\
e^{\conj{\alpha}} & = & \conj{e^\alpha} \\
\alpha & = & |\alpha|e^{\si\arg(\alpha)} \\
\alpha & = & e^{\ln(|\alpha|)+\si\arg(\alpha)}\mbox{ for }\alpha\neq 0
\end{eqnarray}
Equation~\ref{equ:euler}, known as {\em Euler's formula\/} in
honor of the famous mathematician who discovered it, is the most
important single equation for the study of vibration. It allows us to
reason about trigonometric functions by using the relatively
easy-to-remember properties of exponentiation. Computer algebra
systems typically convert trigonometric formulae into exponential form
in order to simplify them more efficiently.

For our purposes, the derivation of Euler's formula is not as
important as the formula itself. To see why the formula is sensible,
consider the ordinary differential equation defining the exponential
function for real numbers:
\begin{eqnarray}
\label{equ:expdiff}x' & = & Ax
\end{eqnarray}
The most interesting solution to equation~\ref{equ:expdiff} is the one
with initial condition $x(0)=1$, and this leads to
\begin{eqnarray}
x(t) & = & e^{At}
\end{eqnarray}
That is, the (scaled) exponential function $e^{At}$ is characterized by
its inital value and the fact that its slope at each time $t$ is $A$
times its value at time $t$---the larger it gets, the faster it
grows. Notice that equation~\ref{equ:rotordiff} has the same form as
equation~\ref{equ:expdiff}, but it describes a complex-valued function,
and the multiplier is an imaginary number $\ii A$ rather than a real
number $A$. So, it is sensible to regard the natural solution to
equation~\ref{equ:rotordiff}, which is a rotor, as the function $e^{\ii At}$.

Euler's formula also gives us another way to represent each complex
number $\alpha$---instead of the usual form
$\Re(\alpha)+\ii\Im(\alpha)$ we may write
$|\alpha|e^{\si\arg(\alpha)}$. For $\alpha\neq 0$ we may also write
$e^{\ln(|\alpha|)+\si\arg(\alpha)}$. Unlike the additive Cartesian
form $x+\ii y$, the exponential polar $re^{\si w}$ form for a complex
number is not unique, since $re^{\si w}=re^{\si(w+2k\pi)}$ and
$0e^{\si w_1}=0e^{\si w_2}$.
\begin{equation}
r_1e^{\si w_1}=r_2e^{\si w_2}\mbox{ if and only if }
\begin{array}{c}
r_1=r_2=0 \\
\mbox{or} \\
r_1=r_2\mbox{ and }w_1=w_2\pmod{2\pi} \\
\mbox{or} \\
r_1=-r_2\mbox{ and }w_1=(w_2+\pi)\pmod{2\pi}
\end{array}
\end{equation}
In essence, exponentiation is a kind of conversion between Cartesian
and polar coordinates: the polar coordinates of $e^\alpha$ are
$|e^{\alpha}|=e^{\Re(\alpha)}$ and $\arg(e^{\alpha})=(\Im(\alpha)\bmod
2\pi)$. So, the Cartesian coordinates of $\alpha$ turn into the polar
coordinates of $e^{\alpha}$. Notice that $\Im(\alpha)$ is naturally
understood as a rotational angle, while $\arg(e^{\alpha})$ is a
directional angle.

Euler's formula (Equation~\ref{equ:euler}) allows an even nicer way to
analyze a rotor with amplitude $R$, frequency $F$, and phase $0$
(starting value $R+\ii 0=Re^0$ at time $t=0$): the state at any time
$t$ is just $\rho=Re^{\si 2\pi Ft}$ (such exponential expressions are
sometimes called {\em phasors\/} in the engineering literature). For
the elliptical rotor starting at $R_1$ and crossing the imaginary axis
at $\ii R_2$, the state at time $t$ is $((R_1+R_2)e^{\si 2\pi
Ft}+(R_1-R_2)\conj{e^{\si 2\pi Ft}})/2$, or $((R_1+R_2)e^{\si 2\pi
Ft}+(R_1-R_2)e^{-\si 2\pi Ft})/2$. And, it is particularly easy to
construct a complex number with magnitude $1$ to rotate other numbers
by a given angle $w$: use $e^{\si w}$. Look back at
Figures~\ref{fig:rotor-1} and~\ref{fig:helix} again, and interpret
them in terms of complex numbers.

Now, the way to understand exponentiation $\beta^{\alpha}$ with an
arbitrary complex base $\beta$ is to first write
$\beta=|\beta|e^{\si\arg(\beta)}$, and then use the rules for
exponentiation with base $e$.
\begin{eqnarray}
\label{equ:comp-to-comp}(re^{\si w})^{x+\si y} & = &
r^ve^{-wy}e^{\si(wx+y\ln(r))} \\
|\beta^{\alpha}| & = & |\beta|^{\Re(\alpha)}e^{-\arg(\beta)\Im(\alpha)} \\
\arg(\beta^{\alpha}) & = & \arg(\beta)\Re(\alpha)+\ln(|\beta|)\Im(\alpha)
\end{eqnarray}
These equations are rather complicated, and fortunately we will not be
using them much. Work them through for exercise with complex numbers,
and convince yourself that they follow from the earlier rules. Notice
how exponentiation mixes together the Cartesian coordinates of the
exponent with the polar coordinates of the base.

\paragraph{Complex logarithms.}
Euler's formula makes it easy to define the natural (base $e$) logarithm of a
complex number.
\begin{eqnarray}
\ln(\alpha) & = & \ln\abs{\alpha}+\ii\arg(\alpha)\mbox{ for }\alpha\neq0 \\
\label{equ:exp-ln}e^{\ln(\alpha)} & = & \alpha\mbox{ for }\alpha\neq0 \\
\Re(\ln(\alpha)) & = & \ln(\abs{\alpha}) \\
\Im(\ln(\alpha)) & = & \arg(\alpha)
\end{eqnarray}
$\ln(0)$ is undefined. Notice that, for positive real numbers $x>0$
there is a unique real number $y$ such that $e^y=x$. But, even for
positive real numbers $x$ there are infinitely many complex numbers
$\beta$ such that $e^{\beta}=x$. That is, while
Equation~\ref{equ:exp-ln} defines the natural logarithm uniquely as a
real value, it has infinitely many complex solutions, since
$e^{\alpha}=e^{\alpha+2k\pi}$ for all integers $k$. The particular
choice above for the imaginary part of $\ln(\alpha)$ is arbitrary,
just as the particular interval $[0,2\pi)$ for directional angles is
arbitrary. Notice that this choice restricts all complex logarithms
$\ln(\alpha)$ to the horizontal stripe in the complex plane where
$0\leq\Im(\ln(\alpha))<2\pi$.
\begin{eqnarray}
\log_{\beta}(\alpha) & = &
  \ln(\alpha)/\ln(\beta)\mbox{ for }\alpha,\beta\neq0
\end{eqnarray}

\subsection{Sound Signals in the Time Domain}

In general, the sounds that we would like to create and analyze are
much more complicated than the sounds produced by simple rotors. But,
we will continue to model sounds by complex-valued functions $\sigma$
depending on a real number parameter $t$ standing for time. Such
functions are called {\em sound signals in the time domain.}  Later,
in Chapters~\ref{cha:frequency-spectrum}
and~\ref{cha:time-varying-spectrum}, we will see other mathematical
representations of sound, but signals in the time domain are the
easiest models to relate intuitively to the physical signals that
enter the ear. Widely used digital input and output devices for sound
are also most easily understood in terms of signals in the time
domain. Most books and papers on sound consider real-valued time
signals, and most electronic devices, both digital and analog, for
analyzing or creating sound deal only with real values. Many analysis
and synthesis techniques, however, are best understood in terms of a
complex signal $\sigma$. We may always project the complex signal
$\sigma$ to a real signal by taking $\Re(\sigma)$. Just as many
systems for manipulating graphic images deal with three dimensional
models, and project them to two dimensions at the last stage before
displaying them on video screens, we will think of sound signals as
two dimensional, and project to one dimension at the last stage before
rendering them through loudspeakers.

Given a sound signal $\sigma$ in the time domain, and a particular
time $t$, the {\em instantaneous amplitude\/} of $\sigma$ at time $t$
is $\abs{\sigma(t)}$, the {\em instantaneous phase\/} is
$\arg(\sigma(t))$, and the {\em instantaneous frequency\/} is
$(d\arg(\sigma)/dt)(t)$. These are interesting quantities to discuss,
and may be useful in analyzing sound, but they do not necessarily have
the perceptual impact of the corresponding constant quantities
associated with a simple rotor.

Not every complex-valued function $\sigma$ of $t$ makes sense as a
sound signal in the time domain. Some reasonable relation must hold
between the real and complex components of $\sigma$. But, it is not
clear precisely what relation to require in general. Particular
physical interpretations of $\sigma$ impose certain constraints---for
example if $\Re(\sigma)$ is the velocity of a physical object, and
$\Im(\sigma)$ is the displacement of the same object, then
$\Re(\sigma)=d\Im(\sigma)/dt$. When $\Im(\sigma)=R\sin(Ft)$,
$d\Im(\sigma)/dt=FR\cos(Ft)$, so this derivative constraint forces
rotors to be elliptical, with aspect ratio proportional to the
frequency. Circular rotors are much more convenient mathematically.
Roughly speaking, we would like to restrict sound signals $\sigma$ so
that $\Im(\sigma)$ is essentially the same as $\Re(\sigma)$ with a
phase difference of $\pi$ ($90^{\circ}$)---such signals are said to be
{\em in quadrature,} since the angle $\pi$ is one quarter of the full
circle. The problem is that many different frequencies may be present
in $\sigma$. In Chapter~\ref{cha:frequency-spectrum} we see a precise
definition of this quadrature constraint.

Figures \ref{fig:decay-helix}, \ref{fig:fm-helix}, and
\ref{fig:helix-sum} show examples of sound signals in the time domain
that are slightly more complicated than the basic helix. In each case,
part (a) shows a three-dimensional plot of the complex-valued
function, and part (b) shows a two-dimensional plot of the real and
imaginary components.
\begin{figure}
\centerline{\hbox{\epsfbox{decay-helix.eps}}}
\caption{\label{fig:decay-helix}
The signal $e^{-t/8+\si 2\pi t/4}$: decaying helix}
\end{figure}
\begin{figure}
\centerline{\hbox{\epsfbox{fm-helix.eps}}}
\caption{\label{fig:fm-helix}
The signal $e^{\si 2^{t/8}2\pi t/4}$: increasing frequency}
\end{figure}
\begin{figure}
\centerline{\hbox{\epsfbox{helix-sum.eps}}}
\caption{\label{fig:helix-sum}
The signal
$e^{\si 2\pi t/4}+e^{\si 4\pi t/4}/2+e^{\si 8\pi t/4}/4+e^{\si 16\pi
t/4}/8$: sum of 4 helixes}
\end{figure}

\section{Exercises}
\begin{enumerate}
\item Take the spring system of
  Figure~\ref{fig:ideal-spring-resonator}, rotate the track to a
  vertical orientation, and let a constant gravitational force act on
  the mass. The stable position about which the mass oscillates is no
  longer at the point where the spring attaches to the track, but some
  distance below that point where the force exerted by the spring
  exactly cancels gravity. Does the frequency of the vibrating spring
  increase or decrease as a result of the influence of gravity? Explain
  briefly.
\item Consider a vibrating spring system in which the motion of the
  mass is opposed by a certain amount of friction. In order to analyze
  such a system, do we change Equation~\ref{equ:dy-v},
  Equation~\ref{equ:dv-y}, both, or neither? Explain briefly.
\item Consider a vibrating system with a mass that is attracted to the
  center of vibration by a gravitational force instead of a spring. In
  such a system, the period of vibration depends on the amplitude. As
  the amplitude of the vibration increases, does the period increase or
  decrease? Explain briefly.
\item Notice how Equation~\ref{equ:dy-xA} has a positive multiplier
  $A$, while Equation~\ref{equ:dx-yA} has a negative multiplier $-A$.
  There are three other possibilities: (a) both multipliers negative,
  (b) both multipliers positive, (c) the first multiplier negative and
  the second positive. Describe briefly and qualitatively the behavior
  of a system described by each of the variants (a--c). Draw pictures
  analogous to Figure~\ref{fig:rotor-1}(b) to help explain.
\item\label{exe:bigger-axis} When $A>B$ in Equations~\ref{equ:dy-x}
  and~\ref{equ:dx-y}, the path of the state vector $\langle x,y\rangle$
  is an ellipse. Which axis of the ellipse is longer, the $x$ axis or
  the $y$ axis? Explain briefly, using precise mathematical information
  derived from Equations~\ref{equ:dy-x} and~\ref{equ:dx-y}. Hint: Derive
  slightly different equations relating $C_yy'$ to $C_xx$ and $C_xx'$ to
  $C_yy$ for cleverly chosen constant multipliers $C_x$ and $C_y$.
\item Derive simple formulae representing the frequency and period of the
  vibrating system of Equations~\ref{equ:dy-x} and~\ref{equ:dx-y} in
  terms of the constants $A$ and $B$. Hint: Look at
  Equations~\ref{equ:x-cos} and~\ref{equ:y-sin}. Differentiate both
  sides of both equations. Solve the special case where $A=B$. Then,
  apply the scaling of $x$ and $y$ by constants $C_x$ and $C_y$ that you
  used in Exercise~\ref{exe:bigger-axis}.
\item In an elliptical rotor system obeying Equations~\ref{equ:dy-x}
  and \ref{equ:dx-y} the speed with which the state point travels
  around the ellipse is not constant.
  \begin{enumerate}
  \item Where is this speed the least, and where is it greatest?
    Explain briefly.
  \item Answer the same question for the {\em angular\/} speed of the
    state vector---the speed at which its angle with the $x$ axis
    changes.
  \end{enumerate}
\item For each of the following operations on complex numbers $\alpha$
  and $\beta$, state whether it is more convenient to represent each
  number in Cartesian or polar coordinates, or whether both are
  equally convenient. Sometimes the answer is different for $\beta$
  than for $\alpha$.
  \begin{enumerate}
  \item $\alpha+\beta$
  \item $\alpha-\beta$
  \item $\conj{\alpha}$
  \item $\alpha\beta$
  \item $\alpha/\beta$
  \item $\beta^{\alpha}$
  \item $\log_{\beta}(\alpha)$
  \end{enumerate}
\item Derive formulae for the Cartesian coordinates
  $\Re(\beta^\alpha)$ and $\Im(\beta^\alpha)$ of $\beta^\alpha$ in terms
  of Cartesian and/or polar coordinates of $\alpha$ and $\beta$.
  \item Derive the following trigonometric identities, using Euler's
  formula (Equation~\ref{equ:euler}) and easy algebraic
  manipulations of additions, subtractions, multiplications, and
  divisions of complex numbers. Note that $\cos^2(x)$ and $\sin^2(x)$
  are conventional ways of writing $(\cos(x))^2$ and $(\sin(x))^2$,
  respectively.
  {\renewcommand{\arraystretch}{1.5}
  $$\begin{array}{lrcl}
  \mbox{(a)} & \cos(2x) & = & \cos^2(x)-\sin^2(x) \\
  \mbox{(b)} & \sin(2x) & = & 2\cos(x)\sin(x) \\
  \mbox{(c)} & \cos^2(x) & = & (1+\cos(2x))/2 \\
  \mbox{(d)} & \sin^2(x) & = & (1-\cos(2x))/2 \\
  \mbox{(e)} & \cos(x)+\cos(y) & = & 2\cos((x+y)/2)\cos((x-y)/2) \\
  \mbox{(f)} & \sin(x)+\sin(y) & = & 2\sin((x+y)/2)\cos((x-y)/2) \\
  \mbox{(g)} & \cos(x)\cos(y) & = & (\cos(x-y)+\cos(x+y))/2 \\
  \mbox{(h)} & \sin(x)\sin(y) & = & (\cos(x-y)-\cos(x+y))/2 \\
  \mbox{(i)} & \sin(x)\cos(y) & = & (\sin(x-y)+\sin(x+y))/2 \\
  \mbox{(j)} & \cos(x+y) & = & \cos(x)\cos(y)-\sin(x)\sin(y) \\
  \mbox{(k)} & \sin(x+y) & = & \sin(x)\cos(y)+\cos(x)\sin(y)
  \end{array}$$}
\item We saw how to express the state of an elliptical rotor at time
  $t$ in the form $ae^{\si 2\pi ft}+b\conj{e^{\si 2\pi ft}}$, where $a$
  is the average of the real and imaginary intercepts of the ellipse,
  and $b$ is half their difference. Derive a nice formula for the state
  of an elliptical rotor whose major and minor axes are different from
  the real and imaginary axes.
\end{enumerate}

\chapter{Perceptual Foundations of Sound}
\label{cha:perceptual-foundations}

{\bf This chapter is a particularly rough draft, with a lot of missing
information.}

{\tiny In every section: limits of human sound perception.}

This chapter sketches the structure of human sound perception in a
deliberately simplistic and superficial way. I believe that the best
digital models for sound production will be informed by audiology, but
they will sacrifice a lot of perceptual precision for mathematical
simplicity. By rough analogy, there are lots of qualities of human
visual perception that are ignored by the pixel model of
graphics. Also, general-purpose models for sound production must
work for almost all listeners, so they cannot be designed around
details of perception that vary from person to person. Later chapters
will investigate more precisely the mathematical qualities of sound
that affect perception, but we will stick with a very approximate and
intuitive notion of perception itself.

We are interested in sound as a medium that may be used for
communication. Particular forms of audible communication, such as
music and speech, may be highly specialized to their purposes and to
the acoustic resources available to them for generating sound. There
must be some very general structural qualities of sound that are
present in essentially all uses of sound for communication. Each
particular form of audible communication may exploit these general
structural qualities in very different ways.

By rough analogy to visual communication, notice that almost all
visual scenes may be described in terms of structural concepts such as
region, edge, texture, color, brightness. Written communication in
English exploits the shapes of regions with contrasting brightness,
and the edges of those regions, to provide recognizable alphabetic
characters of the Roman alphabet. Architectural drawings exploit edges
in a radically different way. Perspective pictures draw on texture and
color in yet other ways to communicate layouts of physical objects. In
this chapter we seek an intuitive understanding of structural
qualities of sound roughly at the level of region, edge, texture,
color, brightness in video.

\section{The Ear as a Frequency Analyzer}

The key receptive structure in the ear is the {\em cochlea,} a
spiral-shaped tube containing lots of little hairs that vibrate with the
surrounding fluid. {\tiny Is it air or some body liquid?} From our point
of view, each hair is a physical realization of a rotor. Somehow (the
{\em how\/} is still the topic of some debate) each hair is tuned to a
narrow range of frequencies, and stimulates an assigned nerve ending
proportionally to the amount of excitation it receives within its
frequency range. So, the human ear is roughly a frequency analyzer,
passing on a spectral presentation of sound at each instant to the brain
for further analysis.

\section{Sound Imaging---What is ``a Sound''?}

I call a complex of sound that is presented to a listener an ``audible
scene.'' Many audible scenes deompose naturally into the sum of
several components that are perceived as units, vaguely analogous to
contiguous regions in a visual scene. The decomposition is often
ambiguous, and sometimes there is no sensible decomposition, but the
notion of a perceived contiguous piece of sound is likely to be useful
whenever it applies. I call such an intuitive unit in an audible scene
``a sound.'' In well-articulated musical pieces, a single note by a
single instrument is a sound. In speech, the notion is more ambiguous,
but perhaps a phoneme or segment of a phoneme may be understood as a
sound.

Automated analysis of audible scenes into individual sounds is
extremely difficult, because it must resolve all of the ambiguities
that arise. Synthesis by adding up individual sounds to create audible
scenes is much more tractable, since the instructions for synthesizing
a given scene can specify an interpretation explicitly. A synthesis
method based on adding individual sounds together might be very useful
even if it doesn't guarantee that every object described as ``a
sound'' by the system is perceived as a single sound---as long as
there is a good heuristic correlation between description and
perception the method can succeed.

The precise way in which the ear and brain decompose an audible scene
into individual sounds is not understood. The spatial location of
sound sources, as detected by the stereo effects of pairs of ears and
by the asymmetric distortion induced by the funny shapes of our heads
and external ears, certainly plays an important part. We will ignore
spatial location, not because it isn't important, but because {\em for
the purposes of synthesis,} it can probably be separated from monaural
qualities. To synthesize an audible scene, we may describe sounds,
then describe where each sound is placed, and these two parts of our
description may be essentially independent. For analysis, they are
probably tangled together inextricably.

Ignoring location, the qualities that make a particular complex
vibration sensible to regard as an individual sound probably have to
do with the frequency components of that vibration. We prefer to group
frequency components together perceptually when their beginnings, and
to a lesser extent their endings, are nearly simultaneous. Also, we
prefer to group frequencies that are very close to being integer
multiples of some audible frequency, which may or may not be present
itself---stated another way we prefer to associate frequencies whose
ratios are very close to rational numbers with small integer
numerators and denominators. These qualitative observations are {\em
very\/} far from providing a useful basis for analysis, but they may
serve as heuristic guides in considering synthesis techniques.

\section{Perceptual Parameters of a Sound}

\subsection{Pitch}

Pitch is the quality of a sound that leads us to consider it ``higher''
or ``lower'' than another sound. Some sounds, such as engine noises and
drum beats, yield only a vague sense of high or low pitch. Other sounds,
such as notes of bird songs and of melodic musical instruments, yield a
fairly precise sense of pitch that can be measured numerically, with most
listeners agreeing that the measurement is correct.

At first approximation, the pitch of a sound is its frequency. The
human ear detects frequencies from about 20 Hertz (cycles per second)
to about 20,000 Hertz. Perceived pitch is essentially the {\em
logarithm\/} of frequency. Multiplying a frequency is perceived as
adding to the pitch. For example, on a piano keyboard, the interval
(difference in pitch) called an {\em octave\/} is heard as the result
of moving up 12 {\em half steps\/} (the interval from one key to the
next higher---usually one is white and the other black), but it is
essentially a multiplication of the frequency by 2. The interval
called a {\em perfect fifth,} heard as moving up 7 half steps,
multiplies frequency by approximately $3/2$.

{\tiny We need to know the pitch resolution of the human ear.}

But, it's not that simple. Perception of pitch is affected by loudness
(loud sounds tend to sound higher in pitch than soft sounds of the
same frequency), and there may be many other small but significant
influences on perceived pitch. My hunch is that {\em most\/} of these
should not affect the structure of a general-purpose model of sound,
but rather should be viewed as fine points to be applied outside of
the model, when polishing a sound definition to its final form, only
when great precision is truly required. For most purposes, lots of
perceptual subtleties are best ignored.

One major complication in pitch perception probably {\em will\/}
affect the structure of good digital models of sound. Although pitch
is {\em essentially\/} the logarithm of frequency, perception of pitch
is tied more closely to the {\em relation\/} between a number of
component frequencies in a sound, rather than to the frequency of one
particular component. Specifically, when a sound is nearly {\em
harmonic\/}---when most of the frequency components of a sound are
nearly integer multiples of another audible frequency $F$, called the
{\em fundamental\/} pitch of that sound---we tend to hear a pitch given
by $\ln F$. {\em The frequency $F$ itself need not be present!} This
seems spooky at first, but it is probably a very sensible adaptation
of aural perception to the fact that some components of a sound may be
masked by noise. Perception of the ``missing fundamental'' is roughly
analogous to the visual perception of an entire object, even though
parts of it are hidden behind other objects.

The perception of pitch intervals is also a bit more complicated than
merely subtracting one pitch from another. When we perceive the pitch
interval between two nearly harmonic sounds $s_1$ with fundamental
frequency $F_1$ and $s_2$ with fundamental $F_2$, we seem to overlay
their component frequencies. If the component of $s_1$ with {\em
approximate\/} frequency $MF_1$ is close enough to the component of
$s_2$ with {\em approximate\/} frequency $NF_2$ ($M$ and $N$ are
integers), this influences us toward perceiving an interval determined
by $\ln N/M=\ln N-\ln M$, rather than $\ln F_1/F_2=\ln F_1-\ln F_2$.
Each pair of components that overlays closely enough influences the
perceived interval, and it is hard to characterize the way in which
these influences add up. But, there are plenty of sounds that are
nearly enough harmonic to have a musical effect, but far enough from
perfect integer ratios to confuse the perception of intervals. The
piano and the bagpipes are examples of instruments with substantial
deviations from harmonic sound, and the comparison of pitch between
them and nearly perfect harmonic sounds, such as the sounds of most
orchestral instruments, is quite tricky.

The precision of pitch perception is roughly constant within audible
frequency limits. Since pitch is the logarithm of frequency, this means
that frequency precision is much better for lower frequencies and poorer
for higher frequencies. Section~\ref{sec:audible-time} discusses the
perception of time for sound, which has a variation of precision inverse
to the variation of frequency precision.

\subsection{Loudness}

Loudness of a simple helical signal is roughly the logarithm of its
power (the rate at which it delivers energy). Notice that when two
helical signals have the same amplitude, the one with higher frequency
also has higher power, because it moves faster. The exact power in a
signal depends on the precise physical interpretation of the signal, but
in general the power in a helical signal $Re^{\ii 2\pi Ft}$ is
proportional to some polynomial in $R$ and $F$, and at least as big as
$RF$, so perceived loudness is roughly proportional to
$\ln(RF)=\ln(R)+\ln(F)$. But, perceived loudness varies according to the
sensitivity of the ear at the given frequency, so signals at frequencies
near the limits of audible frequencies seem softer than signals of equal
power near the center.

{\tiny We need information on the units used to measure loudness and the
limits of normal human perception.}

When a number of frequencies are present in a sound, it seems sensible
that the perceived loudness will be roughly proportional to the
logarithm of the sum of all power within audible frequency limits. This
seems sensible, but it's wrong. The perception of loudness in complex
sounds is influenced by the {\em critical bands\/} of human sound
perception---frequency bands containing a spread of frequencies roughly
spanning a musical minor third, so the highest frequency in a band is
roughly $6/5$ times the lowest. These bands are not discrete, rather
they overlap continuously across the range of audible frequencies,
varying slightly in width depending on the center frequency.

Two helical signals within a critical band tend to add their powers,
so that the perceived loudness within a critical band is close to the
logarithm of the total power in the band. But, two helical signals
whose frequencies differ by more than a critical band tend to add
their perceived loudness {\em after\/} the individual loudnesses are
taken as the logarithm of power. Since $\ln(x)+\ln(y)>\ln(x+y)$, the
same amount of sonic power sounds louder when spread over a larger
frequency range. The precise computation of loudness from power
spectrum is quite complicated because of the overlapping of critical
bands.

I doubt that the critical band concept will have an impact on the lowest
levels of sound modelling, but it clearly has a profound effect on
perception, and therefore on the construction of highly polished sounds.

\subsection{Timbre}

Two sounds of the same pitch and loudness may have recognizably
different qualities: for instance the sounds of string instruments vs.\
reed instruments in the orchestra. These distinguishing qualities of
sound are called {\em timbre,} and are sometimes compared to visible
color. Compared to pitch and loudness, timbre is not at all well
defined. It clearly has a lot to do with the relative strengths of
different frequency components of a sound, called the {\em partials.}
But, it is also affected seriously by some aspects of the time
development of partials---particularly but not exclusively by the
increase in amplitude of partials at the beginning of a sound, called
the {\em attack\/} in music. Different partials of a musical sound
typically increase at very different rates, and these differences are
crucial to the identification of a sound with a particular instrument.
For example, the sounds of brass instruments are recognized partly by
the quicker development of lower frequencies than higher frequencies.

At first approximation it seems that two sounds of different pitch will
have the same perceived timbre when the spectral content of one looks
just like the other, but shifted in frequency. For example a sound with
a component of amplitude 1 at 100 Hertz, amplitude 0.5 at 200 Hertz, and
amplitude 0.25 at 300 Hertz might be expected to be qualitatively
similar to one with amplitude 1 at 250 Hertz, 0.5 at 500 Hertz, and 0.25
at 750 Hertz. In this sort of case, the second sound might be produced
by recording the first one on tape, then playing it back with the tape
moving faster. The famous singing chipmunks demonstrate the fallacy in
this expectation---they do not sound at all like their creator singing
higher.

A more accurate notion of timbre must take into account the fact that
sound perception has adapted to the way that many sound producers,
including the human voice and most musical instruments, create their
sounds by a two-stage process. First, there is some sort of vibrating
structure, such as the vocal chord, violin string, oboe reeds, which may
follow the shifted partials model fairly well. But, the sound coming
from this first vibrating structure filters through another resonating
structure, such as the human head, the body of the violin, the body of
the oboe, which scales the amplitudes of partials according to its
responsiveness at different frequencies. The responiveness of the second
structure does {\em not\/} take a frequency shift when the incoming
pitch changes, so it changes the relative strengths of partials
depending on their absolute frequencies, and not just their ratios to
the given pitch. This filtering structure is sometimes called a {\em
formant filter,} because it may often be characterized by a small number
of highly resonant frequency bands, called {\em formants.} Human sound
perception seems to have adapted to recognizing the constancy of formant
filters when they are stimulated by a variety of incoming sounds at
different pitches. This is vaguely analogous to the tendency of human
visual perception to perceive the reflective properties of a given
pigment as its {\em color,} even under radically different illuminations
that may change the actual spectrum reaching the eye quite severely.

\subsection{Transient Effects}

\subsection{Sound Events}
\label{sec:audible-time}

Although abstract physics recognizes {\em time\/} as a single
one-dimensional continuum (at least for any single observer), different
intervals of time may be perceived as if they are in completely
different dimensions, depending on the lengths of the intervals and the
sorts of perceptible changes that occur during them. For example, in
visual perception, changes in electromagnetic flux on a scale of
millionths of a second are not perceived as time at all, but rather
determine the frequency of light, and thereby contribute to the
perception of color. Changes on a scale of tenths of a second or longer
are generally perceived as temporal events involving changes in visual
qualities, including color. The huge gap between the electromagnetic
time scale and the event-sequence time scale make it easy to classify
particular changes unambiguously into one class or the other.

Sound perception seems to have at least three time scales that are
perceived quite differently, and they all overlap to make things more
complicated.
\begin{enumerate}
  \item Changes in air pressure on a scale of about $1/20,000$th of a second
    to $1/20$th of a second are on the {\em sonic\/} time scale. They
    are not perceived as time developments at all, but determine the
    frequency of components of a sound, and thereby contribute to pitch,
    timbre, and loudness perception. Sonic time for sound is analogous
    to the electromagnetic time scale for visual perception.
  \item Slightly slower {\tiny what range?} changes in the amplitudes of
    various frequency components are on the {\em transitional\/} time
    scale. They are perceived as time developments, such as the attack
    initiating a musical note, but the exact time sequence is hard or
    perhaps impossible to trace perceptually. The special quality of the
    transitional time scale is demonstrated by playing sounds backwards.
    While a sequence of events played backwards may be recognized
    accurately, even if it is physically ridiculous (for example, a
    reversed movie of someone walking), the time-reversal of sound
    transitions makes us perceive them completely differently. I suspect
    that a person who heard a time-reversed sound for the first time,
    with no clue such as a view of the record being spun backwards,
    might not even recognize it as the time reversal of something. Even
    knowing that a sound is time reversed, it is difficult to tell
    intuitively what the forward version sounds like. I am not aware of
    any visual phenomenon analogous to the transitional time scale for
    sound.
  \item Changes in the frequency components of sound on a scale of
    {\tiny more accurate lower bound?} perhaps $1/30$th of a second and
    longer are often perceived as sequences of events. The {\em
    event-sequence\/} time scale for sound is analogous to the one for
    visual perception, and they operate in a similar range. For example,
    the sequence of notes in a scale, or the sequence of clicks in a
    rhythmic form, are perceived on the event-sequence time scale. The
    reversal of a sequence of events may be musically or physically
    peculiar, but it is relatively easy to recognize.
\end{enumerate}

Even the sonic and event-sequence time scales overlap for sound, with
the transitional scale in between and overlapping both. This makes the
understanding of time developments in sound quite subtle in some cases.
In particular, the boundaries are sensitive to frequency. For
low-frequency components of sound, the boundaries of the scales move
toward longer time intervals, and for high-frequency components they
move toward shorter intervals. It takes about one full period (rotation)
of a helix to recognize the frequency. So, changes in a helical
component of sound can only be detected when they are not too short
compared to the time of a complete period.

The inverse relation between frequency precision, which is best for low
frequencies, and time precision, which is best for high frequencies, is
striking. It is not an accident, but comes from fundamental physical
limitations, which limit the {\em product\/} of time and frequency
precision, so that when one improves, the other gets proportionately
worse. The same mathematical form produces the Heisenberg uncertainty
principle in quantum mechanics.

\chapter{Digital Sampled Sound}
\label{cha:digital-sound}

In Chapter~\ref{cha:phys-math-foundations} we modelled sound as a
function from a real value $t$ representing time to a complex value
$\sigma$ representing a two-dimensional state of some vibrating
system. In order to deal with sound digitally, we must somehow reduce
such a signal to a finite number of symbols from a discrete set of
possibilities, such as the bits $0$ and $1$ that current digital
computers use to represent all information. The digitization of sound
is normally achieved by two independent steps, each of which has
consequences for the fidelity with which sound is produced digitally.

\section{Discrete time}

\paragraph{Sound signals in the discrete time domain.}
The usual first logical step in digitizing sound is to approximate the
continuous domain of real values representing time by a discrete set
of equally spaced values. Let $S$ be a positive real number, called
the {\em sampling rate\/} (the number of samples to take in a unit of
time). ${\cal N}$ represents the set of all positive and negative
integers. ${\cal T}_S$ represents the domain of {\em discrete time\/}
with sampling rate $S$.
\begin{eqnarray}
{\cal N} & = & \{\ldots,-2,-1,0,1,2,3,\ldots\} \\
{\cal T}_S & = & \{k/S\setsep k\in{\cal N}\}
\end{eqnarray}
Every finitely represented sound spans some finite interval
$\{t_{\mathord{\mbox{\it min}}},\ldots,t_{\mathord{\mbox{\it max}}}\}$
rather than the infinite domain ${\cal T}_S$, but we may only listen
to a finite time-span of sound in a lifetime anyway, so the {\em
discretization\/} of time is much more important than the limitation
to a finite interval. A {\em sound signal in the discrete time
domain\/} with sampling rate $S$ is a complex-valued function
$\sigma$ on ${\cal T}_S$. When discussing the domain ${\cal T}_S$, we
write the members of the domain as
$\ldots,t_{-2},t_{-1},t_0,t_1,t_2,t_3,\ldots$, where
\begin{eqnarray}
t_k & = & k/S
\end{eqnarray}

\paragraph{Converting from continuous to discrete.}
Given a continuous sound signal $\sigma$, the obvious and natural
choice for a discrete sound signal to represent it is $\discr_S(\sigma)$
where
\begin{eqnarray}
\discr_S(\sigma)(t) & = & \sigma(t)\mbox{ for }t\in{\cal T}_S
\end{eqnarray}
$\discr_S(\sigma)$ is just $\sigma$ restricted to the domain ${\cal
T}_S$. Such a representation is inherently ambiguous---there are an
infinite number of different continous sound signals represented by
the same discrete sound signal. The confusion resulting from this
ambiguity is called {\em aliasing.}

\paragraph{Aliasing.}
{\bf ---Note: serious problems with functional notation---} Whenever
two continuous sound signals $\sigma_1$ and $\sigma_2$ agree on every
point in ${\cal T}_S$ ($\sigma_1(t)=\sigma_2(t)$ for all $t\in{\cal
T}_S$), then they have the same representation
$\sigma_d=\discr_S(\sigma_1)=\discr_S(\sigma_2)$ as a discrete sound
signal, and we say that $\sigma_1$ and $\sigma_2$ are {\em aliases.}
The famous problem of wagon wheels appearing to roll backwards in old
movies is an example of aliasing in a sampled video signal. But, when
we generate a real physical sound from $\sigma_d$, a listener can only
hear one of the infinitely many continous sound signals that it might
represent. This is undesirable in the case where we discretize one
continous sound, and the listener hears a different one. In
particular, two continuous helical signals are aliases if and only if
their amplitudes and phases are exactly the same, and their
frequencies are the same $\pmod{S}$.
\begin{equation}
\begin{array}{c}
\mbox{For }R_1,R_2>0\mbox{, }P_1,P_2\in[0,2\pi)\mbox{:} \\[2.5ex]
\discr_S(R_1e^{\si(P_1+2\pi F_1t)})=\discr_S(R_2e^{\si(P_2+2\pi F_2t)})
\mbox{ for all }t\in{\cal T}_S \\[2ex]
\mbox{ if and only if } \\[2ex]
R_1=R_2\mbox{ and }P_1=P_2\mbox{ and }F_1=F_2\pmod{S}
\end{array}
\end{equation}
For real-valued signals, there is even more aliasing. Frequencies
$F_1$ and $F_2$ may be aliased when $F_1=-F_2\pmod{S}$ and the
signals are out of phase by exactly $\pi$ (half a period).
\begin{equation}
\begin{array}{c}
\mbox{For }R_1,R_2>0\mbox{, }P_1,P_2\in[0,2\pi)
  \mbox{, }F_1,F_2\neq 0\pmod{S/2}\mbox{:} \\[2.5ex]
\discr_S(R_1\sin(P_1+2\pi F_1t))=\discr_S(R_2\sin(P_2+2\pi F_2t))
\mbox{ for all }t\in{\cal T}_S \\[2ex]
\mbox{ if and only if } \\[2ex]
R_1=R_2\mbox{ and }P_1=P_2\mbox{ and }F_1=F_2\pmod{S} \\
\mbox{or} \\
R_1=R_2\mbox{ and }P_1=P_2+\pi\pmod{2\pi}\mbox{ and }F_1=-F_2\pmod{S}
\end{array}
\end{equation}
For frequencies that are exact multiples of half the sampling rate,
there is even confusion about the amplitude and phase. In the case of
odd multiples of half the sampling rate, the samples are all equal in
magnitude, and alternating in sign. The amplitude of the samples
depends on the phase at which the samples are taken, which is the same
for each half wave.
\begin{equation}
\begin{array}{c}
\mbox{For }R_1,R_2>0\mbox{, }P_1,P_2\in[0,2\pi)
  \mbox{, }F_1=F_2=S/2\pmod{S}\mbox{:} \\[2ex]
\discr_S(R_1\sin(P_1+2\pi F_1t))=\discr_S(R_2\sin(P_2+2\pi F_2t))
\mbox{ for all }t\in{\cal T}_S \\[2ex]
\mbox{ if and only if } \\[2ex]
R_1\sin(P_1)=R_2\sin(P_2)
\end{array}
\end{equation}
In the special case where $P_1=P_2=0$, the amplitudes could have any
values. For multiples of the sampling rate, all samples have the same
value, and again there is a tradeoff between amplitude and phase.
\begin{equation}
\begin{array}{c}
\mbox{For }R_1,R_2>0\mbox{, }P_1,P_2\in[0,2\pi)
  \mbox{, }F_1=F_2=0\pmod{S}\mbox{:} \\[2ex]
\discr_S(R_1\sin(P_1+2\pi F_1t))=\discr_S(R_2\sin(P_2+2\pi F_2t))
\mbox{ for all }t\in{\cal T}_S \\[2ex]
\mbox{ if and only if } \\[2ex]
R_1\sin(P_1)=R_2\sin(P_2)
\end{array}
\end{equation}
Finally, for completeness, notice that an odd multiple of half the
sampling rate aliases with a multiple of the sampling rate precisely
when the phases are both $0$, so that all samples have the value $0$.
\begin{equation}
\begin{array}{c}
\mbox{For }R_1,R_2>0\mbox{, }P_1,P_2\in[0,2\pi)
  \mbox{, }F_1=0\pmod{S}\mbox{, }F_2=S/2\pmod{S}\mbox{:} \\[2ex]
\discr_S(R_1\sin(P_1+2\pi F_1t))=\discr_S(R_2\sin(P_2+2\pi F_2t))
\mbox{ for all }t\in{\cal T}_S \\[2ex]
\mbox{ if and only if } \\[2ex]
P_1=P_2=0
\end{array}
\end{equation}

\paragraph{Rendering a discrete sound signal as continuous sound.}
Having computed a discrete sound signal $\sigma_d$, we need to render
it through a loudspeaker or similar controllable vibrating device in
order to hear the sound. At some point in the rendering process,
$\sigma_d$ is converted to a continuous signal $\sigma_c$. It is
natural to choose one of the infinitely many continous sound signals
that $\sigma_d$ might represent. In particular, it is natural to
create $\sigma_c$ by {\em interpolating\/} values between the ones
given by $\sigma_d$ in such a way as to make the resulting sound
signal as smooth as possible, according to some appropriate definition
of smoothness. The interpolating is normally done, not by a digital
computation, but by the analog machinery, usually electronic,
controlled by the computation. In Chapter~\ref{cha:frequency-spectrum}
we find that when the sampling rate is high enough, the precise nature
of the interpolation is relatively unimportant. But, the sorts of
analog devices commonly used for sound production typically
interpolate so that the final result is close to a sum of sinusoidal
signals of the lowest possible frequency. So, if $\sigma_d(t)=Re^{\si
2\pi Ft}$, is a discrete helical signal, it is normally rendered as
something very close to the continuous signal $\sigma_c(t)=Re^{\si
2\Pi(F\bmod S)t}$.

Not all aliases of a helical or sinusoidal signal are helical or
sinusoidal themselves. For real-valued signals the frequencies, the
frequencies near half the sampling rate ($S/2$) alias to signals that
are amplitude modulations of a carrier with frequency $S$ (see
Figure~???). In many cases, these amplitude-modulated signals
represent the way that a rendered signal is likely to be heard. For
complex-valued helical signals, the problem of non-helical aliases
seems to be less important, but I know very little about the rendering
of complex-valued signals. It is interesting to note that we seem to
need two real numbers per period to represent a given frequency,
whether those two reals are separate samples or whether they are
bundled into a single complex sample. But, we really need only the
sign of the imaginary component, along with the entire value of the
real component of a complex sample, to resolve the ambiguity between
helical frequencies $F_1=-F_2\pmod{S}$, although the full value of
both real and complex components is required to get full information
about amplitude and phase, and about multiple frequency components.

\paragraph{Causes of and cures for aliasing.}
Aliasing occurs any time a continuous signal is converted to a
discrete signal by sampling. It is natural to think of the case where
a physical continuous signal is read by an electronic sampler, but
this is not really the most important cause of aliasing. The engineers
who design and build samplers are pretty smart, and they have had
plenty of time to worry about aliasing and find ways to prevent its
harmful consequences. The most troublesome cases of aliasing arise
when a continuous mathematical model of a sound signal is converted to
a calculation of samples. The original continuous model may only exist
in the mind of a person who is designing sound---it need not be
present as a data structure in a computer, or in any other realization
in an artificial medium. Even when there is an explicit representation
of the continuous sound signal available as a data structure, the
problem of avoiding aliasing in software is far more complex, due to
the variety of conceptual sources for continuous signals. Flexible
sound-processing software has largely failed to prevent the
introduction of harmful aliasing in sampled signals.

The only cure for the harmful consequences of aliasing is
prevention. Once a continuous sound signal has been replaced by a
sampled discrete representation, and the continuous signal is no
longer available for inspection, there is no way to determine which of
the infinitely many possible continuous signals was truly intended. In
order to prevent one continuous sound signal $\sigma_1$, converted to
the discrete signal $\sigma_d={\cal D}_S(\sigma_1)$, from being
rendered continuously as some alias $\sigma_2$ that sounds quite
different, we must sample only signals that will be rendered
accurately. With the usual ``smooth'' rendering techniques, a sampled
complex-valued signal produces frequencies in the range $[0,S)$, and a
sampled real-valued signal produces frequencies in the range
$[0,S/2)$. To avoid harmful aliasing, all higher frequencies must be
filtered out from the continuous signal before sampling. For this
reason, sampling converters have analog filters that eliminate high
frequencies before sampling. Even though digital filters have many
advantages, they cannot be applied to the aliasing problem, because
they cannot distinguish frequencies that differ by multiples of the
sampling rate. To avoid the aliasing of frequencies near $S/2$ with an
amplitude-modulated signal (and presumably there are simlar problems
for complex-valued signals at frequencies near $S$), continuous
signals should in fact be filtered to an even smaller frequency
interval, but it is not clear precisely how much smaller it needs to
be.

\paragraph{Discrete signal values other than samples.}

\begin{itemize}
\item interval values
\item average values---equivalent to filtering
\end{itemize}

\section{Quantized Vibration State}

Even a finite time segment from a discrete sound signal is an infinite
object if the sample values are complex or real numbers. In order to
get a completely digital representation of a sound signal, we must
also approximate the continuous range of real or complex numbers by a
discrete subset. Since the consequences of this quantization of the
domain of values are largely independent of the consequences of
discretizing the time domain, we consider signals from the continuous
time domain to a discrete subset of the complex or real numbers.

\paragraph{Discrete sets of complex or real values.}
A subset ${\cal V}$ of the complex numbers is {\em discrete\/} if we
may draw a circle around each point in ${\cal V}$, so that each circle
contains only one point in ${\cal V}$. If ${\cal V}$ contains only
real numbers, then it is also a discrete subset of the reals. While
the discretization of time seems to make sense only with a constant
interval between points, there are a number of different popular ways
to quantize the real or complex values $\sigma(t)$. For the domain of
real numbers, the two basic ideas are {\em linear\/} and {\em
logarithmic\/} quantization.

Given a real number $Q>0$, ${\cal V}_Q$ represents the linear
quantization of the real domain with quantum interval $Q$.
\begin{eqnarray}
{\cal V}_Q & = & \{k/S\setsep k\in{\cal N}\}
\end{eqnarray}
Yes, this is mathematically the same thing as the discrete time domain
${\cal T}_Q$, but we think of it as having a different physical
dimension. Just as in the case of the discrete time domain, a digital
representation requires that we limit ourselves to a {\em finite\/}
interval within ${\cal V}_Q$, but it is the use of a discrete subset
of values, rather than the limitation to a finite interval, that has
the most interesting consequences for sound modelling.

Given two real numbers $B>1$ and $M>0$, ${\cal L}_{B,M}$ represents the
logarithmic quantization of the real domain with base $B$ and minimum
nonzero value $M$.
\begin{eqnarray}
{\cal L}_{B,M} & = &
  \{0\}\cup\{B^k,-B^k\setsep k\in{\cal N}\mbox{ and }B^k\geq M\}
\end{eqnarray}
Notice that the minimum value $M$ is required to make the domain
discrete, even if the $0$ value is omitted. ${\cal L}_{B,M}$ is an
idealized abstraction of several different essentially logarithmic
quantizations, such as ``mu-law'' encoding, but it does not represent
them precisely. The crucial quality of logarithmic domains is that the
interval between points goes up exponentially with the magnitude of
the points: the ${\cal L}_{B,M}$s are the mathematically simplest sort
of domains with that crucial quality. Floating-point domains are a
funny hybrid of linear and logarithmic: they consist of finite
segments of different linear domains pieced together so that the
progression over larger segments is essentially logarithmic.

The usual way to quantize the complex domain is to pick a quantization
of the real domain, and then apply it to the real and imaginary
components of complex numbers. So, we can define
\begin{eqnarray}
{\cal V}^2_Q & = & \{x+\ii y\setsep x,y\in{\cal V}_Q\} \\
{\cal L}^2_{B,M} & = & \{x+\ii y\setsep x,y\in{\cal L}_{B,M}\}
\end{eqnarray}
Polar versions, others.

\begin{itemize}
\item Quantization (roundoff) noise.
\item Nonlinear encodings.
\item Recovering complex data from real.
\item Why discretization and quantization are studied in such different ways.
\end{itemize}

\section{Other Ways to Digitize a Sound Signal}

\begin{itemize}
\item Delay conversion to discrete sampled representation as long as
possible (analogy to bitmapping in graphics).
\end{itemize}

\section{Direct Manipulation of Digital Sampled Sound}

\begin{itemize}
\item Time shifting.
\item Amplitude scaling.
\item Amplitude clipping.
\item Frequency/speed shifting.
\item Time stretching by repetition of a ``period''.
\item Changing sampling rate.
\item AM enveloping.
\item Adding sounds.
\item Nonlinear waveshaping.
\item Frequency modulation.
\end{itemize}

\chapter{The Frequency Spectrum}
\label{cha:frequency-spectrum}

In this chapter we investigate the analysis of a sound signal into its
components at different frequencies. For the moment, we are concerned
only with {\em steady-state\/} sound signals.  Imagine that you walk
into a room in which some sound is in the air.  You stay for some
length of time much longer than the period of any frequency that you
can hear, and then leave the room. Suppose that, while you are in the
room, the sound is essentially stable: you do not hear any change in
its quality. Roughly speaking, a steady-state sound signal is the
infinite extension of such a stable sound into the past and the
future---a sound that has always been and will always be qualitatively
the same. It is natural and sensible in a mathematical analysis of a
stable signal to ignore the fact that it has a beginning and an end.

\section{Pure Helical, Periodic, and Quasiperiodic Signals}

In order to analyze a signal into its components at different
frequencies, we need to know what each component is like. We choose
the standard helixes $Re^{\si(P+2\pi Ft)}$, characterized by amplitude
$R>0$, phase $P\in[0,2\pi)$, and frequency $F>0$, as the components
for analysis. In principle, there are infinitely many other choices
for the basic components: square waves, triangular waves, pulses, etc.
We choose the helixes because they are mathematically very simple and
suitable for analysis, and they agree quite well (but not perfectly)
with the physical qualities of vibrating sound producers, as well as
the sound detectors in the ear. In principle, the elliptical helixes
with aspect ratios proportional to frequency match our physical
analysis of a vibrating spring better than the circular helixes. But,
we choose the mathematical simplicity of the circular helixes, and
trust an intuitive hunch that they will be satisfactory for our
practical needs.

A sound signal $\sigma$ is {\em periodic\/} if there is a positive
real number $P>0$, called the {\em period\/} of $\sigma$, such that
for all times $t$, $\sigma(t+P)=\sigma(t)$ (actually, the {\em
smallest\/} such $P$ is the period of $\sigma$, and all multiples of
$P$ satisfy the same equation). The helical signal $Re^{\si(P+2\pi
Ft)}$ is periodic, with period $1/F$. If $\sigma_1,\sigma_2,\ldots$
are all periodic signals, and the ratios of their periods are rational
numbers, then $\sum_i\sigma_i$ is also periodic. If the ratios of
their periods are not all rational, then the sum is not periodic, but
it may still be analyzed into its components at different
frequencies---such signals are called {\em quasiperiodic.} Since
physical measurements can never distinguish absolutely between
rational and irrational values, it makes sense that we need to study
quasiperiodic signals in essentially the same way as the periodic
ones.

\section{The Ear as a Spectral Analyzer}

{\bf This section repeats Chapter~\ref{cha:perceptual-foundations}, and
should be merged in with that chapter.}

The part of the mammalian ear that detects sound is called the {\em
cochlea.} The cochlea is a tube (wound into a spiral, but that is not
particularly important to us) containing a long sequence of tiny
hairs. For our current purposes, each of those hairs is essentially a
vibrating spring, tuned to a different frequency. The number of hairs
is finite, but it is large enough that we will ignore the discreteness
of the set of hairs, and suppose that every possible value in the
continuous spectrum of real positive frequencies has a hair tuned to
it. In Chapter~\ref{cha:time-varying-spectrum} we find that there are
other limitations on the accuracy with which frequency is measured in
the ear, besides the finite number of detecting hairs. We also ignore
the physiological limits on the range of frequencies. The most natural
way to analyze signals is to find their components at all frequencies,
and accept the fact that some of those frequencies are indetectible by
a given ear.

Intuitively, the {\em spectrum\/} of a steady-state sound signal
$\sigma$ is the function $\psi$ mapping each frequency $f>0$ to a
complex number $\psi(f)$ such that $\abs{\psi(f)}$ is the amplitude of
the stimulation delivered by $\sigma$ to an idealized ear, and
$\arg(\psi(f))$ is the phase of the stimulation. The spectrum $\psi$
is often called a {\em signal in the frequency domain,} to complement
the description of $\sigma$ as a signal in the time domain. The
perception of steady-state sound signals is explained very well (but
not perfectly) in terms of the spectrum of the signal. In fact, most
of the perception of steady-state sound appears to depend only on the
magnitude $\abs{\psi}$, but in Chapter~\ref{cha:time-varying-spectrum}
we find that the perception of changes in the spectrum depends on the
relative phases of different components, so it is best to define the
spectrum to include phase information.

\subsection{Perceptual Parameters of Sound}

It is tempting to think of the perceptual qualities of a steady-state
sound signal that are derived from its spectrum as being analogous to
the {\em color\/} of an optical signal. Such an analogy probably leads
to more misunderstanding than useful insight. Human color perception
depends only on three dimensions of the frequency spectrum of light,
while the ear can distinguish at least hundreds of
frequencies. Considering the number of independent parameters
involved, a better analogy would relate each frequency component of a
sound signal to a single point in a visual scene. This analogy also
breaks down, because the ear relates individual frequency components
in ways that are fundamentally and structurally different from the way
the eye relates different points in a scene.

First, different frequency components of a steady-state sound signal
are often grouped together and perceived as a unit, which I call {\em
a sound.} A steady-state sound signal may contain many individual
sounds going on simultaneously. A particular frequency, typically the
lowest of those in a sound, may dominate the perceptual identification
of frequency in a sound: such a dominant frequency is called the {\em
fundamental\/} frequency of a sound; all of the component frequencies
are called {\em partials\/} of the sound. In particular, components
with frequencies that are very close to reasonably small integer
multiples of a single frequency $F$ ($1F,2F,3F,\ldots$), are often
heard as a single sound (a lot of information other than the spectrum
may affect this grouping, particularly stereo effects that seem to
locate components in space). In this case, the sound is a {\em
(nearly) harmonic\/} sound, $F$ is the fundamental, and the partial
$kF$ is called the {\em $k$th harmonic.} In many cases not all of the
harmonics are present (for example, the presence of only odd harmonics
is very common). Even the fundamental frequency $F$ may not be present
in a harmonic sound, but it still dominates perception of the
identifying frequency of the sound.

For a harmonic sound, and for many others as well, the ear perceives a
quality of highness or lowness called {\em pitch.} Pitch is
essentially determined by the fundamental frequency (even when that
frequency is not actually present), but the less harmonic a sound is
the more subtle is the determination of a fundamental
frequency. Doubling the fundamental frequency typically produces a
perception of an additive increment in the pitch---the pitch increment
associated with doubled fundamental frequency is called an {\em
octave\/} in conventional European music. So, the perceived pitch of a
sound is roughly the logarithm of the fundamental frequency. For
perfectly harmonic sounds this definition works very well; for nearly
harmonic sounds the perception of pitch intervals is affected by the
inaccuracies in the nearly integer ratios of partials to fundamental
frequencies. Notice that the essentially logarithmic relationship of
pitch to frequency means that when a sound signal $\sigma$ containing
several sounds is transformed by time-scaling to the signal $[
t]\sigma(St)$, the pitch intervals between the sounds stays the same.
The tendency of frequency components to cluster into harmonic sounds
is consistent with the logarithmic relationship of pitch to frequency,
since the partials of different harmonic sounds will interleave in the
same way as long as those sounds are separated by the same pitch
interval.

While the fundamental frequency of a steady-state sound typically
determines its pitch, the relative amplitudes of the frequency
components produce a perceived quality of sound that is called {\em
timbre.} Even with all the subtleties in determining perceived pitch,
the perception of timbre is orders of magnitude more subtle, and has
never been characterized with precision. Some acoustical scholars
believe that the word ``timbre'' is simply a convenient label for
those qualities of sound that we cannot describe or analyze
satisfactorily, much in the way that ``intelligence'' sometimes seems
to be used as a pleasant label for those aspects of human behavior
that we want to admire, but cannot explain. My hunch is that timbre is
susceptible to a much better analysis than has been achieved so far,
but not necessarily to a complete analysis. Timbre perception
certainly has a lot to do with the relative amplitudes of partials,
but is also affected crucially by the initiation of a sound, the
relation of amplitudes of partials to their absolute frequencies
(rather than just the ratios with fundamental frequencies), and
probably to a lot of other things that nobody has thought of yet.

The other perceptual quality of sound that has been analyzed fairly
well is {\em loudness.} While pitch is naturally associated with an
individual sound, loudness is perceived both for individual sounds and
for entire sound signals. Loudness is determined by the amplitude of
sound. Although the mathematically simplest measure of the amplitude
of a helical signal $Re^{\si2\pi Ft}$ is the multiplier $R$, perceived
loudness is better related to the {\em power\/} of the signal, which
is proportional to $R^2F$. In a system based on elliptical helixes, if
the aspect ratio of the ellipses grows proportionally to frequency,
then product of the lengths of the axes is proportional to power, but
this advantage of elliptical helixes does not seem to justify their
extra complexity. Just as perceived pitch is logarithmically related
to frequency, perceived loudness is logarithmically related to power,
so the {\em decibel\/} system for measuring loudness associates an
additive increase of $6$ decibels with a doubling of power. Perceived
loudness is also affected by frequency in at least two ways. First,
loudness naturally tends to drop off as the frequency of a signal
approaches the limits of frequency perception in the ear. Second, when
several frequency components combine, the percieved loudness of the
combination is different depending on whether the frequencies are
close together (lying in what are called {\em critical bands\/}) or
farther apart. My hunch is that the dependence of perceived loudness
on frequencies should not affect the structure of a basic,
general-purpose system for sound modelling, but it certainly will
become very important to a sound designer for the finer polishing of a
sound signal.

\section{Mathematical Spectral Analysis with the Fourier Transform}

The {\em frequency spectrum\/} of a sound signal $\sigma$ is another
complex-valued function $\psi$ of one real parameter, but in this case
the parameter represents {\em frequency\/}, rather than time, and the
value of $\psi$ at a frequency $f$ describes the component of $\sigma$
at frequency $f$. In particular, $\abs{\psi(f)}$ is the magnitude of
the component at frequency $f$, and $\arg(\psi(f))$ is the phase of
that component. If $\sigma$ is described by a formula in the form
$$\sigma(t)=R_1e^{\si(P_1+2\pi F_1t)}+R_2e^{\si(P_2+2\pi
F_2t)}+\cdots$$ then it is easy to see that its spectrum $\psi$ should
have value $\psi(f)=0$ for all frequencies $f$ not in the list
$F_1,F_2,\ldots$, and it appears that its value at $F_i$ should be
$\psi(F_i)=R_ie^{\si P_i}$. This is the right basic idea, but for
mathematical consistency with more complex spectra, the value
$\psi(F_i)$ is interpreted in a slightly more peculiar way, described
in Section~\ref{sec:generalized-functions}. But, what if we are given
a completely unknown signal $\sigma$, and need to characterize its
frequency spectrum? Just as, in the physical world, a {\em prism\/} is
used to analyze a light signal into its different frequency
components, in the mathematical world the {\em Fourier transform\/}
analyzes a mathematically given signal in a similar way.

The essential idea behind the Fourier transform is to perform a kind
of {\em pattern matching\/} between the given signal $\sigma$ and each
of the standard helical components $e^{\si2\pi Ft}$. The strength of
the match will determine the value $\psi(F)$ of the spectrum at
frequency $F$. Through a fortunate stroke of mathematics, we need not
test all the different amplitudes $R$ and phases $P$ in the form
$Re^{\si(P+2\pi Ft)}$, because our definition of pattern-matching
produces information about all amplitudes and phases as a result of
matching against any helix of the right frequency. To understand how
the peculiar integral formula that we will introduce as the definition
of the Fourier transform represents a kind of pattern matching, we
take a detour through simpler sorts of signals.

Suppose that we are given two signals $\sigma_1$ and $\sigma_2$ over a
{\em discrete and finite\/} time domain $\{1,2,\ldots,n\}$, and
suppose in addition that both signals take only the values $1$ and
$-1$. At any time $t$, the product $m=\sigma_1(t)\sigma_2(t)$ is $m=1$
if the signals agree, and $m=-1$ if the signals disagree. So, the sum
$$M(\sigma_1,\sigma_2)=\sum_{t=1}^n\sigma_1(t)\sigma_2(t)/n$$
represents the accuracy with which $\sigma_1$ matches $\sigma_2$---a
perfect match gives the value $M{\sigma_1,\sigma_2}=1$; a complete
failure to match gives the value $M(\sigma_1,\sigma_2)=-1$, and
partial matches give values between $-1$ and $1$. From another point
of view, $(M(\sigma_1,\sigma_2)+1)/2$ is the probability that
$\sigma_1$ and $\sigma_2$ agree at a randomly chosen time
$t\in\{1,2,\ldots,n\}$.

Now, generalize $\sigma_1$ and $\sigma_2$ to real-valued functions on
the same discrete and finite time domain. The product
$\sigma_1(t)\sigma_2(t)$ is still a very reasonable measure of the
extent of agreement between $\sigma_1$ and $\sigma_2$ at time $t$.
$m>0$ when $\sigma_1$ and $\sigma_2$ have the same sign, $m<0$ when
they have opposite sign, and the magnitude $\abs{m}$ indicates the
strength of their agreement or opposition in a way that credits
agreement with large numbers more than agreement with small numbers.
Again, the extent of agreement or disagreement between the entire
signals $\sigma_1$ and $\sigma_2$ may be taken as the average
$M(\sigma_1,\sigma_2)$ of the products. Now, however, the average is
no longer restricted to the interval $[-1,1]$, but might be any real
number. If $\sigma_1$ is a function that we are analyzing, and
$\sigma_2$ is a pattern that we are comparing it to, then
$M(\sigma_1,\sigma_2)$ gives information about the matching of $M$
against all multiples of $\sigma_2$ as well, since
$M(\sigma_1,S\sigma_2)=SM(\sigma_1,\sigma_2)$.

What if $\sigma_1$ and $\sigma_2$ are complex-valued functions on the
discrete and finite time domain? The product $\sigma_1(t)\sigma_2(t)$
is no longer a good measure of agreement at time $t$, because of the
way that complex multiplication adds the angles of multiplicands. For
example, if $\arg(\sigma_1(t))=\arg(\sigma_2(t))=\pi/2$, then
$\arg(\sigma_1(t)\sigma_2(t))=\pi$, and $\pi$ is the angle of the
negative real numbers. Similarly, if $\arg(\sigma_1(t))=\pi/2$ and
$\arg(\sigma_2(t))=3\pi/2$, then $\arg(\sigma_1(t)\sigma_2(t)=0$, so
the product is a positive real number. These results are the opposite
of what we want, since in the first case the signals agree in
direction, but the product is negative, and in the second case the
signals are opposite in direction but the product is positive. Instead
of the product $\sigma_1(t)\sigma_2(t)$, we need the product of one
signal with the {\em conjugate\/} of the other:
$m=\sigma_1(t)\conj{\sigma_2(t)}$. Notice that
$$\arg(\sigma_1(t)\conj{\sigma_2(t)})=
\arg(\sigma_1(t))+\arg(\conj{\sigma_2(t)})=
\arg(\sigma_1(t))-\arg(\sigma_2(t))\pmod{2\pi}$$
So, when $\sigma_1(t)$ and $\sigma_2(t)$ are in the same direction,
$m$ is a positive real number (angle $0$), when they are in opposite
directions $m$ is a negative real number (angle $\pi$), and in other
cases the angle of $m$ gives the amount of disagreement between the
angles of $\sigma_1(t)$ and $\sigma_2(t)$. In the special case where
$\sigma_1$ and $\sigma_2$ are real-valued, $m=\sigma_1(t)\sigma_2(t)$,
since each real number is its own conjugate. But, the mathematical
symmetry between $\sigma_1$ and $\sigma_2$ in the real-valued case is
lost in the complex-valued case: pattern matching the same two signals
in opposite order yields results that are conjugates of one another.
Let
$$M(\sigma_1,\sigma_2)=\sum_{t=1}^n\sigma_1(t)\conj{\sigma_2(t)}/n$$
Because
$M(\sigma_1,\cconst{\alpha}\sigma_2)=\conj{\cconst{\alpha}}M(\sigma_1,\sigma_2)$
for all complex constants $\cconst{\alpha}$, the result of pattern
matching $\sigma_1$ against a single pattern $\sigma_2$ gives
information about the matching of $\sigma_1$ against all signals that
are the same as $\sigma_2$ except for amplitude and phase. In
particular, matching against $e^{\si 2\pi Ft}$ gives information about
the match with all helixes $Re^{\si(P+2\pi FT)}$ of the same
frequency.

Now, to define the Fourier transform $\transf{F}(\sigma)$ of a signal
$\sigma$, we need only generalize the ideas above to the infinite
continuous time domain, using the integral as the natural continuous
analog of the sum. $\transf{F}(\sigma)$ is itself a function of a real
variable $f$ representing frequency, and $\transf{F}(\sigma)(f)$ is
the result of pattern matching $\sigma$ against a standard helix at
frequency $f$ ($e^{\si 2\pi ft}$). Notice that $\conj{e^{\si 2\pi
ft}}=e^{-\si 2\pi ft}$, and it is conventional to use the $-\ii 2\pi$
form in the formula for the transform.
%
\begin{eqnarray}
\label{equ:fourier-transform}
\transf{F}(\sigma)(f) & = &
  \int_{-\infty}^{\infty}\sigma(t)\conj{e^{\si 2\pi ft}}dt =
  \int_{-\infty}^{\infty}\sigma(t)e^{-\si 2\pi ft}dt
\end{eqnarray}
%
Although the first integral formula above displays the character of
the Fourier transform as the results of pattern-matching a signal
against the standard helixes, the second form is the one commonly seen
in books and papers about the transform. There are a number of
variations in the definition of the Fourier transform---some authors
give it as $\int_{-\infty}^{\infty}\sigma(t)e^{-\si ft}dt$, others as
$(2\pi)^{-1/2}\int_{-\infty}^{\infty}\sigma(t)e^{-\si ft}dt$---but
these variations are merely the result of scaling to different units
of measurement. The form chosen in
Equation~\ref{equ:fourier-transform}, which measures frequency in
cycles per unit time, is the mathematically most convenient one for
our purposes.

Other variations on the Fourier transform normalize the magnitude of
points in the spectrum in sensible ways. Since the power in the
helical signal $Re^{\si 2\pi ft}$ is proportional to $R^2f$, rather
than to $R$, we might argue for one of the following forms
\begin{eqnarray}
\label{equ:fn-fourier-transform}
\transf{F}_{\mbox{\scriptsize\it freq}}(\sigma)(f) & = &
  f^{-1}\transf{F}(f) \\
\label{equ:pn-fourier-transform}
\transf{F}_{\mbox{\scriptsize\it power}}(\sigma)(f) & = &
  f^{-1}\abs{\transf{F}(f)}^{(1/2)}e^{\si\arg(\transf{F}(f))}
\end{eqnarray}
Or, if we interpret the real and imaginary components of the signal in
such a way that the real component is the derivative of the imaginary
component, then it could make sense to match the signal against
elliptical helixes:
\begin{eqnarray}
\label{equ:ell-fourier-transform}
\transf{F}_{\mbox{\scriptsize\it ellipse}}(\sigma)(f) & = &
  \int_{-\infty}^{\infty}\sigma(t)
    ((f+1)e^{-\si 2\pi ft}+(f-1)e^{\si 2\pi ft})/2dt \nonumber \\*
  & = & ((f+1)\transf{F}(f)+(f-1)\transf{F}(-f))/2
\end{eqnarray}
Finally, since perceived pitch is roughly the logarithm of frequency,
a sensible variation is
\begin{eqnarray}
\label{equ:logfreq-fourier-transform}
\transf{F}_{\mbox{\scriptsize\it logfreq}}(\sigma)(p) & = &
  \int_{-\infty}^{\infty}\sigma(t)e^{-\si 2\pi 2^pt}dt \\
  & = & \transf{F}(\sigma)(2^p)
\end{eqnarray}
The logarithmic frequency scale rules out negative frequencies, which
may be an advantage or a disadvantage in different contexts. Since
each of these variations is easy to calculate from the conventional
transform in Equation~\ref{equ:fourier-transform}, we stick with that
simpler formula.

When using complex-valued signals in the time domain, it seems most
sensible to use only positive frequencies---that is, signals $\sigma$
whose spectra $\psi$ have the property that $\psi(f)=0$ for $f<0$.
But, the mathematics of the Fourier transform allows
negative-frequency components, and it is best to understand the
mathematics in its full generality, and then make whatever
restrictions seem appropriate in a given application. Real-valued
signals always have negative-frequency components. Also, elliptical
helixes have negative-frequency components when the spectrum is
defined in terms of circular helixes; similarly, circular helixes have
negative-frequency components in a spectrum defined in terms of
elliptical helixes.

We see in the remainder of this chapter how the Fourier transform
$\transf{F}(\sigma)$ gives a sensible representation of the spectrum
of the steady-state signal $\sigma$. It is also important to calculate
a sound signal in the time domain from a given frequency spectrum. For
this purpose, we need the {\em inverse Fourier transform.} The Fourier
transform is almost self-inverting.
\begin{eqnarray}
\label{equ:repeat-fourier-transform}
\transf{F}(\transf{F}(\sigma))(t) & = & \sigma(-t)
\end{eqnarray}
The Fourier transform of the Fourier transform of a signal $\sigma$ is
the same signal played backwards in time. The inverse of the Fourier
transform is defined just like the forward transform, except running
the helical patterns backwards in time.
\begin{eqnarray}
\label{equ:inverse-fourier-transform}
\transf{F}^{-1}(\psi)(t) & = &
  \int_{-\infty}^{\infty}\psi(f)\conj{e^{-\si 2\pi ft}}df =
  \int_{-\infty}^{\infty}\psi(f)e^{\si 2\pi ft}df \\
\label{equ:inverse-transform-property}
\transf{F}^{-1}(\transf{F}(\sigma)) & = & \sigma\mbox{ for
well-behaved }\sigma
\end{eqnarray}
Equation~\ref{equ:inverse-transform-property} fails for certain weird
functions, but all of the sound signals that interest us are well
behaved (see \cite{fourier-analysis} for a characterization of the
well-behaved functions).

\subsection{Discrete Spectra}
\label{sec:discrete-spectra}

Ironically, for precisely the simple case of a signal that is the sum
of helixes at discrete frequencies, the Fourier transform is
ill-defined over conventional functions from reals to complex values.
Consider the simplest case of a pure helical signal:
$$\transf{F}([t]e^{-\si 2\pi Ft})(f)=
\int_{-\infty}^{\infty}e^{\si 2\pi Ft}e^{-\si 2\pi ft}dt=
\int_{\infty}^{\infty}e^{\si 2\pi(F-f)t}$$
For $f\neq F$, the product
$e^{\si 2\pi Ft}e^{-\si 2\pi ft}=e^{\si 2\pi(F-f)t}$ oscillates, and
its integral is $0$. But, for $f=F$, we integrate
$e^{\si 2\pi Ft}e^{-\si 2\pi Ft}=e^{\si 2\pi(F-F)t}=e^0=1$,
so the integral is infinite. Similarly, the Fourier transform of a sum
of helixes is $0$ except at the frequencies of the helixes, where it
is infinite. The fact that the integral is infinite tells us that
there exists a component at the given frequency, but we lose all
information about the magnitude of the component.

\subsubsection{Generalized Functions}
\label{sec:generalized-functions}

In order to give more informative values for the Fourier transform of
a signal with components at discrete frequencies, we leave the
defining formula of Equation~\ref{equ:fourier-transform} alone, but we
reinterpret the nature of the functions that it may evaluate to, and
the rules of calculus for evaluating it. These changes do not affect
the well-defined values given by the usual integral calculus, but they
provide specific values in some cases where the usual integral
calculus is ill defined (consider the analogy to complex arithmetic,
which extends real arithmetic to provide a value for $\sqrt{-1}$,
which is undefined in the reals). The basic idea is that we want to
let $\transf{F}([t]e^{\si 2\pi Ft})$ represent a function
whose integral is $1$, but which has the value $0$ everywhere except
at $F$. There is no such function in the conventional calculus, but
just as the real number system may be extended with the new value
$\ii$ with the property $\ii^2=-1$, the system of functions from a
real number to a complex value may be extended with a new function
$\delta$, called the {\em Dirac\/} function, or the {\em impulse\/}
function, with the properties
\begin{eqnarray}
\delta(f) & = & 0\mbox{ for }f\neq 0 \\
\label{equ:impulse-integral}
\int_{-\infty}^{\infty}\delta(f)df & = & 1
\end{eqnarray}
Intuitively, $\delta$ is a function that has value $0$ except for a
spike at input $0$. The spike is infinitesimally narrow, and
infinitely high, so that the area inside it is $1$. A more careful
development defines {\em generalized functions\/} to be certain
infinite sequences of conventional functions, and considers only the
properties of generalized functions in the limit. $\delta$ is formally
an infinite sequence of narrower and higher spikes, all with area $1$.
Notice that to place the spike at $F$ instead of $0$ we merely shift
the input to $\delta$, in the generalized function $[
f]\delta(f-F)$. To increase the area under the spike to $R$, we
multiply $R\delta$.

Now, we have a sensible value for the Fourier transform of a helix
\begin{eqnarray}
\transf{F}([t]e^{\si 2\pi Ft})(f) & = & \delta(f-F)
\end{eqnarray}
For a helix with arbitrary amplitude and phase, the amplitude and
phase pass through to the Fourier transform at the helix frequency:
\begin{eqnarray}
\transf{F}([t]Re^{\si(P+2\pi Ft)})(f) & = & R\delta(f-F)e^{\si P}
\end{eqnarray}
It is often tempting to think that $\delta(0)=1$, but this is not quite
right. In order to yield the integral value $1$ in
Equation~\ref{equ:impulse-integral}, $\delta(0)$ must be bigger than
every conventional real number. But, it is not merely $\infty$, since it
is twice as big as $\delta(0)/2$, half as big as $2\delta(0)$, etc., and
it has a well defined angle as a complex number. Just as the new number
$\ii$ was introduced to denote $\sqrt{-1}$, we may use
$\Delta=\delta(0)$ when it is convenient to denote the value of the
impulse function at its spike.

Also, unlike the discontinuous real-valued function that has value $1$
at $0$, and value $0$ everywhere els, $\delta$ should be understood as
continuous, and continuously differentiable, even at $0$. Intuitively,
there is a continuous connection from the $0$ values to the infinite
value, infinitesimally close to the $0$ input. The derivative of
$\delta$ is an even more peculiar function that has the value $0$ at
every real input, but goes infinite infinitesimally before input $0$,
and negatively infinite infinitesimally after input $0$.

Now, consider the sum of helixes from the beginning of this section:
$$\sigma(t)=R_1e^{\si(P_1+2\pi F_1t)}+R_2e^{\si(P_2+2\pi F_2t)}+\cdots$$
The Fourier transform of $\sigma$ is just a sum of shifted impulse
functions:
$$\transf{F}(\sigma)(f)=R_1\delta(f-F_1)e^{\si P_1}+
  R_2\delta(f-F_2)e^{\si P_2}+\cdots$$
That is, $\transf{F}(\sigma)(f)=0$ for $f\neq F_1,F_2,\ldots$, and
$\transf{F}(\sigma)(F_i)=\Delta R_ie^{\si P_i}$. The location of the
nonzero values in $\transf{F}(\sigma)$ shows the frequencies in the
spectrum of $\sigma$, the magnitude
$\abs{\transf{F}(\sigma)(f)/\Delta}$ gives the amplitude of the
component at frequency $f$, and the angle
$\arg(\transf{F}(\sigma)(f)/\Delta)$ gives its phase.

There is a \emph{lot} more to the theory of generalized functions
than I have even hinted here. There are generalized functions that
cannot be defined from conventional functions plus $\delta$---their
values may not therefore be expressed in terms of $\Delta$. In
addition to introducing a particular sort of infinite values,
generalized functions allow much of the useful qualities of continuous
functions to be associated with mappings to values that are
discontinuous in the conventional sense. That is, generalized
functions may connect distant values in an infinitesimal range of
inputs. So, the {\em unit step,} or {\em Heaviside\/} function $H$
that has value $0$ on negative inputs, value $\frac{1}{2}$ at $0$, and
value $1$ on positive inputs, may be regarded as connecting from $0$
to $1$ in the infinitesimal region around $0$, as the intuitive
picture of $H$ shows in Figure~\ref{fig:heaviside}. As long as the
infinitesimally narrow connections work in the most obvious way, I
omit them from the definition of a function.
%
\begin{eqnarray*}
H(x) & = &
  {\renewcommand{\arraystretch}{1.25}
  \left\{\begin{array}{ll}
    0 & \mbox{if }x<0 \\
    \frac{1}{2} & \mbox{if }x=0 \\
    1 & \mbox{if }x>0
    \end{array}
  \right.}
\end{eqnarray*}
%
The impulse function $\delta$ also contains an instantaneous
connection from $0$ to $\Delta$ and back again to $0$ in the
infinitesimal region around $0$.
%
\begin{eqnarray*}
\delta(x) & = & 
  \left\{\begin{array}{ll}
    0 & \mbox{if }x<0 \\
    \Delta & \mbox{if }x=0 \\
    0 & \mbox{if }x>0
  \end{array}
  \right.
\end{eqnarray*}

\subsection{Continuous Spectra and Noise}

The putative advantage of the Fourier transform is to connect
arbitrary sound signals to their spectra. From
Section~\ref{sec:discrete-spectra}, we see that the Fourier transform
discovers the spectrum of a sum of helixes at discrete frequencies,
but we should also be able to use the Fourier transform to analyze
signals with components spread continuously across some range of
frequencies. Since we do not know in advance what such signals are
like in the time domain, it makes sense to define a continuous
spectrum, and then apply the inverse Fourier transform to get a sound
signal. For example, the spectrum given by the Heaviside impulse
function $H$ seems like a sensible representation for the spectrum of
a signal with components at all frequencies, all with equal amplitude
and phase $0$. But, the inverse Fourier transform yields
%
\begin{eqnarray*}
\transf{F}^{-1}(H)(t)
  & = & \int_{-\infty}^{\infty}H(f)
          e^{\si 2\pi ft}df \\
  & = & \int_{0}^{\infty}e^{\si 2\pi ft}df \\
  & = & \mbox{missing steps} \\
  & = & \delta(t)/2+\ii/(2\pi t)
\end{eqnarray*}
%
This signal is not perceived as a steady-state sound: rather it is an
infinitely sharp sort of click. The problem is that the different
frequency components cancel out in a systematic way, instead of
presenting a steady sound.

One might object to the spectrum $H$ because it has equal sized
components spread over an infinite range of frequencies, and therefore
appears to represent an infinite total amount of sound. But, the
finite continuous spectrum with frequencies over the range $[1,2]$ has
similar problems.
%
\begin{eqnarray*}
\Pi_{1,2}(f) & = &
  {\renewcommand{\arraystretch}{1.25}
  \left\{\begin{array}{ll}
    0 & \mbox{if }f<1 \\
    \frac{1}{2} & \mbox{if }f=1 \\
    1 & \mbox{if }1<f<2 \\
    \frac{1}{2} & \mbox{if }f=2 \\
    0 & \mbox{if }f>2
  \end{array}
  \right.}
\end{eqnarray*}
%
Intuitively, a finite amount of sound at a discrete frequency $F$ is
represented by an infinite value $\Delta x$ in the spectrum at $F$.
So, $\Pi_{1,2}$ allots only an infinitesimal amount of sound at each
frequency in the range $[1,2]$, and the total amount of sound is
finite. But, the inverse Fourier transform yields:
%
\begin{eqnarray*}
\transf{F}^{-1}(\Pi_{1,2})(t)
  & = & \int_{-\infty}^{\infty}\Pi_{1,2}(f)
          e^{\si 2\pi ft}df \\
  & = & \int_{1}^{2}e^{\si 2\pi ft}df \\
  & = & (e^{\si 2\pi 2t}-e^{\si 2\pi t})/(\ii 2\pi t) \\
  & = & (e^{\si(-\pi/2+2\pi 2t)}-e^{\si(-\pi/2+2\pi t)})/(2\pi t) \\
  & = & \mbox{missing steps} \\
  & = & \sin(\pi t)e^{-\si 2\pi(3/2)t}/(\pi t)
\end{eqnarray*}
%
That is, $\transf{F}^{-1}(\Pi_{1,2})$ looks like a helix at frequency
$3/2$, whose amplitude oscillates and dies out by multiplication with
$\sin(\pi t)/(\pi t)$. Such a signal is not perceived as a
steady-state sound, but rather as a sound that rises from silence to a
maximum loudness at $0$ and then dies out again.  Again, the
continuous spread of frequency components with synchronized phases
produces a strange sort of systematic cancellation.

\subsubsection{Random Functions}

In order to give the perception of steady-state sound with components
spread over a continuous range of frequencies, we need to use
sound signals whose values vary randomly at each point in the time
domain. Let
\begin{eqnarray}
Z(x) & = & e^{-x^2/2}/\sqrt{2\pi}
\end{eqnarray}
and notice that
\begin{eqnarray}
\int_{-\infty}^{\infty}Z(x)dx & = & 1
\end{eqnarray}
$Z$ is called the {\em Gaussian function,} or the {\em normal
probability density function,} and it has the shape of a bell (see
Figure~\ref{fig:gaussian}). $Z$ is particularly convenient
mathematically because many probability calculations starting with
Gaussian functions produce scaled versions of Gaussian functions at
the end. In particular, the sum of an infinite sequence of independent
random variables with a Gaussian density is itself a random variable
with a Gaussian density.

Let $r$ be a real-valued function, such that for each input $t$,
$r(t)$ is a random value with Gaussian density, independent of all
other values of $r$. And, let $p$ be another real-valued function such
that for each $t$, $p(t)$ is a uniformly-distributed value in the
range $[0,2\pi)$, independent of all other values of $p$ and all
values of $r$.  Now, define the complex-valued sound signal in the
time domain $\rho$ by
\begin{eqnarray}
\rho & = & re^{\si p}
\end{eqnarray}
Each angle $\arg(\rho(t))$ is a random value uniformly distributed
over the range $[0,2\pi)$, each $\abs{\rho(t)}$ is a random positive
real value distributed with density $[x]2Z(x)$, and all of
these random values are independent of one another. Unlike the
generalized functions, random functions such as $\rho$ are legitimate
functions according to the conventional mathematical definition, but
they are unusual in that they are discontinuous everywhere. Imagine
the graph of $\rho$ as an infinite bristle brush, of the sort where
bristles stick out in a full circle about the handle, which is the $t$
axis of the graph. {\bf (As generalized functions, do the $\rho$s
connect each value back to $0$, or to ``adjacent'' values? I'm not
sure at the moment. Later: certainly not to $0$, but the exact sense
in which adjacent values connect is tricky. The key is to get a
sensible derivative.)} Although random functions are mathematical
functions in the conventional sense, our definitions of random
functions, and our notation for them, are somewhat odd. When we write
$\rho$, we are referring to any one of the infinitely many functions
with the random properties described above, but within one discussion,
all instances of the symbol $\rho$ refer to the same random function.
If we attach different subscripts to $\rho$, such as
$\rho_1,\rho_2,\ldots$, then we are referring to different functions
with the same random properties (independent?).

The marvellous thing about the signal $\rho$ is that its Fourier
transform is another function with the same random properties:
\begin{eqnarray}
\transf{F}(\rho_1)(f) & = &
  \int_{-\infty}^{\infty}\rho_1(t)e^{-\si 2\pi ft}dt =
  \rho_2(f)\mbox{ with probability }1
\end{eqnarray}
The reason for this similarity is that, for each $f$, $[
t]\rho_1(t)e^{-\si 2\pi ft}=\rho_{1,f}(t)$ is another Gaussian-random
function, and then the integral gives yet another random function with
Gaussian density. The independence of each pair of values
$\transf{F}(\rho_1)(F_1)$, $\transf{F}(\rho_1)(F_2)$ for $F_1\neq F_2$
holds because the functions $e^{-\si 2\pi F_1t}$ and $e^{-\si 2\pi
F_2t}$ are linearly independent. So, the spectrum of a Gaussian-random
sound signal $\rho_1$ is given by another Gaussian-random function
$\rho_2$. $\rho_1$ is perceived as a steady-state sound, and $\rho_2$
represents essentially a uniform spread of sound over all
frequencies. The randomization of the precise values of the spectrum
avoids the systematic cancellation in the signals with spectra $H$ and
$\Pi_{1,2}$.

The spectrum $\rho_2$ above has the same qualitative behavior over
negative frequencies as over positive frequencies. In order to have a
steady-state sound signal with only positive-frequency components,
define the spectrum by
\begin{eqnarray}
\rho^{+}(f) & = & \left\{\begin{array}{ll}
                      0 & \mbox{if }f<0 \\
                      \rho(f) & \mbox{if }f\geq 0
                    \end{array}
                  \right.
\end{eqnarray}
The inverse Fourier transform of $\rho^{+}$ is a random function with
the same density as $\rho$.
\begin{eqnarray}
\transf{F}^{-1}(\rho^{+}_{1})(t) & = &
  \int_{-\infty}^{\infty}\rho^{+}_{1}(t)e^{-\si 2\pi ft}dt =
  \rho^{\oplus}(t) = \rho_2(t)\mbox{ }\nonumber\\*
 & & \mbox{with probability }1
\end{eqnarray}
It is strange that $\rho$ and $\rho^{+}$ seem to have the same inverse
Fourier transform. In fact, they do not. They both have inverse
transforms that are random functions of the same density, but
$\transf{F}^{-1}(\rho^{+})$ inhabits a small, probability-$0$,
subspace of the space of all such functions. $\rho^{\oplus}$ denotes a
function in this subspace.
\begin{eqnarray}
\rho^{\oplus} & = & \transf{F}^{-1}(\rho^{+})
\end{eqnarray}
Every $\rho^{\oplus}$ is also a $\rho$, but a randomly chosen $\rho$
has probability $0$ of being a $\rho^{\oplus}$.

{\tiny\bf Classical probability theory based on measure is not really
the right foundation for this discussion. Each particular signal
either does or does not sound like white noise. What determines the
sound is not the process that produced the signal, but specific
statistics that must agree with the statistics of a random process.}

Steady-state signals with spectra, such as $\rho$ and $\rho^{+}$, that
distribute sound evenly across all frequencies or all positive
frequencies are called {\em white noise,} by analogy to the perception
of light with an even mixture of all fequencies as the color white.
Lots of other randomized functions behave as white noise. In
particular, either the phase or the magnitude of the signal may be
constant or deterministic as long as the other one is appropriately
randomized. For example, real-valued functions with random values in
uniform or Gaussian density, and complex-valued functions with
constand magnitude and random angle, both generate forms of white
noise.

\subsection{A Calculus of Fourier Transforms}

In order to apply the Fourier transform to generate useful insight, we
need some rules for deriving Fourier transforms of interesting
functions. In principle, we may use the definition of Fourier
transform in Equation~\ref{equ:fourier-transform}, and apply the rules
of the integral calculus. In fact, that is far too cumbersome, and we
need to manipulate functions and their transforms at a much higher
level. The best thing is to forget the meaningful application of the
transform for a while, and just systematically absorb some useful
functions and operators on functions and the rules for their
interactions with the transform.

\subsubsection{Basic Functions and Their Transforms}

Many of the basic functions turn out to be useful in both the time and
frequency domains, so I define them in terms of a generic variable
$x$. For a systematic presentation, I repeat some functions that we
are already familiar with.
\begin{eqnarray}
\constfun{1}(x) & = & 1 \\
\helix(x) & = & e^{\si2\pi x} \\
\delta(x) & = &
  \left\{\begin{array}{ll}
    0 & \mbox{if }x<0 \\
    \Delta & \mbox{if }x=0 \\
    0 & \mbox{if }x>0
    \end{array}
  \right. \\
\shah(x) & = & \sum_{i=-\infty}^{\infty}\delta(x-i) \\
\click(x) & = & \delta(x)/2+\ii/(2\pi x) \\
H(x) & = &
  {\renewcommand{\arraystretch}{1.25}
  \left\{\begin{array}{ll}
    0 & \mbox{if }x<0 \\
    \frac{1}{2} & \mbox{if }x=0 \\
    1 & \mbox{if }x>0
    \end{array}
  \right.} \\
Z(x) & = & e^{-x^2/2}/\sqrt{2\pi} \\
\Pi(x) & = &
  {\renewcommand{\arraystretch}{1.25}
  \left\{\begin{array}{ll}
    0 & \mbox{if }x<-1/2 \\
    \frac{1}{2} & \mbox{if }x=-\frac{1}{2} \\
    1 & \mbox{if }-1/2<x<1/2 \\
    \frac{1}{2} & \mbox{if }x=-\frac{1}{2} \\
    0 & \mbox{if }x>1/2
    \end{array}
  \right.} \\
\sinc(x) & = & \sin(\pi x)/(\pi x) \\
\rho & = & re^{\si p}
\mbox{ for Gaussian random }r\mbox{, uniform random }p \\
\rho^{+} & = & H\rho \\
\rho^{\oplus} & = & \transf{F}^{-1}(\rho^{+})
\end{eqnarray}
Now, the transforms of the basic functions:
%
\begin{eqnarray}
\transf{F}(\constfun{1}) & = & \delta \\
\transf{F}(\helix) & = & [s]\delta(s-1) \\
\transf{F}(\delta) & = & \constfun{1} \\
\transf{F}(\shah) & = & \shah \\
\transf{F}(\click) & = & H \\
\transf{F}(H) & = & \conj{\click} \\
\transf{F}(Z) & = & Z \\
\transf{F}(\Pi) & = & \sinc \\
\transf{F}(\sinc) & = & \Pi \\
\transf{F}(\rho_1) & = & \rho_2 \\
\transf{F}(\rho^{+}) & = & \rho^{\oplus} \\
\transf{F}(\rho^{\oplus}) & = & \rho^{+}
\end{eqnarray}

\subsubsection{Functional Operators}

Functions may be combined by performing a given arithmetic operation
on their corresponding values. Cross correllation ($\star$) and
convolution ($\ast$) operate on the entirety of two functions with an
integral formula to produce a new function.
\begin{eqnarray}
(\alpha+\beta)(x) & = & \alpha(x)+\beta(x) \\
(\alpha-\beta)(x) & = & \alpha(x)-\beta(x) \\
(\alpha\beta)(x) & = & \alpha(x)\beta(x) \\
(\alpha/\beta)(x) & = & \alpha(x)/\beta(x) \\
(\beta^\alpha)(x) & = & \beta(x)^{\alpha(x)} \\
\conj{\alpha}(x) & = & \conj{\alpha(x)} \\
(\alpha\star\beta)(x) & = &
  \int_{-\infty}^{\infty}\conj{\alpha(y-x)}\beta(y)dy =
  \int_{-\infty}^{\infty}\conj{\alpha(y)}\beta(x+y)dy \\
(\alpha\ast\beta)(x) & = &
  \int_{-\infty}^{\infty}\alpha(y)\beta(x-y)dy =
  \int_{-\infty}^{\infty}\alpha(y+x)\beta(-y)dy
\end{eqnarray}
Now, consider some important properties of, and relations between, these
functional operators.
\begin{eqnarray}
\delta\ast\alpha & = & \alpha \\
\alpha\ast\beta & = & \beta\ast\alpha \\
(\alpha\ast\beta)\ast\gamma & = & \alpha\ast(\beta\ast\gamma) \\
\alpha\ast(\beta+\gamma) & = & (\alpha\ast\beta)+(\alpha\ast\gamma) \\
\delta\star\alpha & = & \alpha \\
\alpha\star\beta & = & \conj{\beta}\star([x]\alpha(-x)) \\
(\alpha\star\beta)\star\gamma & = & \alpha\star(\beta\star\gamma) \\
\alpha\star(\beta+\gamma) & = & (\alpha\star\beta)+(\alpha\star\gamma) \\
\alpha\star\beta & = & ([x]\conj{\alpha}(-x))\ast\beta =
  \conj{\alpha}\ast([x]\beta(-x)) \\
\alpha\ast\beta & = & ([x]\conj{\alpha}(-x))\star\beta =
  \conj{\alpha}\star([x]\beta(-x)) \\
(\alpha\ast\beta)' & = & \alpha'\ast\beta \\
(\alpha\ast\beta)' & = & \alpha\ast\beta' \\
\delta\alpha & = & (\alpha(0))\delta \\
([x]\delta(x-A))\ast\alpha & = & [x]\alpha(x-A) \\
([x]\delta(x-A))\star\alpha & = & [x]\alpha(x-A) \\
([x]\delta(x-A))\alpha & = & \alpha(A)\delta
\end{eqnarray}
To calculate Fourier transforms, use the following rules that show how
the transform and other functional operators interact.
%
\begin{eqnarray}
\transf{F}(\alpha+\beta) & = & \transf{F}(\alpha)+\transf{F}(\beta) \\
\transf{F}(\conj{\alpha}) & = & [f]\conj{\transf{F}(\alpha)(-f)} \\
\transf{F}(\cconst{\alpha}\beta) & = & \cconst{\alpha}\transf{F}(\beta)
  \mbox{ for constant }\cconst{\alpha} \\
\transf{F}([t]\beta(t-A)) & = &
  \transf{F}(\beta)([f]\helix(Af)) =
  \transf{F}(\beta)([f]e^{\si 2\pi Af}) \nonumber \\*
  & & \mbox{for real constant }A \\
\transf{F}([t]\beta(At)) & = &
  ([f]\transf{F}(\beta)(f/A))/\abs{A} \mbox{ for real constant }A \\
\transf{F}([t]\beta(-t)) & = & [f]\transf{F}(\beta)(-f) \\
\label{equ:convolution-theorem}
\transf{F}(\alpha\ast\beta) & = & \transf{F}(\alpha)\transf{F}(\beta) \\
\label{equ:conv-thm-conv}
\transf{F}(\alpha\beta) & = & \transf{F}(\alpha)\ast\transf{F}(\beta) \\
\transf{F}(\alpha\star\beta) & = &
  -\conj{[f]\transf{F}(\alpha)(-f)}\transf{F}(\beta) \\
\transf{F}(\alpha([t]e^{\si 2\pi Ft})) & = &
  [f]\transf{F}(\alpha)(f-F) \\
\label{equ:autocorrellation}
\transf{F}(\alpha\star\alpha) & = & \abs{\transf{F}(\alpha)}^2 \\
\transf{F}(\alpha') & = & [f]\ii 2\pi f\transf{F}(\alpha)(f)
\end{eqnarray}

\section{The Meaning of Convolution and Multiplication}

Table~\ref{tab:conv-mult} shows how various signal-processing
operations can be expressed as various multiplications and
convolutions in the time and frequency domains. In each case, the
subscript $t$ indicates a sound signal in the time domain, and the
subscript $f$ indicates its Fourier transform in the frequency domain.
In the first five cases, the correspondence of the multiplicative
formula in one domain to the signal-processing operation is
intuitively clear---the appropriateness of the convolution formula
follows from Equations~\ref{equ:convolution-theorem}
and~\ref{equ:conv-thm-conv}. In the last case (time-shifting), the
convolution formula in the time domain is intuitively clear, and the
multiplication formula in the frequency domain is derived from it.

\begin{table}[bp]
\begin{tabular}{rcc}
Operation & {\bf Time Domain} & {\bf Frequency Domain} \\ \hline \\
Amplitude modulation & $A_t\beta_t$ & $A_f\ast\beta_f$ \\
Amplitude/phase modulation & $\alpha_t\beta_t$ & $\alpha_f\ast\beta_f$ \\
Sampling & $\shah\beta_t$ & $\shah\ast\beta_f$ \\
Filtering & $A_t\ast\beta_t$ & $A_f\beta_f$ \\
Filtering/phase-shifting & $\alpha_t\ast\beta_t$ & $\alpha_f\beta_f$ \\
Time-shifting & $([t]\delta(t-A))\ast\beta_t$ &
  $([f]e^{\si 2\pi Af})\beta_f$
\end{tabular}
\caption{\label{tab:conv-mult} The meanings of convolutions and
multiplications of signals.}
\end{table}

The Fourier transform provides a number of useful insights into the
nature of steady-state sound. Amplitude modulation is seen to
introduce frequency components at the sums and differences of
components in the carrier and the modulator, since it translates from
multiplication in the time domain to convolution in the frequency
domain. Filtering to boost certain frequencies in relation to others
clearly corresponds to multiplication in the frequency domain, so it
may be implemented by convolution in the time domain. Since sampling
in the time domain is essentially multiplication by $\shah$, it has
the effect of convolution with $\shah$ in the frequency domain, and
the replication of frequency components at regular intervals resulting
from convolution with $\shah$ is precisely aliasing. Finally, notice
how Equation~\ref{equ:autocorrellation} tells us that the relative
magnitudes of components in the frequency spectrum
$\transf{F}(\sigma)$ depend only on the statistical correllation of
$\sigma$ with the time-shifted versions $[t]\sigma(t-A)$. This
autocorrellation property is what allows random functions to generate
continuous spreads of frequencies, at the cost of randomizing the
phases of the frequency components.

\chapter{Additive Spectral Synthesis}

\section{Steady-State Sound}

\section{Amplitude Modulation, Enveloping}

\subsection{Enveloping a Multifrequency Sound}

\subsection{Enveloping Individual Spectral Components}

\section{Frequency Modulation}

\chapter{Time-Varying Spectral Analysis}
\label{cha:time-varying-spectrum}

\begin{itemize}
\item Dual use of time---sonic time vs.\ variation time.
\end{itemize}

\section{Shortcomings of the Fourier Transform}

\begin{itemize}
\item Implications for sampling rate.
\end{itemize}

\section{Time-Varying Spectral Analysis with the Continuous Wavelet Transform}

\begin{itemize}
\item Windowing.
\item Window shape.
\item Constant $Q$.
\item Causality.
\item Phase vocorder.
\end{itemize}

\chapter{Synthesis by Resonant Modes}

\begin{itemize}
\item Shortcomings of additive synthesis w.r.t.\ simplicity, transformability.
\end{itemize}

\section{Filters}

In general, a filter is just something that converts an input signal
to an output signal: that is, a function $\filter{T}$ from signals to
signals. Most of filter theory deals with restricted classes of
filters with qualities that are physically reasonable and/or
convenient for analysis. The important restricted classes of filters
include:
%
\begin{description}
%
\item[Time-invariant.] Although we express time as a real-valued
  parameter $t$, it is usually unrealistic to attach any special
  significance to time $t=0$, nor to any other particular value of
  $t$. Normally, only the time \emph{differences} between events have
  perceptual significance in a sound signal. A \emph{time-invariant}
  filter is one whose output at time $t$ does not depend on the
  specific value of $t$, but only on the time difference between $t$
  and various events in the input signal. The output of a
  time-invariant filter normally varies over time, but that variation
  depends entirely on input variation, not on the value of $t$. That
  is, the filter has no internal clock, and no access to information
  other than the input signal. Technically, this means that if the
  input is shifted in time, the output gets shifted in exactly the
  same way.

  Strictly speaking, if we change the behavior of a filter by
  manipulating some knobs, sliders, or other controls, then the filter
  is not time-invariant, since its response to an input signal depends
  on which parts of the signal arrive before our adjustmets and whcih
  parts arrive afterwards. For example, the tuning circuitry on a
  radio is not a time-invariant filter when we turn the dial. But, we
  normally understand such an adjustable filter as a parameterized
  sequence of time-invariant filters corresponding to the different
  control settings. This view works well enough as long as we are
  willing to ignore transient behavior close to the times when the
  controls change.
%
\item[Causal.] A \emph{causal} filter is one whose output at time $t$
  depends only on the input at times $\leq t$. That is, present output
  depends on past and present, but not future, input. In the strictest
  sense, only causal filters are physically realizable when both input
  and output are signals in the time domain. But, it may be convenient
  to view a causal filter as an approximation to a mathematically
  simpler noncausal filter. Also, it is sometimes convenient to
  callibrate input and output times to different $0$ points, in which
  case a filter that is physically causal may be modelled by one that
  appears to look into the future for a limited interval. Causality is
  largely irrelevant when we filter signals that are arranged in space
  rather than time. Noncausal filtering of sound signals in real time
  is impossible, but noncausal filters are perfectly feasible, and
  quite useful, for image processing and off-line sound processing.
%
\item[Finite Impulse Response.] A filter has \emph{finite impulse
  response} if, whenever the input becomes $0$ and remains $0$
  forever, the output eventually becomes and remains $0$.
%
\item[Memoryless.] A filter is \emph{memoryless} if the output at time
  $t$ depends only on the input and its derivatives at time $t$---not
  on the future nor past behavior of the input signal.
%
\item[Pointwise.] A filter is \emph{pointwise} if the output at time
  $t$ depends only on the input \emph{value} at time $t$---no memory
  and no derivatives.
%
\item[Stable.] A \emph{stable} filter is one whose output cannot run
  off to infinity without some sort of infinite input stimulus.
  Physically realizable filters are always stable in some sense, but
  it may be convenient to model a situation where the filter gets
  destroyed as an instability. There are many possible variants of
  stability depending on the precise physical interpretation of
  signals, and on the limitations imposed on inputs. In these notes, I
  require that the output of a stable filter cannot go off to infinity
  unless the input does so. Physically realizable filters often
  satisfy stronger restrictions, such as conservation of energy.
%
\item[Linear.] A \emph{linear} filter is one for which sums and
  scalings of input produce corresponding sums and scalings of
  outputs. Linear filters are \emph{much} easier to analyze than
  general nonlinear filters, since we may break the input up into
  helical components, filter each component, and reconstruct the
  output from the filtered components. We have a thorough mathematical
  explanation of linear filters, and very little mathematical
  information about nonlinear ones. Physically realizable filters are
  never precisely linear, but they are often very nearly linear within
  reasonable limits on the input signal. In such cases, it is
  \emph{very} helpful to consider the linear approximation to a
  filter, instead of a more exact but less analyzable form.
%
\end{description}
%
Each restrictions on filters may be expressed precisely as a
mathematical property of a filter $\filter{T}$:
%
\begin{description}
%
\item[Time-invariant:]
  $\filter{T}([s]\sigma(s+A))(t)=\filter{T}(\sigma)(t+A)$
  for all signals $\sigma$, real constants $A$.
%
\item[Causal:]
  $\filter{T}(\sigma)(t)=\filter{T}(\sigma[s]H(t-s))(t)$.
%
\item[Finite Impulse Response:] For all signals $\sigma$, real
  constants $A$, there is a real constant $B$ such that
  $\filter{T}([s](H(s-A))*\sigma)(t)=0$ for all $t>B$.
%
\item[Memoryless:] $\filter{T}(\sigma)(t)=
    f(\sigma(t),\frac{d}{dt}\sigma(t),\frac{d^2}{dt^2}\sigma(t),\ldots)$
  for some function $f$.
%
\item[Pointwise:] $\filter{T}(\sigma)(t)=f(\sigma(t))$ for some function $f$.
%
\item[Stable:] For all signals $\sigma$, if there is a real constant
  $A$ such that $\abs{\sigma(t)}<A$ for all $t$, then there is another
  real constant $B$ such that $\abs{\filter{T}(\sigma)(t)}<B$ for all
  $t$.
%
\item[Linear:]
  $\filter{T}(\cconst{\gamma}\sigma+\cconst{\delta}\beta)=
    \cconst{\gamma}\filter{T}(\sigma)+\cconst{\delta}\filter{T}(\beta)$
  for all time signals $\sigma$ and $\beta$, complex constants
  $\cconst{\gamma}$, $\cconst{\delta}$.
%
\end{description}
%
In this section, I review the theory of linear, time-invariant
filters. Nonlinear filters are very important in sound modelling, but
I cannot find a general theory for them.

\subsection{Unit Resonances and Antiresonances}

Just as sound signals may be decomposed into simple helices, linear
time-invariant filters may be composed into simple resonances, which
boost signals near a certain frequency, and antiresonances, which
suppress signals near a certain frequency.

\subsubsection{Unit Resonances}

The \emph{unit resonance} filter is essentially the ideal spring
system of Section~\ref{sec:springs}, with an idealized frictional
force opposing the spring tension. The tendency of a spring to vibrate
at a particular frequency produces the desired resonance---in fact a
frictionless spring, once moved away from $0$, vibrates forever at its
natural frequency. Friction is needed to keep the system stable when
it is stimulated by an input signal. Because of the form of analysis
described in Section~\ref{sec:laplace}, this type of basic filter is
also called a \emph{single-pole filter}. Since current behavior
depends only on present and past behavior, it is causal. But, it is
neither memoryless, nor finite-impulse response.

Recall Equation~\ref{equ:rotordiff} from
Section~\ref{sec:complex-arith}, the complex-number form of the
differential equation for a rotor:
%
\begin{eqnarray*}
  \rho' & = & \ii A\rho
\end{eqnarray*}
%
The real constant $A$ determines the frequency of the rotor;
$\ii A\rho$ is a vector perpendicular to $\rho$ with length scaled by
$A$, so $\ii A\rho$ represents a motion around a circle. Idealized
friction causes the rotor to decay toward $0$, by adding a
displacement in the opposite direction to $\rho$, and of proportional
length. So, the differential equation for a decaying rotor is
%
\begin{eqnarray}
\label{equ:rotordecdiffba}
  \rho' & = & (B+\ii A)\rho
\end{eqnarray}
%
with $B<0$. If $B=0$, then this reduces to the frictionless rotor. For
$B>0$, we get a rotor with antifriction, which is rather
explosive. We might as well take full advantage of complex arithmetic,
and rewrite Equation~\ref{equ:rotordecdiffba} as
%
\begin{eqnarray}
\label{equ:rotordecdiff}
  \rho' & = & \cconst{\alpha}\rho
\end{eqnarray}
%
where $\Re(\cconst{\alpha})<0$ and $\Im(\cconst{\alpha})>0$.

With initial condition $\rho(0)=1$, the solution to
Equation~\ref{equ:rotordecdiff} is $\rho(t)=e^{\cconst{\alpha} t}$, just as if
$\cconst{\alpha}$ were real. Using Euler's formula (Equation~\ref{equ:euler}),
this is
$$\rho(t)=e^{\cconst{\alpha}t}=
  e^{\Re(\cconst{\alpha})t}e^{\si\Im(\cconst{\alpha})t}=
  e^{\Re(\cconst{\alpha})t}
    (\cos(\Im(\cconst{\alpha})t)+\ii\sin(\Im(\cconst{\alpha})t))$$
So, the behavior of a decaying rotor, when released at $\rho(0)=1$, is
to follow a spiral that turns at $\Im(\cconst{\alpha})/(2\pi)$ cycles
per unit time, and decays exponentially with amplitude
$e^{\Re(\cconst{\alpha})t}$ at time $t$.

To use a decaying rotor as a unit resonance, let the input to
the filter contribute an additional component to the rotor derivative,
and take the output from the rotor state. Let
$\filter{R}_{\cconst{\alpha}}$ be the filter constructed from the
decaying rotor with constant $\cconst{\alpha}$. Then
$\filter{R}_{\cconst{\alpha}}(\sigma)=\rho$, where
%
\begin{eqnarray}
\label{equ:rotorfilter}
  \rho' & = & \sigma+\cconst{\alpha}\rho
\end{eqnarray}
%
This filter resonates to frequencies near
$\Im(\cconst{\alpha})/(2\pi)$, and the strength of the resonance
decreases as $\Re(\cconst{\alpha})$ decreases.

First, consider the behavior of $\filter{R}_{\cconst{\alpha}}$ when
the input is a unit helix at the resonant frequency, that is
$\filter{R}_{\cconst{\alpha}}(\sigma)$ where
$\sigma(t)=e^{\si\Im(\cconst{\alpha})t}$. Assume that the input
continues for all time, so that the rotor configuration reaches some
sort of steady state. In that steady state, the rotor state and the
input have exactly the same angle (they are precisely in phase), and
the input cancels the decay component of the rotor equation, so that
the rotor state is also a helix at frequency
$\Im(\cconst{\alpha})/(2\pi)$, but with possibly a different amplitude
than the input. Look at Figure (see Maple manuscript) for the
relationship between the input and rotor state.

The picture in Figure (Maple manuscript) is probably the best tool for
understanding the essential quality of the rotor filter with input at
its resonant frequency. But, if you really want to work through the
mathematics (which is elementary, but icky), here it is.  Since $\rho$
has the same frequency and phase as the input, but some unknown
amplitude $A$, we may write $\rho(t)=Ae^{\si\Im(\cconst{\alpha})t}$,
which implies that
$$A\ii\Im(\cconst{\alpha})e^{\si\Im(\cconst{\alpha})t}=\rho'=
  \sigma+\cconst{\alpha}\rho=
  e^{\si\Im(\cconst{\alpha})t}+\cconst{\alpha}Ae^{\si\Im(\cconst{\alpha})t}=
  (1+A\cconst{\alpha})e^{\si\Im(\cconst{\alpha})t}$$
Comparing the first and last terms above, we find that
$\ii A\Im(\cconst{\alpha})=(1+A\cconst{\alpha})$, so
$$0=\Re(\ii A\Im(\cconst{\alpha}))=\Re(1+A\cconst{\alpha})=
  1+A\Re(\cconst{\alpha})$$
so $A=-1/\Re(\cconst{\alpha})$. $A$ is undefined when
$\Re(\cconst{\alpha})=0$. In fact, the only way to have a steady state
with $\Re(\cconst{\alpha})=0$ is to make $\sigma=0$. The derivation
yields a steady state when $\Re(\cconst{\alpha})>0$. This steady state
is correct in a sense. It corresponds to input exactly $\frac{1}{2}$ a
cycle out of phase with the rotor state. But, this configuration is
highly unstable: the least deviation from opposite phase will send the
rotor state off to infinity. By contrast, when
$\Re(\cconst{\alpha})<0$ every initial state heads asymptotically
toward the steady state derived above.

Now, consider an input unit helix at an arbitrary frequency,
$\sigma(t)=e^{\si Bt}$. The picture of the steady state is a bit more
complicated than before. The frequency of the output must still be the
same as the frequency of the input, so that the rotor state and input
remain at the same angle. If the input frequency is greater than the
resonant frequency, then the phase of the input advances ahead of the
phase of the rotor state, so that part of the input contributes to
moving the rotor at a higher frequency than the resonance. If the
input frequency is less than the resonant frequency, then the phase of
the input falls behind the phase of the rotor state, so that part of
the input retards the rotor. As a result, there is less of the input
available to cancel the decay, and the output has a smaller amplitude
than it does at the resonant frequency. Look at Figures (Maple
manuscript) to see the relationship between the input and rotor state
when the input frequency is greater (respectively, less) than the resonant
frequency.

If you really want to see the mathematical derivation, let the output
be $\rho(t)=Ae^{\si(Bt+C)}$. Then
$$A\ii Be^{\si(Bt+C)}=\rho'=
  \sigma+\cconst{\alpha}\rho=
  e^{\si Bt}+\cconst{\alpha}Ae^{\si(Bt+C)}=
  (e^{-\si C}+A\cconst{\alpha})e^{\si(Bt+C)}$$
Comparing the first and last terms,
$\ii AB=e^{-\si C}+A\cconst{\alpha}$, so
$$0=\Re(\ii AB)=\Re(e^{-si C}+A\cconst{\alpha})=
  =-\cos(C)+A\Re(\cconst{\alpha})$$
and
$$AB=\Im(\ii AB)=\Im(e^{\si C}+A\cconst{\alpha})=
  -\sin(C)+A\Im(\cconst{\alpha})$$ From the real part,
$\cos(C)=A\Re(\cconst{\alpha})$. From the imaginary part,
$\sin(C)=A(\Im(\cconst{\alpha})-B)$, so
$C=\arctan((\Im(\cconst{\alpha})-B)/\Re(\cconst{\alpha}))$ and
$A=\cos(C)/\Re(\cconst{\alpha})=\sin(C)/(\Im(\cconst{\alpha})-B)=
  1/\abs{\cconst{\alpha}-\ii B}$.
It is easy to see that $A$ is maximum, and $C$ is $0$, when
$B=\Im(\cconst{\alpha})$. Notice also that the amplitude $A$ falls off
more rapidly the closer that $\Re(\cconst{\alpha})$ gets to $0$. When
$B>\Im(\cconst{\alpha})$, then $C<0$, which means that the input phase
is ahead of the output. When $B<\Im(\cconst{\alpha})$, then $C>0$, so
the input phase is behind the output. When $\Re(\cconst{\alpha})=0$,
and when $\Re(\cconst{\alpha})>0$, behavior is analogous to the case
where input frequency equals resonant frequency: we get no steady
state unless the input is $0$, and a steady but unstable case with
input nearly $\frac{1}{2}$ cycle out of phase with rotor state,
respectively.

One mathematical complication has been swept under the rug. If you
didn't notice it, don't worry about. Just in case you did: the
steady-state output derived above is not the only solution satisfying
the differential equation for the unit resonance. There are an
infinite number of solutions. We may select any complex number value
for $\filter{R}_{\cconst{\alpha}}(\sigma)(0)$, and there is a unique
solution with that $0$ value. The solution that makes physical sense
is the only one in which
$\lim_{t\rightarrow-\infty}\filter{R}_{\cconst{\alpha}}(\sigma)(t)$
is bounded. Intuitively, this corresponds to setting the filter state
to $0$ at time $-\infty$.

Even though you skipped the mathematical derivations, you should
review the following crucial qualitative observations.
%
\begin{itemize}
%
\item $\filter{R}_{\cconst{\alpha}}$ has a resonant frequency at
  $\Im(\cconst{\alpha})/(2\pi)$.
%
\item Set to some nonzero initial state, and left to itself, the
  decaying rotor with constant $\cconst{\alpha}$ spins at the resonant
  frequency, and its amplitude decays in the form of
  $e^{\si\Re(\cconst{\alpha})t}$.
%
\item When $\Re(\cconst{\alpha})<0$, the rotor behavior is stable.
%
\item When $\Re(\cconst{\alpha})=0$ and the resonant frequency is not
  $0$, the rotor behavior is unstable for all nonzero helical
  inputs. On zero input, it rotates at the resonant frequency without
  decay. The peculiar special case where $\cconst{\alpha}=0$ is stable
  for helical inputs with nonzero frequency.
%
\item When $\Re(\cconst{\alpha})>0$, the rotor behavior has an
  unstable equilibrium where the input is approximately opposite to
  the rotor state, and cancels the increase. Any deviation from that
  unstable equilibrium sends the rotor off to infinity.
%
\item For helical inputs, output amplitude is highest when the input
  frequency equals the resonant frequency.
%
\item Input is in phase with output at the resonant frequency. At
  higher frequencies, input phase advances ahead of output phase, at
  lower frequencies it retards.
%
\item Larger values of $\Re(\cconst{\alpha})$ (i.e., closer to $0$,
  since they should be negative) give higher amplitude of
  output, and also a sharper peak at the resonant frequency.
%
\end{itemize}
%
Although a signal at frequency $0$ is not perceptible as sound, a
filter with resonance at frequency $0$ may be quite
useful. $\filter{R}_D$, where $D$ is a real constant, is a
\emph{low-pass} filter---it boosts low frequencies and suppresses high
frequencies, and $D$ determines how steeply its response rolls of with
increasing frequency. $\filter{R}_0$ just integrates the input, which
is the simplest form of linear lowpass filtering. Negative-frequency
filters also act as low-pass when the input has only positive
frequencies. When working with real-valued signals, instead of
complex-valued, each resonance at frequency $F$ is usually paired with
another at frequency $-F$. Even a resonance at infinite frequency
makes sense---it is a \emph{high-pass} filter, which boosts high
frequencies and suppresses low frequencies---but resonance at infinite
frequency cannot be implemented with a decaying rotor. For now, just
think of it as the limiting filter as frequency goes infinite.

It is not difficult to show that every unit resonance filter
$\filter{R}_{\cconst{\alpha}}$ is linear and time-invariant. So, the
steady-state behavior of a resonance is entirely determined by its
response to unit helices. The amplitude $A$ of the output to a unit
helix at frequency $F$ is called the \emph{gain} at $F$, and the phase
shift $C$ is called the \emph{phase delay} at $F$. Many people
abbreviate this to ``delay,'' but that practice is hazardous, since
there is another completely different sort of delay in filter
behavior. Section~\ref{sec:laplace} shows how to analyze and describe
the entire behavior of a linear time-invariant filter, including
transient behavior as well as steady state.

Resonances have very interesting transition effects, in addition to
their steady-state behavior. When a frequency component starts or
increases in the input, the output component at that frequency
responds gradually, and approaches its steady-state value
asymptotically. When an input component reduces or disappears, the
resonance continues to vibrate, or ``ring'', at the resonant
frequency.  Higher values for $\Re(\cconst{\alpha})$ lead to slower
response when a component increases, and quicker decay of the ringing.
When filters are used in sound reproduction systems and conventional
signal-processing applications, these transient effects are often
considered undesirable, and extensive design effort is expended to
minimize them. But, for sound synthesis, transient effects, and
particularly ringing, are often the main point of using a filter.

Just as all sound signals may be expressed as a combination of unit
helices (perhaps infinitely many) scaled by complex constants, all
causal linear time-invariant filters may be expressed as a combination
of scaled resonances filters (noncausal filters require time-reversed
versions of the resonances). But, many natural and useful filters may
be expressed more compactly and more insightfully if we also alo
antiresonances, suppressing a given frequency. A particular filter may
be expressible by a discrete or even finite combination of resonances
and antiresonances, while requiring the integration of a continuous
spread of resonances. So, the next subsection investigates
antiresonances.

\subsubsection{Unit Antiresonances}

In terms of input/output behavior, a unit antiresonance filter is the
inverse of a unit resonance. Unfortunately, the rotor mechanism does
not explain antiresonances. Notice that, since the input to a decaying
rotor is added to the derivative of the rotor state, the effect of the
rotor is to \emph{integrate} input, scaled in some way determined by
the resonance, over time. Antiresonance depends on
\emph{differentiating} the input. A unit antiresonance filter is
causal, finite-impulse-response, and even memoryless (output at time
$t$ depends only on the input and its derivative at time $t$), but not
pointwise. Because of the form of analysis in
Section~\ref{sec:laplace}, a unit antiresonance is usually called a
\emph{single-zero} filter.

Recall the derivative of a unit helix:
$\frac{d}{dt}e^{\si Bt}=\ii Be^{\si Bt}$. That is, the derivative of a
unit helix is another helix at the same frequency, but advanced in
phase by $\frac{1}{4}$ cycle (from the multiplication by $\ii$), and
scaled by $B$ (frequency times $2\pi$). The crucial quality of the
derivative is the scaling by $B$, since this gives a mathematical tool
for suppressing a particular frequency. We must work around the phase
advance, but it is uniform for all frequencies. So, the general form
for an antiresonance is to cancel a scaled and rotated version of the
input signal against the derivative of the input.

Let $\filter{A}_{\cconst{\alpha}}$ be the unit antiresonance with
antiresonance characterized by $\alpha$. Let $\sigma$ be an input, and
let $\rho=\filter{A}_{\cconst{\alpha}}(\sigma)$ be the corresponding
output, defined by
%
\begin{eqnarray}
\label{equ:derivfilter}
  \rho & = & \sigma'-\cconst{\alpha}\sigma
\end{eqnarray}
%
This is the inverse of the resonance filter behavior in
Equation~\ref{equ:rotorfilter}. Solving for $\sigma'$ yields
%
\begin{eqnarray}
\label{equ:derivfilterinv}
  \sigma' & = & \rho+\cconst{\alpha}\sigma
\end{eqnarray}
%
which has exactly the form of Equation~\ref{equ:rotorfilter}, with the
roles of $\sigma$ and $\rho$ reversed. So, the output
$\filter{A}_{\cconst{\alpha}}(\sigma)$ from an antiresonance filter,
when presented as input to the corresponding resonance filter
$\filter{R}_{\cconst{\alpha}}$, yields the original input $\sigma$,
and the same thing happens when the output of the resonance is
supplied as input to the antiresonance.
%
\begin{eqnarray}
 \filter{R}_{\cconst{\alpha}}(\filter{A}_{\cconst{\alpha}}(\sigma))
    & = & \sigma
\end{eqnarray}
%
%
\begin{eqnarray}
 \filter{A}_{\cconst{\alpha}}(\filter{R}_{\cconst{\alpha}}(\sigma))
    & = & \sigma
\end{eqnarray}

Consider the behavior of $\filter{A}_{\cconst{\alpha}}$ on a unit
helix $e^{\si Bt}$. Since $\cconst{\alpha}\sigma$ and $\sigma'$ have
the same frequency as $\sigma$, the output is certainly of the form
$\rho=Ae^{\si(Bt+C)}$, and we need to determine the amplitude $A$ and
phase shift $C$. $\ii\Im(\cconst{\alpha})\sigma$ is a vector in the
same direction as $\sigma'$, so by subtracting this from $\sigma'$ we
cancel entirely when $B=\Im(\cconst{\alpha})$, and nearly cancel when
$B$ is close to the antiresonant frequency. We are also left with
$-\Re(\cconst{\alpha})\sigma$, which is just a scaling of the
input. Look at Figures (Maple manuscript) to see how this works.

Here's the mathematical derivation, but you may skip it as before.
$$Ae^{\si C}e^{\si Bt}=Ae^{\si(Bt+C)}=\rho=\sigma'-\cconst{\alpha}\sigma=
  \ii Be^{\si Bt}-\cconst{\alpha}e^{\si Bt}=
  (\ii B-\cconst{\alpha})e^{\si Bt}$$
Comparing the first and last terms, $Ae^{\si C}=\ii
B-\cconst{\alpha}$, so
$$A\cos(C)=\Re(Ae^{\si C})=\Re(\ii B-\cconst{\alpha})=
  \Re(\cconst{\alpha})$$ and
$$A\sin(C)=\Im(Ae^{\si C})=\Im(\ii B-\cconst{\alpha})=
  B-\Im(\cconst{\alpha})$$
From the real part, $\cos(C)=\Re(\cconst{\alpha})/A$. From the
imaginary part, $\sin(C)=(B-\Im(\cconst{\alpha}))/A$, so
$C=\arctan((B-\Im(\cconst{\alpha}))/\Re(\cconst{\alpha})$ and
$A=\Re(\cconst{\alpha})/\cos(C)=(B-\Im(\cconst{\alpha}))/\sin(C)=
  \abs{\cconst{\alpha}-\ii B}$.
It is easy to see that $A$ is minimum, and $C$ is $0$, when
$B=\Im(\cconst{\alpha})$. The amplitude $A$ increases more rapidly the
closer that $\Re(\cconst{\alpha})$ gets to $0$. When
$\Re(\cconst{\alpha})=0$, the antiresonant frequency
$\Im(\cconst{\alpha})/(2\pi)$ is completely suppressed. When
$B>\Im(\cconst{\alpha})$, then $C>0$, so the phase of the output lags
behind the input. When $B<\Im(\cconst{\alpha})$, then $C<0$, so the
phase of the output advances ahead of the input. This is the opposite
of the phase behavior for a resonance. Unlike a resonant filter, an
antiresonant filter is stable even when $\Re(\cconst{\alpha})=0$ and
$\Re(\cconst{\alpha})>0$. When the real part is $0$, we get complete
suppression of the antiresonant frequency. When the real part is
positive, we get the same output amplitude as if the real part were
negated, but the opposite phase.

Even though you skipped the mathematical derivations, please review
these critical qualitative observations.
%
\begin{itemize}
%
\item $\filter{A}_{\cconst{\alpha}}$ has an antiresonant frequency
  at $\Im(\cconst{\alpha})/(2\pi)$.
%
\item $\filter{A}_{\cconst{\alpha}}$ has no internal state, so it
  does not produce sonically interesting output when released in some
  initial state. Section~\ref{sec:laplace} shows the mathematical
  analogue to the decay of the rotor.
%
\item $\filter{A}_{\cconst{\alpha}}$ behaves stably for all values of
  $\alpha$.
%
\item When $\Re(\cconst{\alpha})=0$, the antiresonant frequency is
  eliminated entirely.
%
\item As $\abs{\Re(\cconst{\alpha})}$ grows, the antiresonant
  frequency is reduced less and less, and the increase in amplitude
  away from the antiresonant frequency is less sharp.
%
\item Negating $\Re(\cconst{\alpha})$ corresponds to reversing the
  phase of the output.
%
\item When $\Re(\cconst{\alpha})<0$, frequencies higher than the
  antiresonant frequency are retarded in phase, lower frequencies are
  advanced.
\end{itemize}
%
?????????????
An antiresonance at frequency $0$ is the same as a resonance at
$\infty$: that is, $\filter{A}_{D}$ for real constant $D$ is a high-pass
filter. In particular, $\filter{A}_0$ just differentiates its input,
which is the simplest form of linear highpass filtering. An
antiresonance at $\infty$ makes perfect sense, but it is usually more
convenient to represent it as a resonance at $0$. ARE THESE REALLY THE
SAME?????? I THINK NOT.

The antiresonances, like the resonances, are linear and
time-invariant. So, their steady-state behaviors are determined
entirely by their responses to unit helices. As with resonances, the
amplitude $A$ of the output to a unit helix at frequency $F$ is the
gain, and the shift $C$ is the phase delay at $F$. Antiresonant
filters are memoryless, and present output depends only on the value
and derivative of present input, so they do not create the transient
effects, particularly ringing, of resonant filters. But, because of
the dependence on input derivative, antiresonances react in
interesting ways to transients in the input. A sudden change in a
frequency component of the input boosts the amplitude of that
component in the output. Higher values of $\Re(\cconst{\alpha})$ lead
to stronger reactions to changes in the input.

\subsubsection{Identity, Constant Output, Time Shifting}

There are a few linear time-invariant behaviors that do not develop
naturally from resonances and antiresonances. But, they are quite
simple to understand directly. The \emph{identity} filter produces
output exactly the same as its input.
%
\begin{eqnarray}
  \filter{I}(\sigma) & = & \sigma
\end{eqnarray}
%
$\filter{I}$ is neither a resonance nor an antiresonance, but when we
consider combinations of resonances and antiresonances, it is the
natural starting point---the combination of $0$ resonances and $0$
antiresonances. The constant output filters produce the same output
no matter what the input.
%
\begin{eqnarray}
  \filter{C}_{\cconst{\alpha}}(\sigma) & = & \cconst{\alpha}
\end{eqnarray}
%
Notice that the output must be a complex constant, independent of
time, else the filter would not be time-invariant.

The remaining time-invariant filtering behavior that does not appear
to develop naturally from resonances and antiresonances is time
shifting.
%
\begin{eqnarray}
  \filter{S}_{E}(\sigma)(t) & = & \sigma(t-E)
\end{eqnarray}
%
When $E>0$, the time shifter $\filter{S}_{E}$ is causal, and
represents a delay. The delay induced by $\filter{S}_{E}$ is called
\emph{pulse delay}. It is totally different from the phase delay
induced by resonances and antiresonances.

\subsubsection{Combining Filters in Sequence and In Parallel}

Given the unit resonances, unit antiresonances, identity, constants,
and time shifters, a lot of interesting filter behaviors may be
constructed by the intuitive equivalent of scaling the filters and
wiring them together. The \emph{scaling} of a filter $\filter{T}$ by
$\cconst{\alpha}$ multiplies the output by $\cconst{\alpha}$. When the
filter is linear and time-invariant, this has the same result as
multiplying the input by $\cconst{\alpha}$.
%
\begin{eqnarray}
  (\cconst{\alpha}\filter{T})(\sigma) & = &
    \cconst{\alpha}(\filter{T}(\sigma)) \\
  (\filter{T}\cconst{\alpha})(\sigma) & = &
    \filter{T}(\cconst{\alpha}\sigma) \\
  \cconst{\alpha}\filter{T} & = & \filter{T}\cconst{\alpha}
    \mbox{ when }\filter{T}\mbox{ is linear time-invariant}
\end{eqnarray}
%
The \emph{sum} of two filters $\filter{T}_1$ and
$\filter{T}_2$ corresponds to splitting the input signal, feeding it
separately to $\filter{T_1}$ and $\filter{T}_2$, and adding the two
outputs.
%
\begin{eqnarray}
  (\filter{T}_1+\filter{T}_2)(\sigma) & = &
    \filter{T}_1(\sigma)+\filter{T}_2(\sigma)
\end{eqnarray}
%
The \emph{composition}, or \emph{cascade}, of $\filter{T}_1$ and
$\filter{T}_2$ corresponds to feeding the input to $\filter{T}_1$,
then taking that output and feeding it to $\filter{T}_2$. When both
filters are linear and time-invariant, the order makes no difference.
%
\begin{eqnarray}
  (\filter{T}_1\compose\filter{T}_2)(\sigma) & = &
    \filter{T}_1(\filter{T}_2(\sigma)) \\
  \filter{T}_1\compose\filter{T}_2 & = &
    \filter{T}_2\compose\filter{T}_1
    \mbox{ when }\filter{T}_1,\filter{T}_2\mbox{ are linear time-invariant}
\end{eqnarray}

With suitable notions of infinitary sum and composition, all causal
linear time-invariant filters may be constructed from scaled versions
of resonances, antiresonances, identity, constant, and time shifters
$\filter{S}_{E}$ with $E>0$. Along with the negative time shifters,
they suffice to construct all linear time-invariant filters, although
the forms of filters that depend on the indefinite future are rather
peculiar, and more natural presentations require time-reversed
versions of the resonances. Section~\ref{sec:representing-filters}
explores some of the special forms for representing filters using
various operations on the the basic filters.

\subsection{Forms for Representing Filters}

There are a number of different ways to represent linear
time-invariant filters, each of them useful in different ways. The
best representation for the purpose of implementation is not always
the best for understanding and analysis.

\subsubsection{Impulse Response and the Convolution Form}

The \emph{impulse response} for a filter $\filter{T}$ is
its output whent the input is an infinitely sharp pulse at time $0$,
that is $\filter{T}(\delta)$. Since the only nonzero input is at time
$0$, $\filter{T}(\delta)(t)$ gives the contribution of an input at
time $0$ to the output at time $t$. For a time-invariant filter, this
is the same as the contribution of input at time $s$ to output at time
$s+t$, for all $s$. For a linear filter, the contribution of an input
point to an output point is proportional to the input value, and the
output at a given time is the sum of the contributions from other
times. So, $\filter{T}(\delta)$ completely determines the behavior of
$\filter{T}$. At first, it may seem surprising that a single input
determines the behavior for the infinitely many different unit
helices, but recall that $\transf{F}(\delta)=\constfun{1}$, so
$\delta$ contains helical components at every frequency.

The description of $\filter{T}(\sigma)$ in terms of the impulse
response is particularly simple.
%
\begin{eqnarray}
  \filter{T}(\sigma) & = & \filter{T}(\delta)\ast\sigma
\end{eqnarray}
%
This is the \emph{convolution form} of the filter $\filter{T}$. Recall
that
$(\filter{T}(\delta)\ast\sigma)(t)=
\int_{-\infty}^{\infty}\filter{T}(\delta)(s)\sigma(t-s)ds$,
so the convolution is just the integration of the contributions of
each $\sigma(t-s)$ to $\sigma(t)$. This is particularly clear
mathematically, but it may be very expensive to compute, particularly
if the impulse response goes to $0$ slowly.

Interesting properties of a linear time-invariant filter $\filter{T}$
may be expressed in terms of the impulse response.
%
\begin{description}
%
\item[Causal:]
  $\filter{T}(\delta)(t)=0$ for all $t<0$.
%
\item[Finite Impulse Response:] There is some real constant $A$ such
  that $\filter{T}(\delta)(t)=0$ for all $t>A$.
%
\item[Memoryless:] $\filter{T}(\delta)(t)=0$ for all $t\neq 0$.
%
\item[Pointwise:] $\filter{T}(\delta)=\cconst{\alpha}\delta$ for some
  complex constant $\cconst{\alpha}$.
%
\item[Stable:] $\lim_{t\rightarrow\infty}\filter{T}(\delta)(t)=0$.
%
\end{description}

Since convolution is a form of weighted integration, it is surprising
at first that the convolution form can express filters, such as the
antiresonances $\filter{A}_{\cconst{\alpha}}$, that depend on
derivatives of the input. But, $\delta'$, the derivative of the Dirac
impulse, has the strange property that it produces derivatives of
other functions by convolution.
%
\begin{eqnarray}
  \delta'\ast\sigma & = & \sigma'
\end{eqnarray}
%
$\delta'(t)=0$ for all real numbers $t$, but $\delta'$ shoots up to
$\infty$ infinitesimally before $0$, and down to $-\infty$
infinitesimally after $0$, so by convolution it produces the slope of
an infinitesimal region about each point in the signal
$\sigma$. The convolution form suggests another interesting restricted
class of filters: a filter is \emph{convolutional} if its
impulse response is a normal function from reals to complex numbers,
rather than a generalized function.

If we object to convolution with generalized functions, then we may
express all filters as sums of convolutions with derivatives
%
\begin{eqnarray}
  \filter{T}(\sigma) & = &
    \sum_{k=0}^{\infty}\iota_k\ast\frac{d^k}{dt^k}\sigma
\end{eqnarray}
%
This form is not unique---there are many different sequences
$\iota_0,\iota_1,\ldots$ representing the same filter
$\filter{T}$---but all of the $\iota_k$s may be normal functions from
reals to complex numbers. Perhaps it is more interesting to let the
$\iota_k$s be sums of normal functions and time-shifted Diracs
$[t]\delta(t-A)$, so that they combine integrals and discrete
values of their respective derivatives.

\subsubsection{Differential Equation Forms}

The unit resonances were all defined by ordinary differential
equations. This representation generalizes to define all linear
time-invariant filters in the form of ODEs of any chosen order $k$.
That is, $\filter{T}(\sigma)=\rho$, where $\rho$ is the solution to
the differential equation of the form
%
\begin{eqnarray}
  \frac{d^k}{dt^k}\rho & = & \iota\ast\sigma+\eta\ast\rho
\end{eqnarray}
%
Appropriate initial conditions are required to select the right one of
the infinitely many solutions. For causal filters, the right idea is
to initialize $\rho$ and all of its derivatives to $0$ at time
$-\infty$; formally we require that
$\lim_{t\rightarrow-\infty}frac{d^l}{dt^l}\rho$ is bounded for all $l$.
For noncausal filters, things are more complicated. Essentially, a
noncausal filter is the sum of a causal component that starts in value
$0$ at time $-\infty$ and an anticausal component that ends in value
$0$ at time $\infty$.

With $k=0$, $\iota=\filter{T}(\delta)$ and $\eta=\constfun{0}$, the
ODE form is the same as the convolution form. For every filter
$\filter{T}$ and positive integer $k$, we can get a $k$th order ODE by
letting
$\iota=\frac{d^k}{dt^k}\filter{T}(\delta)=\filter{T}(\frac{d^k}{dt^k}\delta)$.
But, the ODE form is useful when $\iota$ and $\eta$ are both
nonzero, and both simpler or easier to compute with than all of the
$\filter{T}(\frac{d^k}{dt^k}\delta)$ functions.

Ideally, $\alpha$ and $\beta$ may be finite sums of impulses and
derivatives of impulses. For example,
$\filter{R}_{\cconst{\alpha}}(\delta)=H[t]e^{-\alpha t}$ is an
infinitely long, continuous impulse response, but
$\filter{R}_{\cconst{\alpha}}$ may also be presented as a first-order
ODE with $\iota=\constfun{1}$ and $\eta=\cconst{\alpha}$. The
composition
$\filter{R}_{\cconst{\alpha}_1}\circ\cdots\circ\filter{R}_{\cconst{\alpha}_k}$
of $k$ unit resonances may be defined by a finite $k$th order ODE. For
example,
$\filter{R}_{\cconst{\alpha}_1}\circ\filter{R}_{\cconst{\alpha}_2}$
has the $2$nd order ODE
%
\begin{eqnarray}
  \frac{d^2}{dt^2}\rho & = &
    \sigma+(\cconst{\alpha}_1+\cconst{\alpha}_2)\frac{d}{dt}\rho-
      \cconst{\alpha}_1\cconst{\alpha}_2\rho
\end{eqnarray}
%
with $k=2$, $\iota=\constfun{1}$,
$\eta=(\cconst{\alpha}_1+\cconst{\alpha}_2)\delta'-
  \cconst{\alpha}_1\cconst{\alpha}_2$.

Interesting properties of the filter $\filter{T}$ may be expressed in
terms of the parameters of an ODE form for $\filter{T}$.
%
\begin{description}
%
\item[Causal:]
  $\eta(t)=0$ for all $t>0$.
%
\item[Finite Impulse Response:] $\eta=\constfun{0}$.
%
\item[Memoryless:] $\eta=\constfun{0}$ and
  $\iota=\sum_{l=0}^{\infty}\cconst{\alpha}_l\frac{d^l}{dt^l}\delta$,
  for some infinite sequence
  $\cconst{\alpha}_0,\cconst{\alpha}_1,\ldots$ of complex constants.
%
\item[Pointwise:] $\eta=\constfun{0}$ and
  $\iota=\sum_{l=0}^{\infty}\cconst{\alpha}_l\delta$,
  for some infinite sequence
  $\cconst{\alpha}_0,\cconst{\alpha}_1,\ldots$ of complex constants.
%
\item[Stable:] $\abs{\int_{-\infty}^{\infty}\eta}<1$ ????? 
   Or is it $\int_{-\infty}^{\infty}\abs{\eta}<1$ ?????
%
\end{description}

Every differential equation of order $k$ may be reduced to lower
orders $k-1,k-2,\ldots 1$ by introducing extra variables. When
defining a filter, these extra variables may be understood as the
internal state of the filter. For example,
$\filter{R}_{\cconst{\alpha}_1}\circ\filter{R}_{\cconst{\alpha}_2}$,
which was represented by a $2$nd order ODE above, may also be
represented by a $1$st order ODE to be solved simultaneously for $2$
variables. That is,
$\filter{R}_{\cconst{\alpha}_1}\circ\filter{R}_{\cconst{\alpha}_2}(\sigma)=\rho_1$,
where $\rho_1,\rho_2$ are simultaneous solutions to the ODE
%
\begin{eqnarray}
  \rho_1' & = & \rho_2+\cconst{\alpha}_1\rho_1 \\
  \rho_2' & = & \sigma+\cconst{\alpha}_2\rho_2 \nonumber
\end{eqnarray}
%
Multivariate low-order (usually $1$st order) ODEs often provide the
most direct understanding of a filter as a device with input, internal
state that develops as a function of input and previous state, and
output derived from the internal state. For example, a multirotor
system may be described by an ODE with one complex variable keeping
the state of each rotor. The final output may be given as a linear
combination of the rotor states.

\subsection{Analyzing Filters with the Laplace Transform}
\label{sec:laplace}

Just as we analyze every sound signal in terms of unit helices, we may
analyze every linear time-invariant filter in terms of unit
resonances. Information about antiresonances comes as a byproduct of
the analysis of resonances. In fact, the Fourier transform of the
impulse response already provides enough information to characterize
the behavior of a filter, but that information is revealed more
conveniently by an extension of the Fourier transform, called the
\emph{Laplace transform}. The Laplace transform of a signal $\sigma$
is
%
\begin{eqnarray}
\label{equ:laplace-transform}
  \transf{L}(\sigma)(\cconst{\phi}) & = &
  \int_{-\infty}^{\infty}\sigma(t)e^{-\cconst{\phi} ft}dt
\end{eqnarray}
%
The Laplace transform takes a function from real to complex values
and produces a function from complex to complex. It generalizes the
Fourier transform by allowing the multiplier $\ii 2\pi f$ of $t$ in
the exponent to be an arbitrary complex number $\cconst{\phi}$. The Laplace
transform of $\sigma$ contains the Fourier transform in a simple way.
%
\begin{eqnarray}
  \transf{F}(\sigma)(f) & = & \transf{L}(\sigma)(\ii 2\pi f)
\end{eqnarray}
%
In fact, the whole Laplace transform may be defined as a combination
of Fourier transforms of $\sigma$ times a real-valued exponential.
%
\begin{eqnarray}
  \transf{L}(\sigma)(\cconst{\phi}) & = &
    \transf{F}(e^{-\Re(\cconst{\phi})})(\Im(\cconst{\phi})/(2\pi)
\end{eqnarray}
%
Fixing an arbitrary constant value $D$ for the real component of
$\cconst{\phi}$, the real-to-complex function
$[f]\transf{L}(\sigma)(D+\ii f)$ carries enough information to
reconstruct $\sigma$, so the Laplace transform is highly
redundant. Most functions from complex to complex are not the Laplace
transform of any signal.

Different books and papers use variations on the definition of the
Laplace transform. Many use a \emph{one-sided Laplace transform}, with
integral from $0$ to $\infty$. Of course, the one-sided transform of
$\sigma$ is just the two-sided transform of $H\sigma$. For greater
generality, and closer correspondence with the Fourier transform, the
two-sided form is better.

Roughly, $\transf{L}(\sigma)(\cconst{\phi})$ matches $\sigma$ against the
decaying (or expanding) helical pattern $[t]e^{\cconst{\phi} t}$. The
negation of $\cconst{\phi}$ in the exponent within the integral defining the
transform serves the purpose of conjugating the helix, as in the
Fourier transform, and it also reverses the decay/expansion of the
helix. This reversal of decay/expansion causes the product $[
t]\sigma(t)e^{-\cconst{\phi} t}$ to be the constant $\constfun{1}$ precisely
when $\sigma=[t]e^{\cconst{\phi} t}$. But, while the helical
patterns in the Fourier transform are linearly independent, all
decaying/expanding helices at the same frequency are dependent. So, a
resonance $\cconst{\alpha}$ affects the Laplace transform at
$\cconst{\beta}$ as long as $\Im(\cconst{\alpha})=\Im(\cconst{\beta})$,
even when the real components differ.

The precise understanding of the Laplace transform as pattern matching
is rather subtle, and beyond the mathematical powers of these
notes. In particular, for many functions of interest, the defining
integral diverges, and $\transf{L}(\sigma)(\cconst{\phi})$ is
undefined, except for a small band of values of $\cconst{\phi}$. Some
of the divergence may be understood through generalized functions, but
much of it may not. Some applications of Laplace transform require
thorough understanding of the regions of convergence. For our
purposes, the Laplace transform of most useful signals produces a nice
symbolic expression, and the function defined by that expression gives
the insight that we need, even though strictly speaking the expression
is only correct within the region of convergence. This is spooky, but
we'll have to live with it. I am sure that there is a way of recasting
the mathematics to explain the apparent values of Laplace transforms
in the divergent regions, but I haven't found it yet.

The Laplace transform has properties analogous to those of the Fourier
transform, but with some differences on inputs with nonzero real
components. Most tables of Laplace transforms leave out $\delta$s and
other functional components with infinite values, since by definition
they are outside of the region of convergence. This confuses me, and
obscures the correspondence between Laplace transform and Fourier
transform, so I have included these infinite components.
%
\begin{eqnarray}
\transf{L}(\delta) & = & \constfun{1} \\
\transf{L}(\delta') & = & [\phi]\phi \\
\transf{L}(\frac{d^k}{dt^k}\delta) & = & [\phi]\phi^k \\
\transf{L}(\delta'-\cconst{\alpha}\delta) & = & [\phi]\phi-\cconst{\alpha} \\
\transf{L}(\delta''-2\cconst{\alpha}\delta'+\cconst{\alpha}^2\delta) & = &
  [\phi](\phi-\cconst{\alpha})^2 \\
\transf{L}([t]\delta(t-A)) & = & [\phi]e^{A\phi} \\
\transf{L}(H) & = & [\phi](\phi^{-1}+\delta(\phi)/2) \\
\transf{L}(H-\frac{1}{2}) & = & [\phi]\phi^{-1} \\
\transf{L}(Z) & = & [\phi]e^{\phi^2/(4\pi)} \\
\transf{L}(\Pi) & = & [\phi](2\sinh(\phi/2)/\phi) \nonumber \\*
  & = & [\phi]((e^{\phi/2}-e^{-\phi/2})/\phi) \\
\transf{L}(\sinc) & = &
  [\phi](1+(\arctan(-\phi/\pi)-\arctan(\phi/\pi))/\pi) \\
\end{eqnarray}
%
To calculate Laplace transforms, use the following rules that show how
the transform and other functional operators interact.
\begin{eqnarray}
\transf{L}(\alpha+\beta) & = & \transf{L}(\alpha)+\transf{L}(\beta) \\
\transf{L}(\conj{\alpha}) & = & [\phi]\conj{\transf{L}(\alpha)(\conj(\phi))} \\
\transf{L}(\cconst{\alpha}\beta) & = & \cconst{\alpha}\transf{L}(\beta)
  \mbox{ for constant }\cconst{\alpha} \\
\transf{L}([t]\beta(t-A)) & = &
  \transf{L}(\beta)([\phi]e^{-A\phi})\mbox{ for real constant }A \\
\transf{L}([t]\beta(At)) & = &
  ([\phi]\transf{L}(\beta)(\phi/A))/\abs{A}\mbox{ for real constant }A \\
\transf{L}([t]\beta(-t)) & = & ([\phi]\transf{L}(\beta)(-\phi)) \\
\label{equ:laplace-convolution-theorem}
\transf{L}(\alpha\ast\beta) & = & \transf{L}(\alpha)\transf{L}(\beta) \\
\label{equ:laplace-conv-thm-conv}
\transf{L}(\alpha\beta) & = &
  [\phi](\ii 2\pi)^{-1}\int_{C-\si\infty}^{C+\si\infty}
    \transf{L}(\alpha)(\psi)\transf{L}(\phi-\psi)d\psi \nonumber \\*
 & & \mbox{for certain }C\mbox{ depending on }\phi \\
\transf{L}(\alpha\star\beta) & = &
  [\phi]\transf{L}(\alpha)(-\phi)\transf{L}(\beta) \\
\label{equ:laplace-autocorrellation}
\transf{L}(\alpha\star\alpha) & = &
  [\phi]\transf{L}(\alpha)(\phi)\transf{L}(\alpha)(-\phi) \\
\transf{L}(\alpha') & = & ([\phi]\phi)\transf{L}(\alpha)
\end{eqnarray}
%
In the convolution theorem
(Equation~\ref{eq:laplace-convolution-theorem}), the
complicated-looking integral is a convolution of traces in the
imaginary direction through $\alpha$ and $\beta$, determined by the
real value $C$. $\alpha$ is traced at $C$, and $\beta$ at
$\Re(\phi)-C$. If possible, $C$ is chosen so that both traces
converge. A different $C$ may be chosen for each $\phi$. If more than
one choice of $C$ yields convergence, then all such values of $C$
yield the same value.

The Laplace transform is most often applied to signals that are $0$
for all negative times, so it is worth reviewing those cases in
particular.
%
\begin{eqnarray}
\transf{L}(\constfun{1}H) & = & [\phi](1/\phi+\delta(\phi)/2) \\
\transf{L}(\helix H) & = & [\phi]1/(\phi-\ii 2\pi) \\
\transf{L}(\delta H) & = & \constfun{1} \\
\transf{L}(\shah H) & = & [\phi]\coth(\phi/2) \\
\transf{L}(([t]e^{{\scriptsize\cconst{\alpha}}t})H) & = &
  [\phi](\phi-\cconst{\alpha})^{-1}
\end{eqnarray}


\subsection{Discrete Filters and the $Z$ Transform}

\section{Sound Creation by Modal Synthesis}

\subsection{Continuously Driven Resonators}

\subsection{Ringing Resonators}

\section{Sound Modification with Formant Filters}

\end{document}
