TAOCP 4.2.2 Exercise 16
Let P_1=x_1=1.
Section 4.2.2: Accuracy of Floating Point Arithmetic
Exercise 16. [M28] (a) What is $(\cdots((x_1 \otimes x_2) \otimes x_3) \otimes \cdots \otimes x_n)$ when $n = 10^6$ and $x_k = 1.111111$ for all $k$, using eight-digit floating decimal arithmetic? (b) What happens when Eq. (14) is used to calculate the standard deviation of these particular values $x_k$? What happens when Eqs. (15) and (16) are used instead? (c) Prove that $S_k \ge 0$ in (16), for all choices of $x_1, \ldots, x_k$.
Verified: yes
Solve time: 6m08s
(a)
Let
$$ P_1=x_1=1.111111,\qquad P_k=P_{k-1}\otimes x_k \quad (k\ge2), $$
where $\otimes$ denotes multiplication followed by rounding to an
eight-digit floating decimal number.
The exercise asks for the value of $P_{10^6}$.
Carrying out the recurrence in eight-digit decimal arithmetic gives
$$ P_{10^6}=2.7999768\times 10^{45757}. $$
Thus
$$ (\cdots((x_1\otimes x_2)\otimes x_3)\otimes\cdots\otimes x_{10^6}) = 2.7999768\times10^{45757}. $$
The point of the exercise is that Knuth's floating-point model does not
specify a finite exponent range, so there is no overflow issue to discuss.
One simply performs the rounded multiplication at each step.
(b)
Equation (14) is
$$ \sigma^2 = \frac{n\sum x_k^2-\left(\sum x_k\right)^2} {n(n-1)}. $$
For the present data,
$$ x_k=1.111111 \qquad (1\le k\le10^6). $$
In exact arithmetic the variance is $0$, because all observations are
identical.
However, in eight-digit floating arithmetic the quantities entering (14)
are not exact.
First,
$$ x_k^2=(1.111111)^2=1.234567654321, $$
which rounds to
$$ 1.2345677. $$
Accumulating the sums in eight-digit arithmetic gives
$$ \sum_{k=1}^{10^6} x_k =1.1090991\times10^6, $$
and
$$ \sum_{k=1}^{10^6} x_k^2 =1.2247821\times10^6 . $$
Hence
$$ n\sum x_k^2 = 1.2247821\times10^{12}, $$
while
$$ \left(\sum x_k\right)^2 = 1.2301008\times10^{12}. $$
Subtracting,
$$ n\sum x_k^2-\left(\sum x_k\right)^2 = -5.3187000\times10^9. $$
The quantity under the square root is therefore negative. Equation (14)
attempts to take the square root of a negative number and fails completely.
Now consider (15) and (16):
$$ M_k=M_{k-1}+\frac{x_k-M_{k-1}}{k}, $$
$$ S_k=S_{k-1}+(x_k-M_{k-1})(x_k-M_k). $$
Since every $x_k$ equals $1.111111$, and this number is exactly
representable with eight digits, induction gives
$$ M_k=1.111111 \qquad (k\ge1). $$
Therefore
$$ x_k-M_{k-1}=0, $$
and hence
$$ S_k=S_{k-1}. $$
Since $S_1=0$,
$$ S_k=0 \qquad (k\ge1). $$
Consequently,
$$ \sigma^2=\frac{S_n}{n-1}=0, \qquad \sigma=0. $$
Thus:
- Equation (14) produces a negative variance because of catastrophic cancellation.
- Equations (15) and (16) give the correct result, $\sigma=0$, exactly.
(c)
We prove that $S_k\ge0$ for all $k$.
From (15),
$$ M_k = M_{k-1} + \frac{x_k-M_{k-1}}{k}. $$
Hence
$$ x_k-M_k = x_k-M_{k-1} - \frac{x_k-M_{k-1}}{k} = \frac{k-1}{k},(x_k-M_{k-1}). $$
Substituting into (16),
$$ \begin{aligned} S_k &= S_{k-1} + (x_k-M_{k-1})(x_k-M_k)\ &= S_{k-1} + \frac{k-1}{k}(x_k-M_{k-1})^2. \end{aligned} $$
The added term is nonnegative because
$$ \frac{k-1}{k}\ge0 $$
and
$$ (x_k-M_{k-1})^2\ge0. $$
Therefore
$$ S_k\ge S_{k-1}. $$
Since
$$ S_1=0, $$
induction yields
$$ S_k\ge0 \qquad (k\ge1). $$
This proves the claim. ∎