TAOCP 4.2.2 Exercise 29

We want to show that the requirement $b^p \ge 3$ in Exercise 28 is necessary by giving an example in which repeated application of $h(x) = \tilde{g}(\tilde{f}(x))$ exhibits drift when $b^p = 2$.

Section 4.2.2: Accuracy of Floating Point Arithmetic

Exercise 29. ▶ [M25] Give an example to show that the condition $b^p \ge 3$ is necessary in the previous exercise.

Verified: yes
Solve time: 2m02s


Solution

We want to show that the requirement $b^p \ge 3$ in Exercise 28 is necessary by giving an example in which repeated application of $h(x) = \tilde{g}(\tilde{f}(x))$ exhibits drift when $b^p = 2$.

Let the floating point system have base $b = 2$ and precision $p = 1$, so that $b^p = 2$. Then the only normalized floating point numbers are of the form $1 \cdot 2^e$ for integers $e$. Consider the interval $[x_0, x_1] = [1, 2]$. Let $f(x) = x^2$, which is strictly convex on this interval, and let $g(y) = \sqrt{y}$, its inverse. Define $\tilde{f}(x) = \text{round}(f(x))$ and $\tilde{g}(y) = \text{round}(g(y))$, using the floating point arithmetic with $b^p = 2$.

The normalized floating point numbers in $[1,4]$ are $1$, $2$, and $4$. Then the rounding of $f(x)$ behaves as follows:

  • $x = 1$: $f(1) = 1$, $\tilde{f}(1) = 1$
  • $x = \sqrt{2} \approx 1.4142$: $f(\sqrt{2}) = 2$, $\tilde{f}(\sqrt{2}) = 2$
  • $x = 2$: $f(2) = 4$, $\tilde{f}(2) = 4$

Similarly, rounding the inverse gives:

  • $y = 1$: $\tilde{g}(1) = 1$
  • $y = 2$: $\tilde{g}(2) = 1$
  • $y = 4$: $\tilde{g}(4) = 2$

Now consider $x = \sqrt{2}$. Then

$\tilde{f}(x) = 2, \quad \tilde{g}(\tilde{f}(x)) = \tilde{g}(2) = 1.$

Applying $h$ repeatedly:

$h(x) = 1, \quad h(h(x)) = h(1) = 1, \quad h(h(h(x))) = h(1) = 1.$

Thus $h(h(h(x))) \ne h(x)$, since $h(x) = 1 \ne \sqrt{2} = x$. Therefore, repeated application does not stabilize at $x$, demonstrating drift.

This example shows that when $b^p < 3$, the rounding errors can be large enough to move $x$ to a different floating point number under $h$, violating the stability property

$h(h(h(x))) = h(h(x)).$

Hence the condition $b^p \ge 3$ in Exercise 28 is necessary.

This completes the proof.

Notes

  • The choice $x = \sqrt{2}$ is convenient because $x^2 = 2$ is exactly representable only if $b^p \ge 3$.
  • Smaller precision ($b^p = 2$) allows only the numbers $1$, $2$, and $4$, so $\sqrt{2}$ cannot be represented exactly, producing drift.