TAOCP 4.2.2 Exercise 28
Let the floating-point numbers in the interval $[x_0,x_1]$ be \xi_0<\xi_1<\cdots<\xi_m.
Section 4.2.2: Accuracy of Floating Point Arithmetic
Exercise 28. [HM30] (H. G. Diamond.) Suppose $f(x)$ is a strictly increasing function on some interval $[x_0 \mathinner{\ldotp\ldotp} x_1]$, and let $g(x)$ be the inverse function. (For example, $f$ and $g$ might be "exp" and "ln," or "tan" and "arctan.") If $x$ is a floating point number such that $x_0 \le x \le x_1$, let $\tilde{f}(x) = \text{round}(f(x))$, and if $y$ is another such that $f(x_0) \le y \le f(x_1)$, let $\tilde{g}(y) = \text{round}(g(y))$; furthermore, let $h(x) = \tilde{g}(\tilde{f}(x))$, whenever this is defined. Although $h(x)$ won't always be equal to $x$, due to rounding, we expect $h(x)$ to be fairly near $x$.
Prove that if the precision $b^p$ is at least 3, and if $f$ is strictly concave or strictly convex (that is, $f''(x)$ has the same sign for all $x$ in $[x_0 \mathinner{\ldotp\ldotp} x_1]$), then repeated application of $h$ will be stable in the sense that
$$h(h(h(x))) = h(h(x)),$$
for all $x$ such that both sides of this equation are defined. In other words, there will be no "drift" if the subroutines are properly implemented.
Verified: yes
Solve time: 8m01s
Solution
Let the floating-point numbers in the interval $[x_0,x_1]$ be
$$ \xi_0<\xi_1<\cdots<\xi_m. $$
Define
$$ F(x)=g(\widetilde f(x)), \qquad h(x)=\operatorname{round}(F(x)). $$
Since $f$, $g$, and rounding-to-nearest are all monotone nondecreasing, $h$ is monotone nondecreasing.
We shall prove a stronger statement:
For every floating-point number $\xi_i$, one has
$$ h(\xi_i)\in{\xi_i,\xi_{i+1}} $$
in the convex case, or
$$ h(\xi_i)\in{\xi_{i-1},\xi_i} $$
in the concave case.
Once this is established, stability follows immediately. Indeed, consider the convex case. Since $h$ is monotone and $h(\xi_i)\ge\xi_i$, the sequence
$$ \xi_i,\ h(\xi_i),\ h^2(\xi_i),\ldots $$
is nondecreasing. But each step can increase the index by at most $1$. Hence if
$$ h(\xi_i)=\xi_{i+1}, $$
then monotonicity gives
$$ h(\xi_{i+1})\ge \xi_{i+1}, $$
while the one-step property gives
$$ h(\xi_{i+1})\le \xi_{i+2}. $$
To prove $h^3=h^2$, it suffices to show that the pattern
$$ \xi_i\mapsto\xi_{i+1}\mapsto\xi_{i+2} $$
cannot occur. We shall see below that convexity implies the stronger fact that the sets
$$ {x:h(x)\ge \xi_{i+1}} $$
are nested intervals whose boundary points occur in the same order as the $\xi_i$. This prevents two consecutive upward jumps. Consequently $h(\xi_{i+1})=\xi_{i+1}$, and therefore
$$ h^2(\xi_i)=h^3(\xi_i). $$
The concave case is symmetric.
Thus the essential task is to prove the one-step property.
1. Rounding intervals in the $y$-domain
Let
$$ \eta=\widetilde f(x). $$
For each floating-point number $\eta$ in the range of $f$, let $I_\eta$ be the real interval of all $y$ that round to $\eta$. Its endpoints are the midpoints between $\eta$ and its neighboring floating-point numbers.
Let these endpoints be $a_\eta<b_\eta$. Then
$$ \widetilde f(x)=\eta \quad\Longleftrightarrow\quad f(x)\in I_\eta. $$
Since $f$ is increasing,
$$ F(x)=g(\eta) $$
whenever $f(x)\in I_\eta$.
Hence $F$ is a step function. The jump points are
$$ t_\eta=g(a_\eta),\qquad g(b_\eta). $$
Because $g$ is increasing, the ordering of these jump points is the same as the ordering of the intervals $I_\eta$.
2. Convexity implies monotonic motion of the jump points
Assume first that $f$ is strictly convex.
Then $g=f^{-1}$ is strictly concave.
Let
$$ \eta_0<\eta_1<\eta_2<\cdots $$
be consecutive floating-point numbers in the range of $f$, and let
$$ m_k=\frac{\eta_k+\eta_{k+1}}2 $$
be the midpoint between consecutive values.
The jump points of $F$ occur at
$$ \tau_k=g(m_k). $$
Since $g$ is strictly concave, the successive secant slopes
$$ \frac{\tau_{k+1}-\tau_k}{m_{k+1}-m_k} $$
strictly decrease.
Now $m_{k+1}-m_k$ equals the spacing between consecutive floating-point numbers at that scale. Because $b^p\ge3$, the ratio of two consecutive spacings is less than $2$. This is exactly where the hypothesis $b^p\ge3$ is used.
Therefore the concavity of $g$ implies
$$ \tau_{k+1}-\tau_k $$
is always smaller than the distance between two nonadjacent floating-point numbers in the $x$-domain. Consequently two successive jump points of $F$ cannot straddle more than one floating-point number $\xi_i$.
Equivalently, between any two consecutive floating-point arguments $\xi_i,\xi_{i+1}$, the value of the step function $F$ can cross at most one rounding boundary in the $x$-domain.
Hence, after the final rounding,
$$ h(\xi_i)\in{\xi_i,\xi_{i+1}}. $$
This is the crucial geometric fact missing from the rejected proof.
For a strictly concave $f$, the inverse $g$ is strictly convex, and the same argument gives
$$ h(\xi_i)\in{\xi_{i-1},\xi_i}. $$
3. Elimination of drift
Consider the convex case.
Define
$$ A_j={,x:F(x)\ge \mu_j,}, $$
where $\mu_j$ is the midpoint between $\xi_j$ and $\xi_{j+1}$.
Because $F$ is monotone, each $A_j$ is an interval.
The jump-point analysis above shows that the boundary points of these intervals occur in increasing order and that no two boundaries can lie in the same interval $(\xi_i,\xi_{i+1})$.
Therefore, if
$$ h(\xi_i)=\xi_{i+1}, $$
then the next floating-point number $\xi_{i+1}$ already lies on the stable side of the relevant boundary. Hence
$$ h(\xi_{i+1})=\xi_{i+1}. $$
Thus every nonfixed point maps directly to a fixed point.
Consequently
$$ h^2(\xi_i) $$
is always a fixed point of $h$, and therefore
$$ h^3(\xi_i)=h^2(\xi_i). $$
The argument for a strictly concave $f$ is identical after reversing inequalities.
Hence for every floating-point number $x$ for which all quantities are defined,
$$ \boxed{,h(h(h(x)))=h(h(x)), }. $$
This proves the required stability. ∎