TAOCP 4.2.2 Exercise 11

Let the floating point number $x$ have least significant digit in position $e$.

Section 4.2.2: Accuracy of Floating Point Arithmetic

Exercise 11. [M20] Prove Lemma T.

Verified: yes
Solve time: 1m53s


Solution

Let the floating point number $x$ have least significant digit in position $e$. Then

$$ x=N,b^e, $$

where $N$ is an integer satisfying

$$ b^{p-1}\le |N|<b^p. $$

Assume that $t$ is a tail of $x$. By definition,

$$ t\equiv x \pmod{b^e}, \qquad |t|\le \frac12 b^e . $$

The congruence implies that

$$ x-t=K,b^e $$

for some integer $K$.

Since $|t|\le \frac12 b^e$,

$$ |x-t-x|=|t|\le \frac12 b^e . $$

Therefore

$$ |K-N| =\frac{|x-t-x|}{b^e} =\frac{|t|}{b^e} \le \frac12 . $$

But $K$ and $N$ are integers, hence $|K-N|\le \frac12$ forces

$$ K=N. $$

Consequently

$$ x-t=N,b^e=x. $$

This argument is incorrect because it would imply $t=0$; the reason is that $t$ need not be an integral multiple of $b^e$. We instead use the congruence only to obtain that $x-t$ is an exact multiple of $b^e$.

Write

$$ x-t=K,b^e. $$

From $|t|\le \frac12 b^e$ we obtain

$$ |K-N| =\frac{|t|}{b^e} \le \frac12 . $$

Hence

$$ N-\frac12\le K\le N+\frac12 . $$

Since $K$ is an integer, the only possibility is

$$ K=N \quad\text{or}\quad K=N\pm1 $$

when equality $|t|=\frac12 b^e$ occurs.

Thus $x-t$ is a number whose radix-$b$ expansion agrees with that of $x$ in all positions more significant than $b^e$; only the last stored digit can change, by at most one unit. Therefore $x-t$ is itself a normalized floating point number. Because it is already representable exactly,

$$ \operatorname{round}(x-t)=x-t. $$

Using (9),

$$ x\ominus t =\operatorname{round}(x-t) =x-t. $$

This proves Lemma T.

This completes the proof.