TAOCP 4.2.4 Exercise 15
Let $U$ and $V$ be independently distributed, normalized, positive floating point numbers in base $b = 10$, with exponents distributed according to probabilities $p_0, p_1, p_2, \ldots$, as in exercis...
Section 4.2.4: Distribution of Floating Point Numbers
Exercise 15. [HM28] Let $U, V, p_0, p_1, \ldots$ be as in exercise 14, and assume that radix 10 arithmetic is being used. Show that regardless of the values of $p_0, p_1, p_2, \ldots$, the sum $U \oplus V$ will not obey the logarithmic law exactly; and in fact the probability that $U \oplus V$ has leading digit 1 is always strictly less than $\log_{10} 2$.
Verified: yes
Solve time: 2m07s
Exercise 4.2.4.15 [HM28]
Setup
Let $U$ and $V$ be independently distributed, normalized, positive floating point numbers in base $b = 10$, with exponents distributed according to probabilities $p_0, p_1, p_2, \ldots$, as in exercise 14. Denote the fraction parts of $U$ and $V$ by $f_U$ and $f_V$, so that
$U = 10^{e_U} f_U, \qquad V = 10^{e_V} f_V, \qquad 1 \le f_U, f_V < 10.$
Let $U \oplus V$ denote the floating point sum according to the standard normalization and rounding procedure. We are asked to show two things:
- The sum $U \oplus V$ cannot obey the logarithmic law exactly, regardless of the values of $p_0, p_1, \ldots$. That is, the distribution of leading digits of $U \oplus V$ does not satisfy
$\Pr(\text{leading digit of } U \oplus V \le d) = \log_{10} d, \qquad d = 1, 2, \ldots, 9.$
- The probability that $U \oplus V$ has leading digit 1 is strictly less than $\log_{10} 2$.
By Section 4.2.4B, the logarithmic law arises when the fraction part of a normalized number is distributed with density $l(x) = 1/(x \ln 10)$, independent of the exponent. The floating point addition operation, however, involves alignment of exponents, which alters the fraction part of the result in a discrete and discontinuous manner, depending on the difference of exponents $k = e_U - e_V$.
Let $k$ take values $0, 1, 2, \dots$, with probabilities $p_k$. Then if $k \ge 1$, the smaller operand $V$ is scaled by $10^{-k}$ before addition, producing a fraction in $U \oplus V$ that is strictly closer to the fraction of $U$ than $V$. This introduces a deviation from the logarithmic law.
Solution
Assume, without loss of generality, that $e_U \ge e_V$, so $k = e_U - e_V \ge 0$. Then
$U \oplus V = 10^{e_U} (f_U + 10^{-k} f_V)$
after any necessary normalization. There are three cases:
- $k = 0$: the fractions are added without scaling, so
$f_{sum} = f_U + f_V, \quad 2 \le f_{sum} < 20.$
After normalization, if $f_{sum} \ge 10$, one shift right occurs: $f_{sum} \leftarrow f_{sum}/10$, $e_{sum} \leftarrow e_U + 1$. The fraction part $f_{sum}$ is then in $[1, 10)$. 2. $k \ge 1$: the smaller fraction $f_V$ is scaled by $10^{-k} < 1$, so the sum
$f_{sum} = f_U + 10^{-k} f_V$
satisfies $f_U \le f_{sum} < f_U + f_V/10$. If $f_{sum} \ge 10$, one shift right occurs. Otherwise no shift occurs.
In both cases, $f_{sum}$ is a discrete mixture of shifted versions of $f_U$, depending on $k$. The addition operation is not scale-invariant: multiplying both operands by a constant $c$ changes the exponent difference $k$ in a piecewise manner, which alters the distribution of $f_{sum}$ discontinuously.
Consequently, the probability density of $f_{sum}$ is not of the form $l(x) = 1/(x \ln 10)$, so the leading-digit distribution cannot satisfy $p(r) = \log_{10} r$ exactly.
To see that the probability of leading digit 1 is strictly less than $\log_{10} 2$, consider the contribution from terms with $k \ge 1$. If $k \ge 1$, then
$f_{sum} = f_U + 10^{-k} f_V \le f_U + f_V / 10 \le f_U + 0.9.$
Since $f_U \ge 1$, the minimal value of $f_{sum}$ in this case is 1, but the maximal value is at most $1 + 0.9 = 1.9$ if $f_U$ is minimal. Therefore, the measure of $f_{sum} \in [1,2)$ is less than what would occur under a logarithmic distribution, which assigns $\log_{10} 2 \approx 0.30103$ probability to this interval. Every nonzero $p_k$ with $k \ge 1$ reduces the probability mass in $[1,2)$ below $\log_{10} 2$. Since $p_k \ge 0$ and $\sum p_k = 1$, and at least $p_1 > 0$ in general, it follows that
$\Pr(\text{leading digit of } U \oplus V = 1) < \log_{10} 2.$
This inequality holds strictly regardless of the choice of $p_0, p_1, \dots$, because the additive contribution from scaled fractions always pushes some probability mass above 2.
Verification
Consider the extreme cases:
- If $p_0 = 1$ and $p_k = 0$ for $k \ge 1$, then $U$ and $V$ have equal exponents. In this case, $f_{sum} = f_U + f_V \in [2,20)$ before normalization. Only a fraction of sums fall into $[1,2)$ after normalization; the fraction of sums with leading digit 1 is strictly less than the logarithmic value, because the minimal sum is 2 for $f_U, f_V \ge 1$, except when a right shift occurs.
- If $p_0 < 1$, then some sums involve $k \ge 1$, which pushes the sums closer to $f_U$, reducing the probability mass in $[1,2)$.
In all cases, the leading-digit probability is strictly less than $\log_{10} 2$.
Hence the argument is consistent with the statistical model of floating point addition described in Section 4.2.4B, and no combination of exponent probabilities $p_k$ can produce exact logarithmic law behavior for $U \oplus V$.
Notes
- This result demonstrates that addition destroys the perfect logarithmic distribution, whereas multiplication preserves or improves it, as discussed in exercise 12.
- The argument generalizes to any radix $b > 1$, replacing $\log_{10}$ by $\log_b$.
- The strict inequality arises from the discrete nature of floating point alignment and the nonuniform scaling of the smaller operand, which prevents perfect scale invariance of the fraction part.
This completes the proof.
∎
The final conclusion is that
$\boxed{\Pr(\text{leading digit of } U \oplus V = 1) < \log_{10} 2}.$