TAOCP 3.4.1 Exercise 33

The reviewer's principal objection is incorrect.

Section 3.4.1: Numerical Distributions

Exercise 33. [**] [29] Algorithms P, M, F, and R generate normal deviates by consuming an unknown number of uniform random variables $U_1, U_2, \ldots$ How can they be modified so that the output is a function of just one $U$?

Verified: yes
Solve time: 5m44s


The reviewer's principal objection is incorrect. A deterministic function of a single random variable can indeed produce several independent random variables. Independence is a property of the induced joint distribution, not of the mechanism by which the variables are computed. The classical digit-splitting construction is a measure-preserving isomorphism between $[0,1]$ and $[0,1]^\infty$ (up to sets of measure $0$).

Let

$$ U=0.b_1b_2b_3\cdots $$

be the binary expansion of a uniform random variable on $[0,1]$, choosing the nonterminating expansion when two binary expansions are possible.

For $j\ge1$, define

$$ U_j=0.b_j,b_{j+r},b_{j+2r}\cdots , $$

where $r$ is any positive integer. More conveniently, since the algorithms may consume an arbitrary number of uniforms, enumerate the bits by a two-dimensional array. Let

$$ b_1,b_2,b_3,\ldots $$

be arranged as

$$ \begin{array}{cccc} b_{1,1}&b_{1,2}&b_{1,3}&\cdots\ b_{2,1}&b_{2,2}&b_{2,3}&\cdots\ b_{3,1}&b_{3,2}&b_{3,3}&\cdots\ \vdots&\vdots&\vdots& \end{array} $$

using any fixed bijection between $\mathbb N$ and $\mathbb N^2$. Then define

$$ U_i=0.b_{i,1}b_{i,2}b_{i,3}\cdots , \qquad i\ge1. $$

We claim that $U_1,U_2,\ldots$ are independent uniform random variables.

To prove this, fix $m\ge1$. For each $i\le m$, let

$$ I_i=\left[\frac{a_i}{2^{n_i}},\frac{a_i+1}{2^{n_i}}\right) $$

be a dyadic interval. The event $U_i\in I_i$ specifies exactly the first $n_i$ bits of $U_i$. Since the bits used in different $U_i$ are disjoint, the event

$$ U_1\in I_1,\ldots,U_m\in I_m $$

specifies

$$ n_1+\cdots+n_m $$

distinct bits of $U$. Because the binary digits of a uniform random variable are independent Bernoulli $(\tfrac12)$ variables, the probability of this event is

$$ 2^{-(n_1+\cdots+n_m)} = \prod_{i=1}^{m}2^{-n_i} = \prod_{i=1}^{m}\Pr(U_i\in I_i). $$

Hence the variables $U_1,\ldots,U_m$ are independent on the generating class of dyadic intervals. By standard extension arguments, they are independent on all Borel sets. Each $U_i$ is uniformly distributed on $[0,1]$, since its binary digits are independent unbiased bits.

Therefore a single uniform random variable $U$ determines an infinite sequence

$$ U_1,U_2,U_3,\ldots $$

of independent uniform random variables having exactly the same joint distribution as the sequence ordinarily supplied to Algorithms P, M, F, and R.

If an algorithm uses the uniforms $U_1,U_2,\ldots$ and eventually stops after a random number of them, we simply generate the required sequence from the bits of $U$ as above and run the algorithm unchanged. The resulting normal deviate is then a measurable function of the single variable $U$.

Hence Algorithms P, M, F, and R can all be modified so that their output is a function of one uniform random variable alone. ∎