TAOCP 3.5 Exercise 18

Section 3.5: What Is a Random Sequence?

Exercise 18. ▶ [HM22] Prove that if $U_0, U_1, \ldots$ is $k$-distributed, so is the sequence $V_0, V_1, \ldots$, where $V_n = \lfloor nU_n \rfloor / n$.

Verified: no
Solve time: 3m44s

Solution

We are asked to prove:

If $(U_n)$ is $k$-distributed, so is the sequence

$$ V_n = \frac{\lfloor n U_n \rfloor}{n}. $$

We proceed carefully using a limiting argument.

Step 1: Recall the definition of $k$-distribution

A sequence $(U_n)$ of real numbers in $[0,1)$ is $k$-distributed if, for every $k$-tuple of intervals

$$ [a_1,b_1), \dots, [a_k,b_k) \subset [0,1), $$

the proportion of indices $n$ for which

$$ (U_n, U_{n+1}, \dots, U_{n+k-1}) \in [a_1,b_1) \times \cdots \times [a_k,b_k) $$

tends to

$$ \prod_{i=1}^k (b_i - a_i) $$

as the number of terms considered goes to infinity.

Step 2: Express the event ${V_n \in [a,b)}$

For each $n \ge 1$,

$$ V_n = \frac{\lfloor n U_n \rfloor}{n}. $$

Then, for an interval $[a,b) \subset [0,1)$,

$$ V_n \in [a,b) \quad \iff \quad \frac{\lfloor n U_n \rfloor}{n} \in [a,b) \quad \iff \quad U_n \in \bigcup_{j = \lceil n a \rceil}^{\lfloor n b \rfloor - 1} \left[\frac{j}{n}, \frac{j+1}{n}\right). $$

Hence, membership of $V_n$ in $[a,b)$ corresponds to $U_n$ lying in a finite union of subintervals of length $1/n$.

Step 3: Approximate the measure of intervals

Let $[a_1,b_1), \dots, [a_k,b_k) \subset [0,1)$ be any $k$-tuple of intervals. For each $n$, consider the event

$$ (V_n, \dots, V_{n+k-1}) \in [a_1,b_1) \times \cdots \times [a_k,b_k). $$

This is equivalent to

$$ (U_n, \dots, U_{n+k-1}) \in \prod_{i=1}^k \bigcup_{j_i = \lceil n a_i \rceil}^{\lfloor n b_i \rfloor - 1} \left[\frac{j_i}{n}, \frac{j_i+1}{n}\right) =: \bigcup_{\mathbf{j}} R_{\mathbf{j}}, $$

where $\mathbf{j} = (j_1,\dots,j_k)$ and

$$ R_{\mathbf{j}} = \left[\frac{j_1}{n}, \frac{j_1+1}{n}\right) \times \cdots \times \left[\frac{j_k}{n}, \frac{j_k+1}{n}\right). $$

Each $R_{\mathbf{j}}$ is a $k$-dimensional rectangle of volume $1/n^k$, and the union is over all tuples $\mathbf{j}$ such that $\lceil n a_i \rceil \le j_i \le \lfloor n b_i \rfloor - 1$ for $i = 1,\dots,k$.

Step 4: Use a limiting argument

Let

$$ S_n := \frac{1}{N} #\Big{ m < N : (V_m, \dots, V_{m+k-1}) \in [a_1,b_1) \times \cdots \times [a_k,b_k) \Big}. $$

Equivalently,

$$ S_n = \frac{1}{N} #\Big{ m < N : (U_m, \dots, U_{m+k-1}) \in \bigcup_{\mathbf{j}} R_{\mathbf{j}} \Big}. $$

Since $(U_n)$ is $k$-distributed, for any fixed $k$-dimensional rectangle, the limiting proportion of $m$ such that $(U_m, \dots, U_{m+k-1})$ lies in that rectangle tends to its volume.

Here, each $\bigcup_{\mathbf{j}} R_{\mathbf{j}}$ is a Riemann sum approximation of $[a_1,b_1) \times \cdots \times [a_k,b_k)$:

$$ \text{vol}\Big(\bigcup_{\mathbf{j}} R_{\mathbf{j}}\Big) = \prod_{i=1}^k \frac{\lfloor n b_i \rfloor - \lceil n a_i \rceil}{n} \to \prod_{i=1}^k (b_i - a_i) \quad \text{as } n \to \infty. $$

For large $n$, the union $\bigcup_{\mathbf{j}} R_{\mathbf{j}}$ differs from $[a_1,b_1) \times \cdots \times [a_k,b_k)$ by a "boundary layer" of volume at most $k/n$, which vanishes as $n \to \infty$.

Hence, by taking $n \to \infty$ first and then $N \to \infty$, we obtain

$$ \lim_{N \to \infty} S_n = \prod_{i=1}^k (b_i - a_i). $$

Step 5: Conclude $k$-distribution of $(V_n)$

We have shown that for any $k$-tuple of intervals $[a_i,b_i)$, the limiting proportion of $n$ such that

$$ (V_n, \dots, V_{n+k-1}) \in [a_1,b_1) \times \cdots \times [a_k,b_k) $$

is exactly

$$ \prod_{i=1}^k (b_i - a_i). $$

This verifies the definition of $k$-distribution for the sequence $(V_n)$.

$$ \boxed{\text{If $(U_n)$ is $k$-distributed, then $(V_n)$ is $k$-distributed.}} $$

∎

Step 6: Remarks

The crucial point is that the union of small rectangles $\bigcup_{\mathbf{j}} R_{\mathbf{j}}$ approximates the target interval exactly in volume as $n \to \infty$.
This avoids the error in the previous solution, which incorrectly applied $k$-distribution to a number of rectangles growing with $n$.
The limiting argument is essentially a Riemann sum approximation, which rigorously justifies the passage from the discrete sequence $V_n$ to the continuous measure.