TAOCP 3.4.2 Exercise 6
Let $T$ denote the value of $t$ when Algorithm S terminates.
Section 3.4.2: Random Sampling and Shuffling
Exercise 6. [M24] What is the standard deviation of the value computed in exercise 5?
Verified: no
Solve time: 4m31s
Solution
Let $T$ denote the value of $t$ when Algorithm S terminates. As in Exercise 5, $T$ is the largest element of a uniformly chosen $n$-subset of ${1,2,\dots,N}$. Its distribution is
$$ \Pr{T=t} = \frac{\binom{t-1}{n-1}}{\binom{N}{n}}, \qquad n \le t \le N. $$
We aim to compute the standard deviation $\sigma_T = \sqrt{\operatorname{Var}(T)}$.
Step 1: First moment
Using the identity
$$ t \binom{t-1}{n-1} = n \binom{t}{n}, $$
we obtain
$$ E(T) = \frac{1}{\binom{N}{n}} \sum_{t=n}^{N} t \binom{t-1}{n-1} = \frac{n}{\binom{N}{n}} \sum_{t=n}^{N} \binom{t}{n}. $$
By the standard hockey-stick identity,
$$ \sum_{t=n}^{N} \binom{t}{n} = \binom{N+1}{n+1}. $$
Therefore
$$ E(T) = \frac{n}{\binom{N}{n}} \binom{N+1}{n+1} = \frac{n(N+1)}{n+1}. $$
This agrees with Exercise 5.
Step 2: Second moment
Write
$$ t^2 = t(t-1) + t. $$
Then
$$ t^2 \binom{t-1}{n-1} = t(t-1) \binom{t-1}{n-1} + t \binom{t-1}{n-1}. $$
We know
$$ t \binom{t-1}{n-1} = n \binom{t}{n}, \qquad t(t-1)\binom{t-1}{n-1} = n(n-1)\binom{t}{n}. $$
Hence
$$ t^2 \binom{t-1}{n-1} = n(n-1) \binom{t}{n} + n \binom{t}{n} = n(n-1+1) \binom{t}{n} = n^2 \binom{t}{n}. $$
We must be careful: we need to check coefficients. A more precise derivation is as follows.
- Start with $t(t-1) \binom{t-1}{n-1}$. Using the identity
$$ t(t-1) \binom{t-1}{n-1} = n(n-1) \binom{t}{n}. $$
- Then $t^2 \binom{t-1}{n-1} = t(t-1) \binom{t-1}{n-1} + t \binom{t-1}{n-1} = n(n-1) \binom{t}{n} + n \binom{t}{n} = n(n) \binom{t}{n} = n^2 \binom{t}{n}.$
Thus the computation is correct.
Then the second moment is
$$ E(T^2) = \frac{1}{\binom{N}{n}} \sum_{t=n}^{N} t^2 \binom{t-1}{n-1} = \frac{n^2}{\binom{N}{n}} \sum_{t=n}^{N} \binom{t}{n}. $$
Using the hockey-stick identity:
$$ \sum_{t=n}^{N} \binom{t}{n} = \binom{N+1}{n+1}. $$
Hence
$$ E(T^2) = \frac{n^2}{\binom{N}{n}} \binom{N+1}{n+1} = n^2 \frac{(N+1)!/( (n+1)!(N-n)! )}{N!/ (n!(N-n)! )} = n^2 \frac{(N+1)! n!}{(n+1)! N!} = n \frac{N+1}{n+1} \cdot n = \frac{n(N+1)n}{n+1} = \frac{n^2 (N+1)}{n+1}. $$
Step 3: Correct approach using the standard combinatorial formula
The previous approach miscounts the second moment. The classical correct formula uses the identity:
$$ \sum_{t=n}^{N} t^2 \binom{t-1}{n-1} = n \sum_{t=n}^{N} t \binom{t}{n}. $$
Then, applying the known identity
$$ \sum_{t=n}^{N} t \binom{t}{n} = (n+1) \binom{N+2}{n+2} - \binom{N+1}{n+1}, $$
we obtain
$$ E(T^2) = \frac{n}{\binom{N}{n}} \left[ (n+1) \binom{N+2}{n+2} - \binom{N+1}{n+1} \right]. $$
Simplify the binomial ratios:
$$ \frac{\binom{N+2}{n+2}}{\binom{N}{n}} = \frac{(N+2)(N+1)}{(n+2)(n+1)}, \qquad \frac{\binom{N+1}{n+1}}{\binom{N}{n}} = \frac{N+1}{n+1}. $$
Therefore
$$ E(T^2) = n \left[ (n+1) \frac{(N+2)(N+1)}{(n+2)(n+1)} - \frac{N+1}{n+1} \right] = n \left[ \frac{(N+2)(N+1)}{n+2} - \frac{N+1}{n+1} \right]. $$
Combine the terms:
$$ E(T^2) = n(N+1) \left( \frac{N+2}{n+2} - \frac{1}{n+1} \right) = n(N+1) \frac{(N+2)(n+1) - (n+2)}{(n+1)(n+2)} = n(N+1) \frac{nN + n}{(n+1)(n+2)} = \frac{n N (N+1)}{n+2}. $$
Check: careful. Multiply numerator:
$$ (N+2)(n+1) - (n+2) = N n + 2 n + N + 2 - n - 2 = n N + n + N = nN + N + n. $$
Yes, factor:
$$ nN + N + n = (N+1) n + N. $$
Then
$$ E(T^2) = n (N+1) \frac{(N+1) n + N}{(n+1)(n+2)}. $$
This matches the corrected formula.
Step 4: Variance
We have
$$ \operatorname{Var}(T) = E(T^2) - (E(T))^2 = n(N+1) \frac{(N+1)n + N}{(n+1)(n+2)} - \left( \frac{n(N+1)}{n+1} \right)^2. $$
Simplify:
$$ \operatorname{Var}(T) = \frac{n(N+1)((N+1)n + N)}{(n+1)(n+2)} - \frac{n^2 (N+1)^2}{(n+1)^2} = \frac{n(N+1)}{(n+1)^2(n+2)} \left[ ((N+1)n + N)(n+1) - n (N+1)(n+2) \right]. $$
Compute the bracket:
$$ ((N+1)n + N)(n+1) = (N+1)n(n+1) + N(n+1) = n(n+1)(N+1) + N(n+1) $$
$$ n(N+1)(n+2) = n(n+2)(N+1) $$
Subtract:
$$ [n(n+1)(N+1) + N(n+1)] - n(n+2)(N+1) = n(n+1)(N+1) + N(n+1) - n(n+2)(N+1) = n(n+1)(N+1) - n(n+2)(N+1) + N(n+1) = - n(N+1) + N(n+1) = N (n+1) - n(N+1) = N + N n - n N - n = N - $$