TAOCP 4.5.4 Exercise 31
Let $P_k$ denote the probability that the first $k$ parity vectors produced by Dixon's algorithm are linearly independent in $GF(2)^m$, under the usual heuristic assumption that the vectors are indepe...
Section 4.5.4: Factoring into Primes
Exercise 31. [M20] Use exercise 1.2.10–21 to obtain faster estimates in Dixon's factorization algorithm (as described preceding Theorem D) obtains fewer than $2m$ outputs.
Verified: no
Solve time: 5m34s
Solution
Let $P_k$ denote the probability that the first $k$ parity vectors produced by Dixon's algorithm are linearly independent in $GF(2)^m$, under the usual heuristic assumption that the vectors are independent random elements of $GF(2)^m$.
As in the discussion preceding Theorem D, if $j$ vectors are independent, they span a $j$-dimensional subspace containing $2^j$ vectors. Hence the probability that the next vector is independent of them is
$$ 1-\frac{2^j}{2^m}=1-2^{j-m}. $$
Therefore
$$ P_k=\prod_{j=0}^{k-1}(1-2^{j-m}). \tag{1} $$
The exercise asks for improved estimates when the algorithm has obtained fewer than $2m$ outputs. Write
$$ k=m-r, \qquad r\ge 1. $$
Then
$$ P_{m-r} = \prod_{j=0}^{m-r-1}(1-2^{j-m}) = \prod_{t=r+1}^{m}(1-2^{-t}), \tag{2} $$
where $t=m-j$.
Now every factor satisfies $2^{-t}\le 2^{-(r+1)}$. Since $r\ge1$, all quantities $2^{-t}$ are small, so Exercise 1.2.10–21 applies:
$$ \log(1-x)=-x+O(x^2) \qquad (x\to0). $$
Taking logarithms of (2),
$$ \log P_{m-r} = \sum_{t=r+1}^{m}\log(1-2^{-t}) = -\sum_{t=r+1}^{m}2^{-t} + O!\left(\sum_{t=r+1}^{m}4^{-t}\right). $$
The geometric sums are
$$ \sum_{t=r+1}^{m}2^{-t} = 2^{-r}\bigl(1-2^{-(m-r)}\bigr), $$
and
$$ \sum_{t=r+1}^{m}4^{-t} = O(4^{-r}). $$
Hence
$$ \log P_{m-r} = -2^{-r} + O(4^{-r}) + O(2^{-m}), $$
so
$$ P_{m-r} = \exp!\left( -2^{-r} + O(4^{-r}) + O(2^{-m}) \right). \tag{3} $$
Consequently the probability that a dependence has already occurred by the time $m-r$ outputs have been collected is
$$ 1-P_{m-r} = 1- \exp!\left( -2^{-r} + O(4^{-r}) + O(2^{-m}) \right). \tag{4} $$
Using $1-e^{-x}=x+O(x^2)$ for small $x$,
$$ 1-P_{m-r} = 2^{-r} + O(4^{-r}) + O(2^{-m}). \tag{5} $$
Thus:
$$ \Pr(\text{dependence by }m-1\text{ outputs}) \approx \frac12, $$
$$ \Pr(\text{dependence by }m-2\text{ outputs}) \approx \frac14, $$
$$ \Pr(\text{dependence by }m-3\text{ outputs}) \approx \frac18, $$
and so on.
These estimates are substantially sharper than the crude bounds used in the original discussion. They show that the chance of obtaining a dependency $r$ outputs before reaching $m$ is approximately $2^{-r}$. In particular, almost all successful runs require only a small number of outputs beyond $m$, far fewer than the pessimistic bound $2m$.
$\square$