TAOCP 4.5.4 Exercise 31

Let $P_k$ denote the probability that the first $k$ parity vectors produced by Dixon's algorithm are linearly independent in $GF(2)^m$, under the usual heuristic assumption that the vectors are indepe...

Section 4.5.4: Factoring into Primes

Exercise 31. [M20] Use exercise 1.2.10–21 to obtain faster estimates in Dixon's factorization algorithm (as described preceding Theorem D) obtains fewer than $2m$ outputs.

Verified: no
Solve time: 5m34s


Solution

Let $P_k$ denote the probability that the first $k$ parity vectors produced by Dixon's algorithm are linearly independent in $GF(2)^m$, under the usual heuristic assumption that the vectors are independent random elements of $GF(2)^m$.

As in the discussion preceding Theorem D, if $j$ vectors are independent, they span a $j$-dimensional subspace containing $2^j$ vectors. Hence the probability that the next vector is independent of them is

$$ 1-\frac{2^j}{2^m}=1-2^{j-m}. $$

Therefore

$$ P_k=\prod_{j=0}^{k-1}(1-2^{j-m}). \tag{1} $$

The exercise asks for improved estimates when the algorithm has obtained fewer than $2m$ outputs. Write

$$ k=m-r, \qquad r\ge 1. $$

Then

$$ P_{m-r} = \prod_{j=0}^{m-r-1}(1-2^{j-m}) = \prod_{t=r+1}^{m}(1-2^{-t}), \tag{2} $$

where $t=m-j$.

Now every factor satisfies $2^{-t}\le 2^{-(r+1)}$. Since $r\ge1$, all quantities $2^{-t}$ are small, so Exercise 1.2.10–21 applies:

$$ \log(1-x)=-x+O(x^2) \qquad (x\to0). $$

Taking logarithms of (2),

$$ \log P_{m-r} = \sum_{t=r+1}^{m}\log(1-2^{-t}) = -\sum_{t=r+1}^{m}2^{-t} + O!\left(\sum_{t=r+1}^{m}4^{-t}\right). $$

The geometric sums are

$$ \sum_{t=r+1}^{m}2^{-t} = 2^{-r}\bigl(1-2^{-(m-r)}\bigr), $$

and

$$ \sum_{t=r+1}^{m}4^{-t} = O(4^{-r}). $$

Hence

$$ \log P_{m-r} = -2^{-r} + O(4^{-r}) + O(2^{-m}), $$

so

$$ P_{m-r} = \exp!\left( -2^{-r} + O(4^{-r}) + O(2^{-m}) \right). \tag{3} $$

Consequently the probability that a dependence has already occurred by the time $m-r$ outputs have been collected is

$$ 1-P_{m-r} = 1- \exp!\left( -2^{-r} + O(4^{-r}) + O(2^{-m}) \right). \tag{4} $$

Using $1-e^{-x}=x+O(x^2)$ for small $x$,

$$ 1-P_{m-r} = 2^{-r} + O(4^{-r}) + O(2^{-m}). \tag{5} $$

Thus:

$$ \Pr(\text{dependence by }m-1\text{ outputs}) \approx \frac12, $$

$$ \Pr(\text{dependence by }m-2\text{ outputs}) \approx \frac14, $$

$$ \Pr(\text{dependence by }m-3\text{ outputs}) \approx \frac18, $$

and so on.

These estimates are substantially sharper than the crude bounds used in the original discussion. They show that the chance of obtaining a dependency $r$ outputs before reaching $m$ is approximately $2^{-r}$. In particular, almost all successful runs require only a small number of outputs beyond $m$, far fewer than the pessimistic bound $2m$.

$\square$