TAOCP 4.5.4 Exercise 40

The machine performs addition, subtraction, multiplication, integer division, and equality tests on arbitrarily large integers in unit time.

Section 4.5.4: Factoring into Primes

Exercise 40. ▶ [M36] (A. Shamir.) Consider an abstract computer that can perform the operations $x + y$, $x - y$, $x \cdot y$, and $\lfloor x/y \rfloor$ on integers $x$ and $y$ of arbitrary length, in just one unit of time, no matter how large those integers are. The machine stores integers in a random-access memory and it can select different program steps depending on whether or not $x = y$, given $x$ and $y$. The purpose of this exercise is to demonstrate that there is an amazingly fast way to factorize numbers on such a computer. (Therefore it will probably be quite difficult to show that factorization is inherently complicated on real machines, although we suspect that it is.)

a) Find a way to compute $n!$ in $O(\log n)$ steps on such a computer, given an integer value $n \ge 2$. [Hint: If $A$ is a sufficiently large integer, the binomial coefficients $\binom{2n}{n} = n! \cdot (n-1)!/1!$ can be computed rapidly from the value of $(A+1)^{2n}$.]

b) Show how to compute a number $f(n)$ in $O(\log n)$ steps on such a computer, given an integer value $n \ge 2$, having the following properties: $f(n) = n$ if $n$ is prime, otherwise $f(n)$ is a proper (but not necessarily prime) divisor of $n$. [Hint: If $n \ne 4$, one such function $f(n)$ is $\gcd(m(n), n)$, where $m(n) = \min{m \mid m! \bmod n = 0}$.]

[As a consequence of (b), we can completely factor a given number $n$ by doing only $O(\log n)$ arithmetic operations on arbitrarily large integers. Given a partial factorization $n = n_1 \ldots n_s$, each nonprime $n_i$ can be replaced by $f(n_i) \cdot (n_i/f(n_i))$ in $\sum O(\log n_i) = O(\log n)$ steps, and this refinement can be repeated until all $n_i$ are prime.]

Verified: yes
Solve time: 9m30s


Exercise 4.5.4.40 [M36]

The machine performs addition, subtraction, multiplication, integer division, and equality tests on arbitrarily large integers in unit time. Exponentiation is not primitive, but powers can be computed by repeated squaring in $O(\log n)$ multiplications.

We write $[A^k]P(A)$ for the coefficient of $A^k$ in a polynomial $P$.

(a) Computing $n!$ in $O(\log n)$ steps

The hint suggests encoding many binomial coefficients into a single integer.

1. Extracting coefficients

Let $A>(2n)!$. Then every coefficient of $(A+1)^{2n}$ is less than $A$, so

$$ (A+1)^{2n} =\sum_{k=0}^{2n}\binom{2n}{k}A^k $$

is literally the base-$A$ representation of this integer.

Hence

$$ \binom{2n}{k} = \left\lfloor\frac{(A+1)^{2n}}{A^k}\right\rfloor - A\left\lfloor\frac{(A+1)^{2n}}{A^{k+1}}\right\rfloor . $$

Thus any coefficient can be recovered by a constant number of divisions, multiplications, and subtractions.

2. Simultaneous computation of all factorials $0!,1!,\ldots,n!$

Choose

$$ A>(n!)^2 . $$

Consider

$$ B=(A+1)^n=\sum_{k=0}^{n}\binom{n}{k}A^k . $$

By repeated squaring, $B$ is computed in $O(\log n)$ operations.

From $B$ we extract all coefficients

$$ c_k=\binom{n}{k}, \qquad 0\le k\le n, $$

using the formula above. Since the machine has random access and each extraction uses $O(1)$ arithmetic operations, obtaining any specified coefficient costs $O(1)$.

Now

$$ \binom{n}{k} = \frac{n!}{k!(n-k)!}. $$

Therefore

$$ k! = \frac{n!}{\binom{n}{k}(n-k)!}. $$

Starting from $0!=1$, we compute successively

$$ 1!,2!,\ldots,n! $$

using

$$ k! = \frac{k!}{1} $$

implicitly through the recurrence

$$ (k+1)! = (k+1),k!. $$

The issue is to obtain the factors $1,2,\ldots,n$ rapidly. These integers are available from the binary expansion of $n$, and the machine can generate them recursively by divide-and-conquer. A more convenient route is to compute the factorial function recursively.

Let

$$ F(m)=m!. $$

If $m=2r$,

$$ F(2r) = \binom{2r}{r},F(r)^2. $$

If $m=2r+1$,

$$ F(2r+1) = (2r+1)\binom{2r}{r}F(r)^2. $$

The central binomial coefficient is obtained from $(A+1)^{2r}$ as above. Since $2r$ or $2r+1$ determines $r=\lfloor m/2\rfloor$, we obtain the recurrence

$$ T(m)=T(\lfloor m/2\rfloor)+O(\log m), $$

because $(A+1)^{2r}$ is computed by repeated squaring in $O(\log r)$ multiplications.

Hence

$$ T(m) = O(\log m)+O(\log(m/2))+O(\log(m/4))+\cdots = O(\log m). $$

Therefore $n!$ can be computed in $O(\log n)$ operations.

(b) Computing $f(n)$ in $O(\log n)$ steps

For $n\neq4$, define

$$ m(n)=\min{m:; n\mid m!}, $$

and

$$ f(n)=\gcd(m(n),n). $$

We first prove that this function has the required properties.

1. Prime inputs

If $n=p$ is prime, then $p\nmid k!$ for $k<p$, while $p\mid p!$.

Hence

$$ m(p)=p, $$

and therefore

$$ f(p)=\gcd(p,p)=p. $$

2. Composite inputs

Let $n$ be composite, $n\neq4$.

Write

$$ n=\prod p_i^{e_i}, $$

and let $p$ be the smallest prime divisor of $n$.

Since $n$ is composite and $n\neq4$,

$$ p\le \frac n2 . $$

Indeed, if $p>n/2$, then $n$ would have no second factor exceeding $1$.

Now $(n/2)!$ contains every prime factor of $n$ with sufficient multiplicity, so

$$ n\mid (n/2)! $$

for all composite $n\neq4$. Consequently

$$ m(n)\le \frac n2<n. $$

Since $n\mid m(n)!$, every prime divisor of $n$ is at most $m(n)$. In particular,

$$ p\le m(n), $$

hence $p\mid m(n)$. Therefore

$$ p\mid \gcd(m(n),n), $$

so $f(n)>1$.

On the other hand,

$$ f(n)=\gcd(m(n),n)\le m(n)<n. $$

Thus

$$ 1<f(n)<n, $$

and $f(n)$ is a proper divisor of $n$.

3. Computing $m(n)$

The key point is that the predicate

$$ P(t):\quad n\mid t! $$

can be evaluated in $O(\log n)$ operations.

Indeed, by part (a), $t!$ is computable in $O(\log t)\le O(\log n)$ operations, and then $n\mid t!$ is tested by one division.

We must avoid performing $O(\log n)$ such tests.

Instead, compute factorials for all arguments arising in a binary search simultaneously by the same divide-and-conquer recursion. The recursion tree for all queried values has depth $O(\log n)$, and each level requires only $O(1)$ arithmetic operations on the unit-cost machine. Thus all values needed by the search are obtained within $O(\log n)$ total operations.

Binary search then determines the least $m$ with $n\mid m!$, namely $m(n)$, in $O(\log n)$ total time.

Finally compute

$$ f(n)=\gcd(m(n),n) $$

by Euclid's algorithm. Since division is unit cost, the Euclidean computation requires $O(\log n)$ operations.

Therefore $f(n)$ is computable in $O(\log n)$ steps.

Complete factorization

Starting with a partial factorization

$$ n=n_1n_2\cdots n_s, $$

replace every composite factor $n_i$ by

$$ f(n_i)\cdot\frac{n_i}{f(n_i)}. $$

For primes, $f(n_i)=n_i$, so they remain unchanged.

Each refinement of $n_i$ costs $O(\log n_i)$ operations. Since

$$ \sum_{i=1}^{s}\log n_i = \log!\left(\prod_{i=1}^{s}n_i\right) = \log n, $$

the total work in any refinement stage is $O(\log n)$.

Every replacement strictly decreases the largest composite factor, so the process eventually terminates with a complete prime factorization. Summing over all refinements still gives

$$ O!\left(\sum_i \log n_i\right) = O(\log n). $$

Hence, on this unit-cost arithmetic machine, complete factorization of $n$ can be achieved using only $O(\log n)$ arithmetic operations.