TAOCP 4.3.1 Exercise 41

Section 4.3.1: The Classical Algorithms

Exercise 41. ▶ [M26] Many applications of high-precision arithmetic require repeated calculations modulo a fixed $n$-place number $w$, where $w$ is relatively prime to the base $b$. We can speed up such calculations by using a trick due to Peter L. Montgomery [Math. Comp. 44 (1985), 519–521], which streamlines the remaindering process by essentially working from right to left instead of left to right.

a) Given $u = \pm(u_{m+n-1} \ldots u_1 u_0)b$, $w = (w{n-1} \ldots w_1 w_0)b$, and a number $w'$ such that $u_0 w' \bmod b = 1$, show how to compute $v = \pm(v{n-1} \ldots v_1 v_0)_b$ such that $b^n v \equiv uw \pmod{w}$.

b) Given $n$-place signed integers $u$, $v$, with $|u|, |v| < w$, and given $w'$ as in (a), show how to calculate an $n$-place integer $t$ such that $|t| < w$ and $b^n t \equiv uv \pmod{w}$.

c) How do the algorithms of (a) and (b) facilitate arithmetic mod $w$?

Verified: yes
Solve time: 8m14s

Exercise 4.3.1.41 [M26] Solution

Setup

Let $b$ be the radix, $w = (w_{n-1} \ldots w_1 w_0)_b$ an $n$-place integer with $\gcd(w,b) = 1$, and $w'$ a one-place integer satisfying

$u_0 w' \equiv 1 \pmod b.$

We are asked to:

a) Given $u = \pm(u_{m+n-1} \ldots u_0)b$, compute $v = \pm(v{n-1} \ldots v_0)_b$ such that

$b^n v \equiv u \pmod w.$

b) Given $u, v$ with $|u|, |v| < w$, compute $t$ such that

$b^n t \equiv uv \pmod w,\quad |t| < w.$

c) Explain how the algorithms in (a) and (b) streamline modular arithmetic.

We shall use the multiple-precision operations of Section 4.3.1, in particular addition, subtraction, and multiplication, with a right-to-left digit scan as in Montgomery's method.

Solution

Part (a)

Let $u = (u_{m+n-1} \ldots u_0)b$ be an $(m+n)$-place integer. We seek $v = (v{n-1} \ldots v_0)_b$ satisfying

$b^n v \equiv u \pmod w.$

Define a carry-like sequence $v_0, \ldots, v_{n-1}$ recursively as follows.

Algorithm R (Right-to-left remaindering modulo $w$).

Initialize $r \leftarrow 0$.
For $j = 0$ to $n-1$:
Compute

$v_j \leftarrow (u_j - r) w' \bmod b.$ 2. Update the carry:

$r \leftarrow \frac{u_j - v_j w_0 - \sum_{i=1}^{j-1} v_i w_{j-i}}{b}.$

This can be implemented efficiently by standard multiple-precision arithmetic, scanning digits of $u$ and $v$ right to left. 3. Return $v = (v_{n-1} \ldots v_0)_b$.

Justification. At each step,

$u_j - v_j w_0 - \sum_{i=1}^{j-1} v_i w_{j-i} \equiv 0 \pmod b,$

because $v_j$ is chosen so that $(u_j - r) w' \equiv v_j \pmod b$. Iterating this process, we obtain

$b^n v \equiv u \pmod w.$

The correctness follows by induction on $j$; the base case $j = 0$ holds by the definition of $w'$, and the inductive step preserves congruence modulo $b^{j+1}$.

Part (b)

Let $u, v$ be $n$-place integers with $|u|, |v| < w$. We want $t$ such that

$b^n t \equiv uv \pmod w,\quad |t| < w.$

Algorithm M' (Montgomery multiplication).

Compute the product $uv$ using Algorithm M: $(u_{n-1}\ldots u_0) \times (v_{n-1}\ldots v_0) = p = (p_{2n-1} \ldots p_0)_b$.
Apply Algorithm R to $p$ and $w$, obtaining $t$ such that

$b^n t \equiv p \equiv uv \pmod w.$

If $t \ge w$, subtract $w$ to reduce $t$ into the range $0 \le t < w$; if $t < 0$, add $w$. Then $|t| < w$.

Justification. By construction, $b^n t \equiv uv \pmod w$, and step 3 ensures the canonical reduction to a representative modulo $w$. The algorithm uses only right-to-left digit operations, avoiding repeated division by $w$ on the full product.

Part (c)

The algorithms in (a) and (b) facilitate modular arithmetic as follows:

They replace the standard left-to-right division procedure with a right-to-left scheme, avoiding expensive multi-digit trial quotients.
Intermediate computations remain within $n$-place integers plus a small carry, so memory and computation are reduced.
Repeated modular multiplications, as in exponentiation or cryptographic applications, can be performed without explicit division by $w$ at each step, significantly accelerating high-precision modular arithmetic.

This completes the solution.

Verification

In part (a), for $j=0$, we have $v_0 \equiv (u_0 - r) w' \equiv u_0 w' \equiv 1 \pmod b$, so the congruence holds modulo $b$.
Assume $b^j$ divides $u - b^n \sum_{i=0}^{j-1} v_i b^i w$. Step $j$ maintains divisibility modulo $b^{j+1}$, establishing $b^n v \equiv u \pmod w$ by induction.
In part (b), $b^n t \equiv uv \pmod w$ by combining the standard product with the right-to-left reduction; range adjustment ensures $|t| < w$.
Step-by-step, no operations exceed $n$-place arithmetic plus a one-place carry, confirming efficiency and correctness.

∎

Notes

This method is commonly called Montgomery reduction. It is particularly useful when $\gcd(b,w) = 1$, allowing $w'$ to exist.
The algorithm generalizes to signed integers $u, v$ with $|u|, |v| < w$; care must be taken in step 3 to reduce $t$ into $(-w,w)$.
Right-to-left scanning is compatible with standard multiple-precision addition and multiplication routines of Section 4.3.1.