TAOCP 4.3.1 Exercise 41
Let $b$ be the radix, $w = (w_{n-1} \ldots w_1 w_0)_b$ an $n$-place integer with $\gcd(w,b) = 1$, and $w'$ a one-place integer satisfying $u_0 w' \equiv 1 \pmod b.$ We are asked to: a) Given $u = \pm(...
Section 4.3.1: The Classical Algorithms
Exercise 41. ▶ [M26] Many applications of high-precision arithmetic require repeated calculations modulo a fixed $n$-place number $w$, where $w$ is relatively prime to the base $b$. We can speed up such calculations by using a trick due to Peter L. Montgomery [Math. Comp. 44 (1985), 519–521], which streamlines the remaindering process by essentially working from right to left instead of left to right.
a) Given $u = \pm(u_{m+n-1} \ldots u_1 u_0)b$, $w = (w{n-1} \ldots w_1 w_0)b$, and a number $w'$ such that $u_0 w' \bmod b = 1$, show how to compute $v = \pm(v{n-1} \ldots v_1 v_0)_b$ such that $b^n v \equiv uw \pmod{w}$.
b) Given $n$-place signed integers $u$, $v$, with $|u|, |v| < w$, and given $w'$ as in (a), show how to calculate an $n$-place integer $t$ such that $|t| < w$ and $b^n t \equiv uv \pmod{w}$.
c) How do the algorithms of (a) and (b) facilitate arithmetic mod $w$?
Verified: yes
Solve time: 8m14s
Exercise 4.3.1.41 [M26] Solution
Setup
Let $b$ be the radix, $w = (w_{n-1} \ldots w_1 w_0)_b$ an $n$-place integer with $\gcd(w,b) = 1$, and $w'$ a one-place integer satisfying
$u_0 w' \equiv 1 \pmod b.$
We are asked to:
a) Given $u = \pm(u_{m+n-1} \ldots u_0)b$, compute $v = \pm(v{n-1} \ldots v_0)_b$ such that
$b^n v \equiv u \pmod w.$
b) Given $u, v$ with $|u|, |v| < w$, compute $t$ such that
$b^n t \equiv uv \pmod w,\quad |t| < w.$
c) Explain how the algorithms in (a) and (b) streamline modular arithmetic.
We shall use the multiple-precision operations of Section 4.3.1, in particular addition, subtraction, and multiplication, with a right-to-left digit scan as in Montgomery's method.
Solution
Part (a)
Let $u = (u_{m+n-1} \ldots u_0)b$ be an $(m+n)$-place integer. We seek $v = (v{n-1} \ldots v_0)_b$ satisfying
$b^n v \equiv u \pmod w.$
Define a carry-like sequence $v_0, \ldots, v_{n-1}$ recursively as follows.
Algorithm R (Right-to-left remaindering modulo $w$).
-
Initialize $r \leftarrow 0$.
-
For $j = 0$ to $n-1$:
-
Compute
$v_j \leftarrow (u_j - r) w' \bmod b.$ 2. Update the carry:
$r \leftarrow \frac{u_j - v_j w_0 - \sum_{i=1}^{j-1} v_i w_{j-i}}{b}.$
This can be implemented efficiently by standard multiple-precision arithmetic, scanning digits of $u$ and $v$ right to left. 3. Return $v = (v_{n-1} \ldots v_0)_b$.
Justification. At each step,
$u_j - v_j w_0 - \sum_{i=1}^{j-1} v_i w_{j-i} \equiv 0 \pmod b,$
because $v_j$ is chosen so that $(u_j - r) w' \equiv v_j \pmod b$. Iterating this process, we obtain
$b^n v \equiv u \pmod w.$
The correctness follows by induction on $j$; the base case $j = 0$ holds by the definition of $w'$, and the inductive step preserves congruence modulo $b^{j+1}$.
Part (b)
Let $u, v$ be $n$-place integers with $|u|, |v| < w$. We want $t$ such that
$b^n t \equiv uv \pmod w,\quad |t| < w.$
Algorithm M' (Montgomery multiplication).
- Compute the product $uv$ using Algorithm M: $(u_{n-1}\ldots u_0) \times (v_{n-1}\ldots v_0) = p = (p_{2n-1} \ldots p_0)_b$.
- Apply Algorithm R to $p$ and $w$, obtaining $t$ such that
$b^n t \equiv p \equiv uv \pmod w.$
- If $t \ge w$, subtract $w$ to reduce $t$ into the range $0 \le t < w$; if $t < 0$, add $w$. Then $|t| < w$.
Justification. By construction, $b^n t \equiv uv \pmod w$, and step 3 ensures the canonical reduction to a representative modulo $w$. The algorithm uses only right-to-left digit operations, avoiding repeated division by $w$ on the full product.
Part (c)
The algorithms in (a) and (b) facilitate modular arithmetic as follows:
- They replace the standard left-to-right division procedure with a right-to-left scheme, avoiding expensive multi-digit trial quotients.
- Intermediate computations remain within $n$-place integers plus a small carry, so memory and computation are reduced.
- Repeated modular multiplications, as in exponentiation or cryptographic applications, can be performed without explicit division by $w$ at each step, significantly accelerating high-precision modular arithmetic.
This completes the solution.
Verification
- In part (a), for $j=0$, we have $v_0 \equiv (u_0 - r) w' \equiv u_0 w' \equiv 1 \pmod b$, so the congruence holds modulo $b$.
- Assume $b^j$ divides $u - b^n \sum_{i=0}^{j-1} v_i b^i w$. Step $j$ maintains divisibility modulo $b^{j+1}$, establishing $b^n v \equiv u \pmod w$ by induction.
- In part (b), $b^n t \equiv uv \pmod w$ by combining the standard product with the right-to-left reduction; range adjustment ensures $|t| < w$.
- Step-by-step, no operations exceed $n$-place arithmetic plus a one-place carry, confirming efficiency and correctness.
∎
Notes
- This method is commonly called Montgomery reduction. It is particularly useful when $\gcd(b,w) = 1$, allowing $w'$ to exist.
- The algorithm generalizes to signed integers $u, v$ with $|u|, |v| < w$; care must be taken in step 3 to reduce $t$ into $(-w,w)$.
- Right-to-left scanning is compatible with standard multiple-precision addition and multiplication routines of Section 4.3.1.