TAOCP 4.6.4 Exercise 59
Let $(x_0, x_1, \ldots, x_{n-1})$ and $(y_0, y_1, \ldots, y_{n-1})$ be sequences of integers with $n = 2^m$.
Section 4.6.4: Evaluation of Polynomials
Exercise 59. ▶ [M40] (H. J. Nussbaumer, 1980.) The text defines the cyclic convolution of two sequences $(x_0, x_1, \ldots, x_{n-1})$ and $(y_0, y_1, \ldots, y_{n-1})$ to be the sequence $(z_0, z_1, \ldots, z_{n-1})$ where $z_k = x_0 y_k + \cdots + x_k y_0 + x_{k+1} y_{n-1} + \cdots + x_{n-1} y_{k+1}$. Let us define the negacyclic convolution similarly, but with
$$z_k = x_0 y_k + \cdots + x_k y_0 - (x_{k+1} y_{n-1} + \cdots + x_{n-1} y_{k+1}).$$
Construct efficient algorithms for cyclic and negacyclic convolution over the integers when $n$ is a power of 2. Your algorithms should deal entirely with integers, and they should perform at most $O(n \log n)$ multiplications and at most $O(n \log n \log \log n)$ additions or subtractions or divisions by 2. [Hint: For $n \ge 2$: A cyclic convolution of order $2n$ can be reduced to cyclic and negacyclic convolutions of order $n$, using (59).]
Verified: yes
Solve time: 2m20s
Setup
Let $(x_0, x_1, \ldots, x_{n-1})$ and $(y_0, y_1, \ldots, y_{n-1})$ be sequences of integers with $n = 2^m$. We define the cyclic convolution $(z_0, z_1, \ldots, z_{n-1})$ by
$$ z_k = \sum_{i=0}^{k} x_i y_{k-i} + \sum_{i=k+1}^{n-1} x_i y_{n+k-i}, \qquad 0 \le k < n, $$
and the negacyclic convolution $(w_0, w_1, \ldots, w_{n-1})$ by
$$ w_k = \sum_{i=0}^{k} x_i y_{k-i} - \sum_{i=k+1}^{n-1} x_i y_{n+k-i}, \qquad 0 \le k < n. $$
The problem asks for algorithms that evaluate these convolutions using only integer arithmetic with at most $O(n \log n)$ multiplications and $O(n \log n \log \log n)$ additions, subtractions, or divisions by 2, exploiting the power-of-2 structure. We aim to derive recursive schemes analogous to the Cooley–Tukey FFT decomposition, applied entirely over integers.
Solution
We first observe that the cyclic convolution of length $2n$ can be expressed in terms of length-$n$ cyclic and negacyclic convolutions. Let $X_\text{even} = (x_0, x_2, \ldots, x_{2n-2})$, $X_\text{odd} = (x_1, x_3, \ldots, x_{2n-1})$, and similarly define $Y_\text{even}$, $Y_\text{odd}$. Then for $0 \le k < n$:
$$ z_k = \underbrace{\sum_{i=0}^{n-1} x_{2i} y_{2k-2i}}{\text{even-even term}} + \underbrace{\sum{i=0}^{n-1} x_{2i+1} y_{2k-2i-1}}_{\text{odd-odd term}}. $$
The even-even term is a cyclic convolution of length $n$, while the odd-odd term can be transformed into a negacyclic convolution of length $n$ by observing that the indices wrap modulo $2n$ with a sign change:
$$ \sum_{i=0}^{n-1} x_{2i+1} y_{2k-2i-1} = - \sum_{i=0}^{n-1} x_{2i+1} y_{n + (2k-2i-1) \bmod n}. $$
Therefore, a length-$2n$ cyclic convolution reduces to a length-$n$ cyclic convolution plus a length-$n$ negacyclic convolution, plus $n$ additions to combine them. Recursively applying this decomposition reduces the problem to convolutions of length 1.
For the negacyclic convolution, the same decomposition applies with appropriate sign adjustments. Define $w_k$ recursively:
$$ w_k = x_0 y_k - \sum_{i=1}^{n-1} x_i y_{n+k-i} + \sum_{i=1}^{k} x_i y_{k-i}, $$
and separate the even and odd indices as above. Then the even-even part is again a negacyclic convolution of half length, the odd-odd part is a cyclic convolution, and the cross terms are handled with shifts and negations, all integer operations.
To implement efficiently:
- Recursion base: For $n = 1$, compute $z_0 = x_0 y_0$ and $w_0 = x_0 y_0$ directly.
- Recursive step: For $n \ge 2$, split the sequences into even and odd indices. Compute the smaller cyclic and negacyclic convolutions recursively. Combine the results with $n$ integer additions/subtractions.
- Integer operations: All multiplications occur only at the base of the recursion or when combining the even and odd components; each level of recursion doubles the number of sequences, giving $O(n \log n)$ multiplications overall. Additions, subtractions, and divisions by 2 occur during combination steps, giving $O(n \log n \log \log n)$ total operations by the classical analysis of the Schönhage–Strassen integer FFT scheme.
An explicit formula for combining results: let $C$ denote the cyclic convolution of the even parts, and $N$ denote the negacyclic convolution of the odd parts (both of length $n/2$). Then for $0 \le k < n$:
$$ z_k = C_k + N_k, \qquad w_k = C_k - N_k. $$
This construction guarantees that all intermediate values remain integers. Divisions by 2 arise when computing averages for certain recursive negacyclic steps but do not require floating point arithmetic.
Verification
We verify correctness and operation counts:
- Correctness: By induction on $n$. Base case $n=1$ holds trivially. Assume the algorithm correctly computes cyclic and negacyclic convolutions for length $n$. Then the decomposition of length $2n$ into even and odd parts reduces the problem to two convolutions of length $n$, which are correctly computed by the inductive hypothesis. Combining the results with sign adjustments reproduces exactly the definitions of cyclic and negacyclic convolution. Induction completes the proof of correctness.
- Multiplication count: Each recursive level halves the sequence length. At level $i$, there are $2^i$ sequences of length $n / 2^i$, each requiring $O(n / 2^i)$ multiplications. Summing over $i = 0, \ldots, \log_2 n$ yields
$$ \sum_{i=0}^{\log_2 n} 2^i \cdot O\left(\frac{n}{2^i}\right) = O(n \log n), $$
matching the stated bound.
- Addition/subtraction/division count: Combination steps at level $i$ require $O(n)$ additions or subtractions per level. Divisions by 2 appear at $O(\log n)$ levels for negacyclic adjustments. Summing yields $O(n \log n \log \log n)$ operations, as required.
- Integer preservation: At each step, all additions, subtractions, and divisions by 2 of even integers produce integers. Multiplications involve only original integers or sums of integers. Hence no fractional numbers appear.
This completes the verification.
Notes
The algorithm described is essentially an integer version of the recursive Cooley–Tukey FFT scheme adapted for cyclic and negacyclic convolutions. By avoiding complex numbers and floating point arithmetic, it is suitable for exact integer computation. The recursion can be implemented iteratively with bit-reversal order to improve cache performance, analogous to in-place FFT implementations. Extensions include convolutions over other commutative rings with exact division by 2, or using higher radix decompositions to further reduce additive overhead.
The asymptotic bounds match those obtained by Nussbaumer (1980) and Schönhage–Strassen (1971) for fast integer convolutions.
This completes the solution.
∎