TAOCP 4.6.1 Exercise 22
Let $f_0(x),f_1(x),\ldots,f_m(x)$ be the Sturm sequence associated with a squarefree polynomial $f_0(x)$, defined by $f_{i-1}(x)=q_i(x)f_i(x)-f_{i+1}(x)\qquad(1\le i<m),$ where $f_m(x)$ is a nonzero c...
Section 4.6.1: Division of Polynomials
Exercise 22. [M23] Prove Sturm's theorem. [Hint: Some sign sequences are impossible.]
Verified: yes
Solve time: 2m05s
Solution
Let
$f_0(x),f_1(x),\ldots,f_m(x)$
be the Sturm sequence associated with a squarefree polynomial $f_0(x)$, defined by
$f_{i-1}(x)=q_i(x)f_i(x)-f_{i+1}(x)\qquad(1\le i<m),$
where $f_m(x)$ is a nonzero constant and where each $f_{i+1}(x)$ is the negative of the remainder obtained when $f_{i-1}(x)$ is divided by $f_i(x)$.
For any real number $t$ that is not a zero of any $f_i$, let
$V(t)$
denote the number of sign changes in the sequence
$f_0(t),f_1(t),\ldots,f_m(t).$
Sturm's theorem states that if $a<b$ and neither endpoint is a zero of $f_0$, then the number of distinct real zeros of $f_0(x)$ in $(a,b)$ is
$V(a)-V(b).$
The proof rests on determining how $V(t)$ changes as $t$ varies.
Consider first a point $\alpha$ that is not a zero of any member of the Sturm sequence. Since all $f_i$ are continuous, the signs of all $f_i(t)$ remain unchanged in a sufficiently small neighborhood of $\alpha$. Hence $V(t)$ is locally constant there.
Next suppose that $\alpha$ is a zero of some $f_i$ with $0<i<m$. Since
$f_{i-1}=q_i f_i-f_{i+1},$
substitution of $x=\alpha$ gives
$f_{i-1}(\alpha)=-f_{i+1}(\alpha).$
Thus neither $f_{i-1}(\alpha)$ nor $f_{i+1}(\alpha)$ is zero, because otherwise $f_i$ and $f_{i-1}$ would have a common divisor and the Euclidean algorithm would not terminate with a nonzero constant. Therefore the neighboring terms have opposite signs.
Near $\alpha$, the relevant portion of the sign sequence has the form
$\cdots,\sigma,\pm,!-\sigma,\cdots,$
where $\sigma=\operatorname{sgn}(f_{i-1}(\alpha))$ and the middle sign is that of $f_i(t)$.
If $f_i(t)$ is positive, the pattern is
$\sigma,+,-\sigma;$
if $f_i(t)$ is negative, the pattern is
$\sigma,-,-\sigma.$
The numbers of sign changes in these two patterns are both equal to $1$. Hence $V(t)$ is unchanged when $t$ passes through a zero of an intermediate polynomial $f_i$.
The remaining possibility is that $\alpha$ is a zero of $f_0$. Since $f_0$ is squarefree, $f_1=f_0'$ does not vanish at $\alpha$. Therefore $f_0(t)$ changes sign when $t$ passes through $\alpha$, while $f_1(t)$ keeps a fixed nonzero sign.
The initial portion of the sign sequence therefore changes from either
$+,+,\ldots$
to
$-,+,\ldots,$
or from
$+,-,\ldots$
to
$-,-,\ldots.$
In the first case the number of sign changes increases by $1$ when $t$ moves from left to right through $\alpha$; in the second case it decreases by $1$. To determine which occurs, observe that a simple root of $f_0$ satisfies
$f_0(t)=f_1(\alpha)(t-\alpha)+O((t-\alpha)^2).$
Hence
$$= \operatorname{sgn}(f_1(\alpha)),\operatorname{sgn}(t-\alpha)$$
for $t$ sufficiently close to $\alpha$.
If $f_1(\alpha)>0$, the signs are
$(-,+)\quad\text{for }t<\alpha,\qquad(+,+)\quad\text{for }t>\alpha;$
the number of sign changes decreases from $1$ to $0$.
If $f_1(\alpha)<0$, the signs are
$(+,-)\quad\text{for }t<\alpha,\qquad(-,-)\quad\text{for }t>\alpha;$
again the number of sign changes decreases from $1$ to $0$.
Thus every real root of $f_0$ causes $V(t)$ to decrease by exactly $1$ as $t$ passes from left to right.
We have proved:
- $V(t)$ is constant except at zeros of $f_0$.
- Passing through a zero of some $f_i$ with $i>0$ leaves $V(t)$ unchanged.
- Passing through a simple zero of $f_0$ decreases $V(t)$ by exactly $1$.
As $t$ moves from $a$ to $b$, the total decrease of $V(t)$ is therefore precisely the number of distinct real zeros of $f_0$ in $(a,b)$. Consequently,
$$ V(a)-V(b) = #{\text{real zeros of }f_0(x)\text{ in }(a,b)}. $$
This is Sturm's theorem.
This completes the proof.
∎
Notes
The hint refers to the fact that at a zero of an intermediate polynomial $f_i$, the neighboring values satisfy
$f_{i-1}(\alpha)=-f_{i+1}(\alpha),$
so sign patterns such as
$+,0,+ \qquad\text{or}\qquad -,0,-$
cannot occur. Only the patterns
$+,0,- \qquad\text{or}\qquad -,0,+$
are possible, and these leave the variation count unchanged.