TAOCP 4.6.1 Exercise 22

Let $f_0(x),f_1(x),\ldots,f_m(x)$ be the Sturm sequence associated with a squarefree polynomial $f_0(x)$, defined by $f_{i-1}(x)=q_i(x)f_i(x)-f_{i+1}(x)\qquad(1\le i<m),$ where $f_m(x)$ is a nonzero c...

Section 4.6.1: Division of Polynomials

Exercise 22. [M23] Prove Sturm's theorem. [Hint: Some sign sequences are impossible.]

Verified: yes
Solve time: 2m05s


Solution

Let

$f_0(x),f_1(x),\ldots,f_m(x)$

be the Sturm sequence associated with a squarefree polynomial $f_0(x)$, defined by

$f_{i-1}(x)=q_i(x)f_i(x)-f_{i+1}(x)\qquad(1\le i<m),$

where $f_m(x)$ is a nonzero constant and where each $f_{i+1}(x)$ is the negative of the remainder obtained when $f_{i-1}(x)$ is divided by $f_i(x)$.

For any real number $t$ that is not a zero of any $f_i$, let

$V(t)$

denote the number of sign changes in the sequence

$f_0(t),f_1(t),\ldots,f_m(t).$

Sturm's theorem states that if $a<b$ and neither endpoint is a zero of $f_0$, then the number of distinct real zeros of $f_0(x)$ in $(a,b)$ is

$V(a)-V(b).$

The proof rests on determining how $V(t)$ changes as $t$ varies.

Consider first a point $\alpha$ that is not a zero of any member of the Sturm sequence. Since all $f_i$ are continuous, the signs of all $f_i(t)$ remain unchanged in a sufficiently small neighborhood of $\alpha$. Hence $V(t)$ is locally constant there.

Next suppose that $\alpha$ is a zero of some $f_i$ with $0<i<m$. Since

$f_{i-1}=q_i f_i-f_{i+1},$

substitution of $x=\alpha$ gives

$f_{i-1}(\alpha)=-f_{i+1}(\alpha).$

Thus neither $f_{i-1}(\alpha)$ nor $f_{i+1}(\alpha)$ is zero, because otherwise $f_i$ and $f_{i-1}$ would have a common divisor and the Euclidean algorithm would not terminate with a nonzero constant. Therefore the neighboring terms have opposite signs.

Near $\alpha$, the relevant portion of the sign sequence has the form

$\cdots,\sigma,\pm,!-\sigma,\cdots,$

where $\sigma=\operatorname{sgn}(f_{i-1}(\alpha))$ and the middle sign is that of $f_i(t)$.

If $f_i(t)$ is positive, the pattern is

$\sigma,+,-\sigma;$

if $f_i(t)$ is negative, the pattern is

$\sigma,-,-\sigma.$

The numbers of sign changes in these two patterns are both equal to $1$. Hence $V(t)$ is unchanged when $t$ passes through a zero of an intermediate polynomial $f_i$.

The remaining possibility is that $\alpha$ is a zero of $f_0$. Since $f_0$ is squarefree, $f_1=f_0'$ does not vanish at $\alpha$. Therefore $f_0(t)$ changes sign when $t$ passes through $\alpha$, while $f_1(t)$ keeps a fixed nonzero sign.

The initial portion of the sign sequence therefore changes from either

$+,+,\ldots$

to

$-,+,\ldots,$

or from

$+,-,\ldots$

to

$-,-,\ldots.$

In the first case the number of sign changes increases by $1$ when $t$ moves from left to right through $\alpha$; in the second case it decreases by $1$. To determine which occurs, observe that a simple root of $f_0$ satisfies

$f_0(t)=f_1(\alpha)(t-\alpha)+O((t-\alpha)^2).$

Hence

$$= \operatorname{sgn}(f_1(\alpha)),\operatorname{sgn}(t-\alpha)$$

for $t$ sufficiently close to $\alpha$.

If $f_1(\alpha)>0$, the signs are

$(-,+)\quad\text{for }t<\alpha,\qquad(+,+)\quad\text{for }t>\alpha;$

the number of sign changes decreases from $1$ to $0$.

If $f_1(\alpha)<0$, the signs are

$(+,-)\quad\text{for }t<\alpha,\qquad(-,-)\quad\text{for }t>\alpha;$

again the number of sign changes decreases from $1$ to $0$.

Thus every real root of $f_0$ causes $V(t)$ to decrease by exactly $1$ as $t$ passes from left to right.

We have proved:

  1. $V(t)$ is constant except at zeros of $f_0$.
  2. Passing through a zero of some $f_i$ with $i>0$ leaves $V(t)$ unchanged.
  3. Passing through a simple zero of $f_0$ decreases $V(t)$ by exactly $1$.

As $t$ moves from $a$ to $b$, the total decrease of $V(t)$ is therefore precisely the number of distinct real zeros of $f_0$ in $(a,b)$. Consequently,

$$ V(a)-V(b) = #{\text{real zeros of }f_0(x)\text{ in }(a,b)}. $$

This is Sturm's theorem.

This completes the proof.

Notes

The hint refers to the fact that at a zero of an intermediate polynomial $f_i$, the neighboring values satisfy

$f_{i-1}(\alpha)=-f_{i+1}(\alpha),$

so sign patterns such as

$+,0,+ \qquad\text{or}\qquad -,0,-$

cannot occur. Only the patterns

$+,0,- \qquad\text{or}\qquad -,0,+$

are possible, and these leave the variation count unchanged.