TAOCP 4.6.4 Exercise 20
Let P(x)=u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5.
Section 4.6.4: Evaluation of Polynomials
Exercise 20. ▶ [21] Write a MIX program that evaluates a fifth-degree polynomial according to scheme (11); try to make the program as efficient as possible, by making slight modifications to (11). Use MIX's floating point arithmetic operators FADD and FMUL, which are described in Section 4.2.1.
Verified: yes
Solve time: 4m11s
Solution
Let
$$ P(x)=u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5. $$
Scheme (11) in §4.6.4 is Horner's rule:
$$ P(x) = \bigl(\bigl(\bigl(\bigl(u_0x+u_1\bigr)x+u_2\bigr)x+u_3\bigr)x+u_4\bigr)x+u_5. $$
A direct implementation requires five multiplications by $x$ and five additions.
The exercise asks for a fifth-degree polynomial and suggests making slight modifications to (11) in order to improve efficiency. Since the degree is fixed, we can group two Horner steps at a time by introducing
$$ y=x^2. $$
Then
$$ \begin{aligned} P(x) &= u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5 \ &= x(u_0y^2+u_2y+u_4) +(u_1y^2+u_3y+u_5). \end{aligned} $$
Define
$$ A(y)=u_0y^2+u_2y+u_4, \qquad B(y)=u_1y^2+u_3y+u_5. $$
Both $A$ and $B$ are quadratic polynomials in $y$, hence
$$ A(y)=(u_0y+u_2)y+u_4, \qquad B(y)=(u_1y+u_3)y+u_5. $$
The evaluation can therefore proceed as follows:
- Compute $y=x^2$.
- Evaluate $A(y)$.
- Evaluate $B(y)$.
- Form $xA(y)+B(y)$.
This uses
$$ 1+2+2+1=6 $$
floating multiplications and
$$ 2+2+1=5 $$
floating additions.
The number of additions is unchanged, but the evaluation of the two quadratics is independent and allows better use of registers and memory. In MIX this leads to a compact program.
Assume:
XVALcontains $x$,U0,U1,U2,U3,U4,U5contain the coefficients,Y,A,Bare temporary storage locations.
All quantities are floating-point numbers.
LDA XVAL
FMUL XVAL * y := x^2
STA Y
* A = (u0*y + u2)*y + u4
LDA U0
FMUL Y
FADD U2
FMUL Y
FADD U4
STA A
* B = (u1*y + u3)*y + u5
LDA U1
FMUL Y
FADD U3
FMUL Y
FADD U5
STA B
* P(x) = x*A + B
LDA A
FMUL XVAL
FADD B
HLT
Every FMUL and FADD has a memory operand, exactly as required by the MIX floating-point instruction set.
To verify correctness, let the value stored in A be
$$ A=(u_0y+u_2)y+u_4=u_0y^2+u_2y+u_4, $$
and the value stored in B be
$$ B=(u_1y+u_3)y+u_5=u_1y^2+u_3y+u_5. $$
The final accumulator value is
$$ xA+B = x(u_0y^2+u_2y+u_4) +(u_1y^2+u_3y+u_5). $$
Substituting $y=x^2$ gives
$$ u_0x^5+u_1x^4+u_2x^3+u_3x^2+u_4x+u_5 = P(x). $$
Thus the program correctly evaluates the fifth-degree polynomial.
The slight modification of Horner's scheme consists of treating the polynomial as a linear polynomial in $x$ whose coefficients are quadratic polynomials in $x^2$. This yields an efficient MIX floating-point program using only valid FMUL and FADD instructions with memory operands. ∎