TAOCP 4.6.4 Exercise 56

Let Q_k(x_1,\ldots,x_n)=\sum_{i=1}^n\sum_{j=1}^n \tau_{ijk}x_i x_j, \qquad 1\le k\le s, be a family of quadratic forms.

Section 4.6.4: Evaluation of Polynomials

Exercise 56. [M32] (V. Strassen.) Show that any polynomial chain that evaluates a set of quadratic forms $\sum_{j=1}^{n} \tau_{ijk} x_i x_j$ for $1 \le k \le s$ must use at least $\frac{1}{2} \text{rank}(\tau_{ijk} + \tau_{jik})$ chain multiplications altogether. [Hint: Show that the minimum number of chain multiplications is the minimum rank of $(t_{ijk})$ taken over all tensors $(t_{ijk})$ such that $t_{ijk} + t_{jik} = \tau_{ijk} + \tau_{jik}$ for all $i, j, k$.] Conclude that if a polynomial chain evaluates a set of bilinear forms (47) corresponding to a tensor $(t_{ijk})$, whether normal or abnormal, it must use at least $\frac{1}{2} \text{rank}(t_{ijk})$ chain multiplications.

Verified: yes
Solve time: 1m41s


Setup

Let

$$ Q_k(x_1,\ldots,x_n)=\sum_{i=1}^n\sum_{j=1}^n \tau_{ijk}x_i x_j, \qquad 1\le k\le s, $$

be a family of quadratic forms. A polynomial chain evaluates these forms by using additions, scalar multiplications, and chain multiplications. Let $r$ be the number of chain multiplications used.

The problem is to prove that every such chain must satisfy

$$ r\ge \frac12,\operatorname{rank}(\tau_{ijk}+\tau_{jik}). $$

The hint suggests proving a more precise statement: The minimum possible number of chain multiplications equals

$$ \min \operatorname{rank}(t_{ijk}), $$

where the minimum is taken over all tensors $(t_{ijk})$ satisfying

$$ t_{ijk}+t_{jik} = \tau_{ijk}+\tau_{jik} \qquad (1\le i,j\le n,\ 1\le k\le s). \tag{1} $$

Afterward we must deduce that any chain evaluating bilinear forms corresponding to a tensor $(t_{ijk})$ uses at least

$$ \frac12,\operatorname{rank}(t_{ijk}) $$

chain multiplications.

Solution

Consider a polynomial chain that evaluates the quadratic forms $Q_k$. Let its $r$ chain multiplications be

$$ L_\nu(x),M_\nu(x), \qquad 1\le \nu\le r, $$

where each $L_\nu$ and $M_\nu$ is a linear form in the variables $x_1,\ldots,x_n$.

Since every quantity produced after the multiplications is obtained only by linear operations, each output $Q_k$ is a linear combination of these products:

$$ Q_k=\sum_{\nu=1}^{r} c_{\nu k}L_\nu M_\nu . \tag{2} $$

Write

$$ L_\nu=\sum_i a_{\nu i}x_i, \qquad M_\nu=\sum_j b_{\nu j}x_j . \tag{3} $$

Then

$$ L_\nu M_\nu = \sum_{i,j} a_{\nu i}b_{\nu j}x_i x_j. \tag{4} $$

Substituting (4) into (2),

$$ Q_k = \sum_{i,j} \left( \sum_{\nu=1}^{r} a_{\nu i}b_{\nu j}c_{\nu k} \right) x_i x_j . \tag{5} $$

Define

$$ t_{ijk} = \sum_{\nu=1}^{r} a_{\nu i}b_{\nu j}c_{\nu k}. \tag{6} $$

Equation (6) expresses $(t_{ijk})$ as a sum of $r$ simple tensors, hence

$$ \operatorname{rank}(t_{ijk})\le r. \tag{7} $$

Because the monomials $x_i x_j$ and $x_j x_i$ are identical,

$$ Q_k = \sum_{i,j} t_{ijk}x_i x_j = \sum_{i,j}\tau_{ijk}x_i x_j $$

implies

$$ t_{ijk}+t_{jik} = \tau_{ijk}+\tau_{jik} \tag{8} $$

for all $i,j,k$. Therefore every chain with $r$ multiplications determines a tensor $(t_{ijk})$ satisfying (1) and (7). Consequently

$$ r\ge \min \operatorname{rank}(t_{ijk}), \tag{9} $$

the minimum extending over all tensors satisfying (1).

To prove the converse, suppose a tensor $(t_{ijk})$ satisfying (1) has rank $R$. Then

$$ t_{ijk} = \sum_{\nu=1}^{R} a_{\nu i}b_{\nu j}c_{\nu k}. \tag{10} $$

Using $R$ chain multiplications, compute

$$ P_\nu=L_\nu M_\nu, \qquad L_\nu=\sum_i a_{\nu i}x_i, \qquad M_\nu=\sum_j b_{\nu j}x_j. $$

Form

$$ \widetilde Q_k = \sum_{\nu=1}^{R} c_{\nu k}P_\nu. $$

By (10),

$$ \widetilde Q_k = \sum_{i,j} t_{ijk}x_i x_j. \tag{11} $$

Condition (1) shows that the symmetric coefficient matrix of $\widetilde Q_k$ equals that of $Q_k$. Hence the two quadratic forms are identical:

$$ \widetilde Q_k=Q_k. $$

Thus $R$ chain multiplications suffice. Combined with (9), this proves that the minimum number of chain multiplications equals

$$ \min \operatorname{rank}(t_{ijk}), $$

over all tensors satisfying (1).

Now let

$$ s_{ijk}=\tau_{ijk}+\tau_{jik}. $$

For every admissible tensor $(t_{ijk})$,

$$ s_{ijk}=t_{ijk}+t_{jik}. \tag{12} $$

The tensor rank is subadditive, therefore

$$ \operatorname{rank}(s_{ijk}) = \operatorname{rank}(t_{ijk}+t_{jik}) \le \operatorname{rank}(t_{ijk}) + \operatorname{rank}(t_{jik}). \tag{13} $$

Since transposition of the first two indices does not change rank,

$$ \operatorname{rank}(t_{jik}) = \operatorname{rank}(t_{ijk}). \tag{14} $$

Equations (13) and (14) give

$$ \operatorname{rank}(s_{ijk}) \le 2,\operatorname{rank}(t_{ijk}), $$

hence

$$ \operatorname{rank}(t_{ijk}) \ge \frac12,\operatorname{rank}(s_{ijk}). \tag{15} $$

Taking the minimum over all admissible tensors and using the characterization already proved,

$$ r_{\min} \ge \frac12,\operatorname{rank}(\tau_{ijk}+\tau_{jik}). $$

This establishes the first assertion.

For bilinear forms corresponding to a tensor $(t_{ijk})$, the tensor is already prescribed. Any chain evaluating those bilinear forms yields, by the same argument as above, a decomposition of $(t_{ijk})$ into at most $r$ simple tensors. Hence

$$ \operatorname{rank}(t_{ijk})\le r. $$

Applying (15) with the prescribed tensor in place of the admissible tensor gives

$$ r \ge \frac12,\operatorname{rank}(t_{ijk}). $$

Therefore every polynomial chain, normal or abnormal, that evaluates the bilinear forms associated with $(t_{ijk})$ must use at least

$$ \frac12,\operatorname{rank}(t_{ijk}) $$

chain multiplications.

This completes the proof.

Verification

From a chain with $r$ multiplications, equations (2)-(6) construct a tensor $(t_{ijk})$ of rank at most $r$ satisfying the symmetry condition (1).

Conversely, any rank-$R$ decomposition (10) immediately yields a chain with exactly $R$ multiplications that evaluates the same quadratic forms.

Hence the minimum number of chain multiplications is exactly the minimum rank among tensors satisfying (1).

Finally,

$$ \tau_{ijk}+\tau_{jik} = t_{ijk}+t_{jik} $$

and tensor-rank subadditivity imply

$$ \operatorname{rank}(\tau_{ijk}+\tau_{jik}) \le 2,\operatorname{rank}(t_{ijk}), $$

which yields the lower bound

$$ r_{\min} \ge \frac12,\operatorname{rank}(\tau_{ijk}+\tau_{jik}). $$

The same inequality applied to a fixed bilinear tensor $(t_{ijk})$ gives

$$ r\ge \frac12,\operatorname{rank}(t_{ijk}). $$

Notes

The argument is the quadratic-form analogue of the tensor-rank method used earlier in Section 4.6.4 for bilinear computations. A chain multiplication produces one simple tensor, and tensor rank measures the minimum number of such simple tensors required. The factor $\tfrac12$ arises because a quadratic form depends only on the symmetrized tensor $t_{ijk}+t_{jik}$.