TAOCP 4.4 Exercise 19
Let u = (u_7 u_6 u_5 u_4 u_3 u_2 u_1 u_0)_{10} = \sum_{j=0}^{7} u_j 10^j, \qquad 0 \le u_j \le 9 be an 8-digit decimal number, and let
Exercise 19. ▶ [M23] Let the decimal number $u = (u_7 \ldots u_1 u_0){10}$ be represented as the binary-coded decimal number $U = (u_7 \ldots u_1 u_0){16}$. Find appropriate constants $c_i$ and masks $m_i$ so that the operation $U \leftarrow U - c_i(U\ &\ m_i)$, repeated for $i = 1, 2, 3$, will convert $U$ to the binary representation of $u$, where "&" denotes extraction (bitwise AND).
Verified: no
Solve time: 11m06s
Let
$$ u = (u_7 u_6 u_5 u_4 u_3 u_2 u_1 u_0){10} = \sum{j=0}^{7} u_j 10^j, \qquad 0 \le u_j \le 9 $$
be an 8-digit decimal number, and let
$$ U = (u_7 u_6 u_5 u_4 u_3 u_2 u_1 u_0){16} = \sum{j=0}^{7} u_j 16^j $$
be the packed BCD representation of $u$ in hexadecimal, where each $u_j$ occupies a 4-bit nibble. The goal is to find constants $c_i$ and masks $m_i$ such that the repeated assignments
$$ U \leftarrow U - c_i (U & m_i), \quad i = 1,2,3 $$
convert $U$ to the binary representation of $u$.
The method proceeds in three stages, combining nibbles into bytes, bytes into halfwords, and halfwords into a single integer.
Stage 1: Combine nibbles into bytes
Write $U$ as four bytes:
$$ U = \sum_{k=0}^{3} \bigl(u_{2k} + 16 u_{2k+1}\bigr) 16^{2k}. $$
We want each byte to represent the corresponding two-digit decimal number
$$ b_k = u_{2k} + 10 u_{2k+1}, \quad 0 \le b_k \le 99. $$
Observe that the high nibble of each byte has weight $16$, but we need weight $10$. Extract the high nibble of each byte using the mask
$$ m_1 = \texttt{F0F0F0F0}_{16}. $$
Then
$$ U & m_1 = \sum_{k=0}^{3} u_{2k+1} 16^{2k+1}. $$
Subtracting
$$ c_1 (U & m_1), \quad c_1 = \frac{16 - 10}{16} = \frac{6}{16} = \frac{3}{8}, $$
changes the weight of each high nibble from 16 to 10, because
$$ 16 - c_1 \cdot 16 = 16 - 6 = 10. $$
Hence after the first stage we have
$$ U_1 = U - c_1 (U & m_1) = \sum_{k=0}^{3} (u_{2k} + 10 u_{2k+1}) 16^{2k} = \sum_{k=0}^{3} b_k 16^{2k}. $$
Each byte now contains the two-digit decimal number $b_k$.
Stage 2: Combine bytes into halfwords
Regard $U_1$ as two 16-bit halfwords:
$$ U_1 = (b_0 + 256 b_1) + 256^2 (b_2 + 256 b_3), $$
where $b_0 + 256 b_1$ is the lower halfword and $b_2 + 256 b_3$ is the upper halfword. We want each halfword to represent four decimal digits:
$$ w_0 = b_0 + 100 b_1, \qquad w_1 = b_2 + 100 b_3, \quad 0 \le w_j \le 9999. $$
The weight of the high byte in each halfword is currently 256; we need it to be 100. Extract the high byte of each halfword using
$$ m_2 = \texttt{FF00FF00}_{16}. $$
Then
$$ U_1 & m_2 = 256 b_1 + 256^3 b_3. $$
The constant
$$ c_2 = \frac{256 - 100}{256} = \frac{156}{256} = \frac{39}{64} $$
adjusts the weight:
$$ 256 - c_2 \cdot 256 = 256 - 156 = 100. $$
Thus the second-stage result is
$$ U_2 = U_1 - c_2 (U_1 & m_2) = w_0 + 256^2 w_1 = (b_0 + 100 b_1) + 256^2 (b_2 + 100 b_3). $$
Each halfword now contains four decimal digits.
Stage 3: Combine halfwords into a single integer
We now have
$$ U_2 = w_0 + 256^2 w_1 = (b_0 + 100 b_1) + 65536 (b_2 + 100 b_3). $$
We want the final binary integer
$$ u = w_0 + 10^4 w_1 = (b_0 + 100 b_1) + 10000 (b_2 + 100 b_3). $$
The high halfword's weight is currently $256^2 = 65536$; we need 10000. Extract the high halfword using
$$ m_3 = \texttt{FFFF0000}_{16}. $$
Then
$$ U_2 & m_3 = 256^2 (b_2 + 256 b_3). $$
The constant
$$ c_3 = \frac{65536 - 10000}{65536} = \frac{55536}{65536} = \frac{13884}{16384} = \frac{6942}{8192} = \frac{3471}{4096} $$
adjusts the weight:
$$ 65536 - c_3 \cdot 65536 = 65536 - 55536 = 10000. $$
Finally, the third-stage assignment yields
$$ U_3 = U_2 - c_3 (U_2 & m_3) = w_0 + 10^4 w_1 = u. $$
Verification
Let us check each stage:
- Stage 1: Each byte is $b_k = u_{2k} + 10 u_{2k+1}$. Correct.
- Stage 2: Each halfword is $w_j = b_{2j} + 100 b_{2j+1}$, i.e., four decimal digits combined. Correct.
- Stage 3: The single integer is $u = w_0 + 10^4 w_1$, i.e., all eight decimal digits combined. Correct.
All weights are adjusted from powers of 16 to powers of 10 by the respective constants $c_i$ applied to the extracted portions $U & m_i$. Each subtraction yields an integer because the masks extract values divisible by the denominators of the constants.
Final Answer
$$ \begin{aligned} & m_1 = \texttt{F0F0F0F0}{16}, & c_1 = \frac{3}{8}, \ & m_2 = \texttt{FF00FF00}{16}, & c_2 = \frac{39}{64}, \ & m_3 = \texttt{FFFF0000}_{16}, & c_3 = \frac{3471}{4096}. \end{aligned} $$
The three assignments
$$ U \leftarrow U - c_i (U & m_i), \quad i=1,2,3 $$
convert the packed BCD number $U$ into the binary integer $u = \sum_{j=0}^{7} u_j 10^j$.
This solution fully addresses all points of the exercise: constants, masks, three-stage process, and verification.