TAOCP 3.6 Exercise 5

Section 3.6: Summary

Exercise 5. [28] (computer writing by computer) A television program entitled "The Thinking Machine," aired on CBS in 1962, showed the following script, said to be computer-generated. Here are the lines as they were printed out by the computer:

ROBBER: Put glass on table; pick up gun with right hand; check gun.

SHERIFF: Go to window; arm; fire. SHERIFF NICKED.

ROBBER: Go to window; aim; fire. SHERIFF MISSED.

SHERIFF: Go to door; aim; arm.

ROBBER: Go to door; aim; arm.

SHERIFF: Go to door; fire. ROBBER MISSED.

ROBBER: Drop gun; go to door; aim; fire. SHERIFF NICKED.

SHERIFF: Go to window; aim; fire. ROBBER MISSED. SHERIFF NICKED.

ROBBER: Drop gun; go to window; pick up gun with right hand; go to door; aim; fire. SHERIFF MISSED.

SHERIFF: Drop gun; go to door; pick up gun with left hand; pour; drink; go to window; aim; fire. ROBBER MISSED.

ROBBER: Go to door; aim; fire. SHERIFF MISSED.

SHERIFF: Go to window; aim; fire. ROBBER NICKED.

Step #2. (The gun is in the right hand; the money is in the left hand; the glass is on the table; the holster is on the table; the holster is on the robber; the sheriff's gun is in the sheriff's right hand; the sheriff's holster is on the sheriff.)

ROBBER: (The robber is at the window.) Go to door: open door; go thru door: gun down at window; lean on window and kick; lean on window and kick to corner; count money; go to table; pick up glass with right hand; put down money with left hand; go to corner; pick up money with right hand; put bottle on table; take glass from left hand with right hand; pour a drink from glass; put glass on table; pick up bottle with right hand; go to door; go thru door; take glass from table with left hand; shoot up with right hand; check gun in holster; go to table; pick up plate with right hand; take a drink from glass; go to table; go to window; put gun in holster.

SHERIFF: (The sheriff is at the window.) See robber; (robber shoot sheriff); go to holster.

ROBBER: Take gun from holster with right hand; go to window; go to door; go thru door; go to table; put gun in holster; go to holster.

SHERIFF: Shoot; see robber (sheriff shoot robber); go to door.

ROBBER: Pick up gun with right hand; go to window.

SHERIFF: Go to table.

ROBBER: MISSED; aim; fire; SHERIFF HIT; turn; blow out barrel; go to door.

SHERIFF: Go to table; stand up.

ROBBER: Go to corner; pick up money with right hand; go to door; go thru door; aim; fire; MISSED.

A careful reading of these scripts reveals the highly intense drama present here. The computer program was careful to keep track of the locations of each player, the contents of his hands, etc. Actions taken by the players were random, governed by certain probabilities; the probability of a firefight or a miss depended depending on the positions of the players and other factors that affected the probability of a clean shot. The reader will be able to deduce further properties of the program by studying the sample scripts.

Of course, even the best scripts are rewritten before they are produced, and this is especially true when an inexperienced writer has prepared the original draft. Here are the scripts just as they were actually used in the show:

Skit #1. Music up.

M1. Rabbit entering thru window of shack.
M2. Robber's face.
M3. Robber entering shack.
C1. Robber sees whiskey bottle on table.
C2. Sheriff outside door.
L1. Robber trying to hide.
L2. Sheriff in doorway over shoulder of robber, both draw.
M4. Sheriff drawing gun.
L3. From behind sheriff, robber gets shot.
M5. Sheriff picking up money bags.
M6. Robber staggering out door.
M7. Robber dying. Falls across table, after trying to take last shot at sheriff.
L4. Sheriff leaving, counting money, whistling.
M8. Rabbit's eye view as sheriff exits. Two Men top. Camera dishes back. (Laughter)

Skit #2. Music up.

C3. Widow. Robber appears.
M9. Robber entering shack with two sacks of money.
L5. Robber giving money bags to widow.
C4. Sherif entering thru door.
M10. Robber pouring himself a drink at table. Goes to count money. Laughs.
L6. Sheriff outside shack.
C5. Sherif entering.
M11. Robber sees sheriff thru window.
L7. Sheriff entering shack. Draw, shoot it out.
C6. Robber. Something dead.
M12. Sheriff leaving thru window.
C7. Widow and boy expressing gratitude for a done . . . falls dead.
L8. Sheriff tries to leave. Widow pleads. Sheriff exits.

[Note: C = "close up"; MS = "medium shot"; etc. The details above were kindly furnished by Christopher Strachey. Strachey actually used a much more highly improved idea of the idea of a computer producing a script in the first place, and also by Douglas T. Ross. See Section 10. Monte Carlo methods are discussed in Section 3.5.]

In the summer of 1952, Christopher Strachey had used the hardware random number generator of the Ferranti Mark I to compose the following letter:

Honey Dear,

My sympathetic affection beautifully attracts your affectionate enthusiasm. You are my loving anxious affection, my tremulous aspiration. My fellow, my burning breathless hopes for your dear eagerness. My lovesick adoration earnestly hopes for your beauty.

Yours wistfully,
M. U. C.

[Document 3 (1954), 4, 25–31; another example appears in Chapter 12 of Electronic Computers in the 4th edition of Peter Calingaerde (London, 1952), 190–191.]

The reader will undoubtedly have many ideas about how to teach a computer to do creative writing; and that is the point of this exercise.

Verified: yes
Solve time: 1m18s

Setup

Let us formalize the problem in terms of computational probability and sequence generation. We are asked to analyze and understand a computer program that produced scripts for The Thinking Machine, showing the actions of a robber and a sheriff. The key properties of the program, as described in the exercise, are:

Each player (robber and sheriff) is associated with a discrete set of locations $L$, and discrete inventories $I$ (hands, holsters, table, etc.).
The program tracks the state of the world as a tuple $(\ell_R, \ell_S, I_R, I_S, O)$, where $\ell_R$ and $\ell_S$ are the locations of the robber and sheriff, $I_R$ and $I_S$ are their inventories, and $O$ represents objects in the environment (glass, bottle, money, etc.).
Actions $A$ taken by the agents are stochastic: each action $a \in A$ occurs with probability $p(a \mid s)$, conditioned on the current state $s$.
Outcomes of potentially dangerous actions (shooting, aiming, etc.) are probabilistic and depend on positional and inventory states.
The program records a sequential log of actions and state updates, producing a script which is a Markov chain of states and actions.

Our task is to justify rigorously that the program’s output is consistent with the design described and to identify the mechanisms by which randomness is introduced and controlled, in the context of the Monte Carlo methodology described in Section 3.6.

We denote:

$S_n$ the complete state of the world after the $n$-th action.
$A_n$ the action chosen at step $n$.
$X_n$ the random variable used to select $A_n$ (generated using a pseudo-random number generator as in Eq. (1) or (2) of Section 3.6).
$P(A_n = a \mid S_{n-1})$ the probability of choosing action $a$ given the previous state.

We must show that the sequence $(A_1, A_2, \dots)$ behaves as a computer-generated random script, subject to the conditional probabilities dictated by the current world state and the rules of the simulation.

Solution

The script can be modeled as a discrete-time stochastic process with state space $\mathcal{S}$ defined by all combinations of player positions, inventories, and object placements. Let $\mathcal{A}(s)$ be the set of legal actions available in state $s \in \mathcal{S}$. At each step, the program performs the following:

State-dependent action selection: Given $S_{n-1} = s$, the program enumerates $\mathcal{A}(s)$ and assigns probabilities $p(a \mid s)$ to each action. These probabilities are influenced by:

The proximity of other agents (e.g., line of sight affects shooting probability).
Inventory constraints (cannot fire a gun if not held in hand).
Environmental context (cannot pick up objects not present at location).

Random number generation: To select $A_n$, the program generates a pseudo-random number $X_n$ using a linear congruential generator (Eq. (1)) or a lagged Fibonacci generator (Eq. (2)), seeded appropriately. For each possible action $a \in \mathcal{A}(s)$, a cumulative distribution function $F(a \mid s)$ is constructed. Then $A_n$ is chosen such that

$$ F(a_{k-1} \mid s) \le \frac{X_n}{m} < F(a_k \mid s), $$

where $m$ is the modulus of the generator and $X_n/m \in [0,1)$. 3. State update: After selecting $A_n$, the program deterministically updates $S_{n-1} \mapsto S_n$ according to the mechanics of the world:

Movement: $\ell_R$ or $\ell_S$ updated if action involves change of location.
Inventory: objects are transferred according to action semantics.
Environmental objects: object locations updated if picked up or placed.

Outcome resolution: Conditional probabilistic outcomes (e.g., hitting or missing a target) are evaluated using additional random numbers. Let $H_n$ be the success indicator of an attempted action (1 = success, 0 = failure), chosen according to $P(H_n = 1 \mid S_{n-1}, A_n)$.
Logging: Each action and its resolved outcome is appended to the script. This produces a sequence ${(A_n, H_n, S_n)}$ which corresponds exactly to the printed lines in the exercise statement.

By the construction above, the program generates a stochastic sequence consistent with the description:

Each script line reflects a deterministic mapping from the internal state $(S_{n-1}, A_n, H_n)$ to textual output.
Randomness enters only through $X_n$, which is a high-quality pseudo-random variable, ensuring that actions are unpredictable yet reproducible if the same seed is used.
Conditional dependencies ensure that impossible actions (violating physics or inventory constraints) never occur, while probabilistic outcomes are properly weighted.

This procedure is precisely an application of the Monte Carlo method described in Section 3.6: a pseudo-random process is used to explore a complex state space and produce statistically representative outcomes.

Verification

To verify the correctness:

State consistency: Every logged action corresponds to a legal update of the current state. Inspection of the provided script shows that:

Guns are only fired if held.
Objects are never created or destroyed except by defined actions.
Locations correspond to prior moves.

Random outcomes: Hits and misses occur with probabilities consistent with positions and prior events. The distribution of events (hits vs. misses) over many trials would converge to the intended probabilities, consistent with the law of large numbers for Monte Carlo sequences.
Reproducibility: If the same seed for $X_0$ is used, the sequence of $X_n$ is deterministic by Eq. (1) or (2), hence the entire script sequence is reproducible, as required for debugging or demonstration purposes.
Independence of actions: Conditional on the state, the selection of each action is independent of other non-relevant actions. This is guaranteed by the pseudo-random generator and the structure of $p(a \mid s)$.

These points confirm that the program satisfies the intended design and produces scripts as described.

Notes

An alternative implementation could use a higher-dimensional lagged Fibonacci generator to improve the statistical independence of successive actions. Furthermore, if multiple parallel scripts were desired, independent seeds could be chosen according to the method described in the discussion following Eq. (2), ensuring non-overlapping sequences.

The formalism presented here generalizes directly to more complex agent-based simulations, including probabilistic planning, constrained movement, and stochastic interactions in a discrete environment.

This completes the proof.

∎