Bulls and Cows (also known as MOO) is a 2-player game in which one player comes up with a secret and the other has to guess the secret. The secret consists of 4 digits from 0 to 9, where each digit is distinct. Player 2 has to guess these 4 digits, in the right order. At each guess from player 2, player 1 provides feedback as hints. The hints are two numbers: one telling how many digits from the secret player 2 got in the right order (bull), and another telling how many digits they got but in the wrong order (cow).
For example, if player 1 came up with
4271, and player 2 guessed
1234, then bull is 1 (digit 2), and cow is 2 (1 and 4 are in the secret, but not in the order guessed by player 2).
The goal of the game is for player 2 to guess the secret with the least amount of guesses.
In this post we’ll present computational experiments to solve a game of Bulls and Cows optimally.
We’ll focus on a search strategy to minimize the maximum number of guesses one has to make, for any possible secret. Alternatively, John Francis’s paper  proposes heuristics that minimize the expected number of guesses, that is, for the worst case these heuristics might not be optimal, but they perform better on average.
Because the game has turns, our solution is not a single list of guesses, but rather a decision tree where at each node we branch depending on the hint we get back. The metric we are trying to optimize is then the height of such tree.
To break ties between trees of the same height, we consider the smallest tree (the one with least nodes).
Our search algorithm is recursive. At any given recursion level, we are given a set containing all possible numbers that could be secrets. We’ll try all combinations of guesses and hints and see which ones yield the best solution.
When we simulate a given guess and hint, we are restricting the possible numbers that can still be secret, so in the next level of recursion, the set of potential secrets will be smaller.
For example, say that the possible secrets are any 4-digit number with no repeated digits. We then use one of them as a guess, say,
[0, 1, 2, 3]. One possible hint we could receive is (3, 0). What are the possible secrets that would cause us to receive this hint?
[0, 1, 2, 4],
[0, 1, 2, 5] and
[7, 1, 2, 3] are a few of those. If we recurse for this guess and hint, the set of secrets in the next level will be restricted to those that would return a hint (3, 0) for
[0, 1, 2, 3].
Continuing our example, is
[0, 1, 2, 3] the best guess we can make at that point? After recursing for all possible hints, we’ll have a decision tree rooted on
[0, 1, 2, 3]. The key idea of the search is that we can minimize the height of the final decision tree by minimizing the subtrees at each recursion level (greedy strategy). Thus, we want to find the guess with the shortest subtree.
In pseudo-code code, the search looks like this:
We start with all possible secrets (4-digit number with no repeated digits) and the tree returned should be optimal in minimizing the number of guesses for the worst case.
I initially implemented the search in Python, but it was taking too long, even when artificially restricting the branching. I re-implemented it in Rust and saw some 20x speedups (caveat: I haven’t really tried to optimize the Python version).
The main difference between the pseudo-code and the actual implementation is that we pre-group
possible_secrets by their hint for guess, which is more efficient than scanning
possible_secrets for all possible hints:
group_possibilities_by_score() above makes use of
compute_score and it also uses a fixed-length array for performance. The set of hints is proportional to the squared size N of the guess, in our case
Turns out that the Rust implementation is still not efficient enough, so we’ll need further optimizations.
Optimization – classes of equivalence
What is a good first guess? It doesn’t matter! We don’t have any prior information about the secret and every valid number is equality probable. For example, if we guess
[0, 1, 2, 3] or
[5, 1, 8, 7], the height of the decision tree will be the same. An intuitive way to see why this is the case is that we could relabel the digits such that
[5, 1, 8, 7] would map to
[0, 1, 2, 3].
Francis  generalizes this idea for other cases. Say that at a given point we made guesses covering the digits 0, 6, 7, 8 and 9 at least once. Now say we our next guess is
[0, 8, 9, 3]. In here, 3 is the only digit we haven’t tried yet, but using the re-labeling argument, we can see that
[0, 8, 9, 1] would yield the same decision tree if we were to swap the labels of 1 and 3. This allow us to skip guesses that belong to the same class, which reduces the branch factor.
We can generate an ID representing a given class. A way to do this is by adding one to each digit we have tried before and making any digit we haven’t as 0, then converting that number from base 11 to base 10. For example,
[0, 8, 9, 1] becomes
[1, 9, 10, 0]. If this was a number in base 11, in base 10 it is 2530 (
(((1*11) + 9)*11 + 10)*11 + 0). If we do the same with
[0, 8, 9, 3], we’ll get the same number. The code below implements this idea.
In our case,
D = 10 and we store the set of visited digits in a bitset,
visited_bits (that is, bit
i is 1 if digit
i has been visited.
On the search side, we keep a set of classes ids already visited and skip a guess if its class is already in there.
With this optimization the search algorithm runs in under 2 minutes. By inspecting the height of the resulting tree we conclude that the minimum number of guesses necessary for any secret is 7.
The complete code is available on Github.
The JSON output by the search algorithm is quite big (the smallest tree with height 7 has almost 7000 nodes). A more interesting way to inspect the data is to create a Bulls and Cows solver. We feed the JSON to this application and ask the user to select the outcome based on the secret they have in mind. We are basically traversing the edges of the decision tree.
I learned about this game pretty recently and was curious to learn of good strategies to solve this problem. From what I’ve been reading, the heuristics that yield good solutions are very complicated for a human to perform. This leads to an question: is there are any solution which is simple to follow but that is reasonably good?
One natural extension for this problem is to use larger numbers of digits (
N > 4) and have each element be sourced from 0 to
D – 1. An exhaustive search might be prohibitive, but maybe we can come up with heuristics with constant guarantees. What is a lower bound for the number of guesses for variants with arbitrary N and D?
I struggled to implement this code in Rust, but I found the challenge worthwhile. I learned a bit more about its specifics, especially regarding memory management and data types.
In , the author mentions that with optimizations the search took 45 minutes to run on their laptop (2GHz Intel Core 2 Duo). I ran mine on a Intel i7 2.2GHz and was surprised by the running time of 2 minutes. CPUs are not getting exponentially faster these days and my code runs on a single thread.