Friday, June 29, 2012

Solving Mastermind with R

In my last post I showed a solution to a classical sorting problem in R. This time I thought it would be nice to generate a strategy for playing Mastermind using R.
D.E. Knuth showed that a Mastermind code can always be broken in at most five guesses. The algorithm is to always choose the guess that minimizes the maximum number of possibilities that can remain after the reply. Here is the R code that implements it.

The game is played with six colors and codes of length four. The first part of the code prepares the data for the calculations:

# vector with possible peg colors
set <- letters[1:6]

# matrix of all possible codes (one code per row)
full <- as.matrix(expand.grid(set, set, set, set))

# compare guess to hidden code
# black - hit with correct position
# white - hit with incorrect position
guessFit <- function(hidden, guess) {
    # number of pegs of each color in a code
    count <- function(pattern) {
        sapply(set, function(x) { sum(x == pattern) })
    }
    black <- sum(full[hidden,] == full[guess,])
    white <- sum(pmin(count(full[hidden,]),
                      count(full[guess,]))) - black
    paste(black, white, sep=";")
}

# prepare matrix with evaluations of all possible
# guess-hidden combinations
# this is slow: 5 minutes
all.fit <- mapply(guessFit, rep(1:nrow(full), nrow(full)),
                  rep(1:nrow(full), each = nrow(full)))
dim(all.fit) <- c(nrow(full), nrow(full))

We prepare the matrix all.fit of evaluations for all possible guess-hidden code combinations in advance in order to avoid calling guessFit in the main algorithm (WARNING: it takes ~5 minutes on my laptop). Once it is generated, codes can be referenced by their position (row number) in the full matrix. Note that the evaluation is symmetric in the hidden code and the guess, so all.fit is a symmetric matrix.
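If the five-minute wait is a problem, one possible alternative is to score a single guess against all codes at once using matrix operations. The sketch below is only an illustration of that idea: scoreAll and all.fit.fast are names I made up for it, and I have not timed it against the original on the same machine.

```r
set <- letters[1:6]
full <- as.matrix(expand.grid(set, set, set, set))

# hypothetical helper: score one guess (a row number of full)
# against every code in full in a single vectorized pass
scoreAll <- function(guess) {
    g <- full[guess, ]
    # black: pegs matching in color and position
    same <- full == matrix(g, nrow(full), ncol(full), byrow = TRUE)
    black <- rowSums(same)
    # common: pegs matching in color regardless of position
    common <- 0
    for (x in set) {
        common <- common + pmin(rowSums(full == x), sum(g == x))
    }
    paste(black, common - black, sep = ";")
}

# column g holds the replies for guess g against every hidden code
all.fit.fast <- sapply(1:nrow(full), scoreAll)
```

Because the evaluation is symmetric, the resulting matrix can be indexed the same way as all.fit.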

Now let us move on to the main function:

# apply mini-max rule
minimax <- function(possible, indent = 1) {
    # if there is only one possibility we are done
    if (length(possible) == 1) {
        if (indent > worst) {
            worst <<- indent
        }
        cat(full[possible,], "| *\n")
        return(1)
    }

    if (indent == 1) {
        cat("1:    ")
    }

    # for each possible guess find worst case size of set
    splits <- sapply(1:nrow(full), function(guess) {
        max(table(all.fit[guess, possible])) })
    # choose guess that minimizes maximal size of set
    best.guess <- which.min(splits)
    # group remaining codes by the reply to the chosen guess
    # (looked up in all.fit instead of calling guessFit again)
    out.split <- split(possible, all.fit[best.guess, possible])
    cat(full[best.guess,], "|", length(possible), "\n")
    # recursively construct the decision tree
    for (i in seq_along(out.split)) {
        # skip the reply "all black": the guess was the hidden code
        if (names(out.split)[i] != paste(ncol(full), 0, sep = ";")) {
            cat(indent + 1,":", rep("    ", indent),
                names(out.split)[i], "|", sep="")
            minimax(out.split[[i]], indent + 1)
        }
    }
}

It recursively constructs the decision tree that solves the game and outputs it using cat. At each level of the tree it prints the number of the question asked, then the chosen guess, and finally either the number of remaining options or a star * indicating a hit. Below the first level, the reply (black;white) that leads to the node is printed before the guess. Additionally, the variable worst keeps the number of questions that have to be asked in the worst case.
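To make the mini-max rule more concrete, here is a small standalone sketch that evaluates two opening guesses. It does not use all.fit; score and worst_case are hypothetical helpers written only for this illustration, and they work on plain character vectors rather than row numbers.

```r
colors <- letters[1:6]
codes <- as.matrix(expand.grid(colors, colors, colors, colors))

# hypothetical helper: reply "black;white" for one hidden code and one guess
score <- function(hidden, guess) {
    black <- sum(hidden == guess)
    n.hidden <- tabulate(match(hidden, colors), nbins = length(colors))
    n.guess <- tabulate(match(guess, colors), nbins = length(colors))
    common <- sum(pmin(n.hidden, n.guess))
    paste(black, common - black, sep = ";")
}

# two pegs in place, one right color in a wrong place
score(c("a", "a", "b", "b"), c("a", "b", "b", "c"))  # "2;1"

# size of the largest group of codes sharing the same reply
worst_case <- function(guess) {
    replies <- apply(codes, 1, score, guess = guess)
    max(table(replies))
}

worst_case(c("a", "a", "a", "a"))  # 625
worst_case(c("a", "a", "b", "b"))  # 256
```

Guessing a a b b leaves at most 256 candidates, while a a a a can leave 625, so the rule prefers the former. The minimax function makes the same comparison over all rows of full at every node of the tree.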

Finally we run the prepared code:

sink("rules.txt") # save output to a file
worst <- 0
minimax(1:nrow(full)) # this is slow: 2 minutes
cat("\nQuestions in worst case:", worst, "\n")
sink()

I redirect the output to a file because the resulting tree is quite big (1710 lines). The last line of the file shows that the game can be solved in five questions in the worst case.

Finally, the code is written so that it is easy to experiment with a different number of colors and pegs: only the variables set and full need to be changed.
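For example, a smaller game could be set up as in the sketch below. The variables colors and pegs are my additions for illustration and the values 4 and 3 are arbitrary; the rest of the code (guessFit, all.fit, minimax) would then have to be rerun with the new set and full.

```r
colors <- 4   # assumed example value
pegs <- 3     # assumed example value

set <- letters[1:colors]
# expand.grid accepts a list of vectors, one per peg
full <- as.matrix(expand.grid(rep(list(set), pegs)))

dim(full)  # 64 codes of length 3
```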