Query Complexity: From Theory of Evolutionary Algorithms to Mastermind

Query Complexity: From Theory of Evolutionary Algorithms to Mastermind

How much do I need to know about a problem to be able to solve it optimally? This question is not so much of an algorithmic nature, but rather pertains to the area of query complexity, which (together with communication complexity and coding theory) considers the fundamental problems of the handling of information.

However, almost every child knows a simple problem of query complexity, namely the game Mastermind. In this game, a player tries to guess the secret color code that the other players have arranged out of four pegs with six possible colors. With each guess, the first player learns the number of positions where the guess agrees with the secret code, but not which positions those are. The challenge in Mastermind is to find the secret code, using this limited information, in as few guesses as possible.

Computer science is interested in problems of query complexity for various reasons. One is that query complexity is clearly a fundamental measure of the difficulty of obtaining sufficient information about an unknown object. It is thus a central concept in theoretical computer science.

Mastermind: an everyday example from the area of query complexity

In addition to a foundationally-based interest, there are also practical grounds for the investigation of query complexities. Evolutionary algorithms and other randomized heuristics attempt to solve problems using generic methods. They have no access to an explicit problem statement; instead, they obtain information about the underlying problem only by generating and assessing possible solutions. Therefore, an evolutionary algorithm for a problem cannot be better than the corresponding query complexity.

Confidentiality issues can also lead to query complexity problems, the focus here naturally being that the unknown information cannot be discovered through the queries. An example of such a problem is the comparison of genetic data. Nobody wants their genetic information to be accessible to others. Therefore, in the comparison of two genome sequences, for example, only vague information about the similarity of the two sequences is disclosed. Just as in the Mastermind game, the question is again posed, how much of this vague information would suffice for the gene sequence to be unlocked.

Aside from fundamental results, our most recent efforts have also yielded an entertaining result, namely the current best strategy for the generalized Mastermind game with n colors and n positions. While up to now different strategies were known for guessing the code in approximately n log(n) questions, in our new strategy, approximately n log(log(n)) questions suffice. Even though this does not yet definitively solve the problem (the best lower bounds imply that one needs at least n questions), our result is nonetheless a great advance on a problem that has fascinated mathematicians and computer scientists for more than 30 years.

Benjamin Doerr

DEPT. 1 Algorithms and ComplexityPhone +49 681 9325-1004Email  doerr@mpi-inf.mpg.de