BibTeX entries for my publications

All of these papers were written jointly with Heinz Mühlenbein.

Mathematical Analysis of Evolutionary Algorithms, Heinz Mühlenbein, Thilo Mahnig

We analyze evolutionary algorithms which use a population of search points
at each step. Genetic algorithms are a popular example. We approximate
genetic algorithms by an algorithm using *Univariate Marginal
Distributions* (UMDA). UMDA generates the search
points from a probability distribution. We derive difference equations for
the marginal distributions which completely describe the dynamic behavior of
UMDA. The key result is that UMDA transforms the discrete optimization
problem into a continuous one. The continuous optimization problem is solved
by gradient ascent. We show that UMDA solves many difficult multi-modal
optimization problems. UMDA is extended to the *Factorized Distribution
Algorithm* (FDA) which uses a general factorization of the search
distribution into conditional and marginal distributions. We prove that FDA
with Boltzmann selection converges to the set of global optima. A new
adaptive selection schedule for Boltzmann selection is derived. FDA is
extended to the algorithm LFDA, which computes the factorization from
promising search points. LFDA is based on learning Bayesian networks.
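The UMDA loop described above can be sketched in a few lines. This is an illustration only: the function names, the population size, and the use of truncation selection are our illustrative choices, not the exact setup analyzed in the paper.

```python
import numpy as np

def onemax(x):
    """Count of ones per row -- a standard test function."""
    return x.sum(axis=1)

def umda(f, n=20, pop_size=200, trunc=0.5, generations=50, seed=0):
    """Minimal UMDA sketch: sample from univariate marginals,
    select the best points, re-estimate the marginals."""
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)                                # univariate marginals
    for _ in range(generations):
        pop = (rng.random((pop_size, n)) < p).astype(int)
        fit = f(pop)
        keep = np.argsort(fit)[-int(trunc * pop_size):]  # truncation selection
        p = pop[keep].mean(axis=0)                     # new marginals
    return p

p = umda(onemax)
print(p.round(2))   # marginals drift toward the optimum (all ones) on OneMax
```

The marginal vector `p` is exactly the quantity whose difference equations are analyzed in the paper; the loop above iterates those dynamics empirically.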

Optimal Mutation Rate Using Bayesian Priors for Estimation of Distribution Algorithms, Thilo Mahnig, Heinz Mühlenbein

UMDA (the Univariate Marginal Distribution Algorithm) was derived by analyzing the mathematical principles behind recombination. Mutation, however, was not considered. The same is true for FDA (the Factorized Distribution Algorithm), an extension of UMDA which can cover dependencies between variables. In this paper mutation is introduced into these algorithms by a technique called a Bayesian prior. We theoretically derive an estimate of how to choose the Bayesian prior. The recommended Bayesian prior turns out to be a good choice in a number of experiments. These experiments also indicate that mutation in many cases increases the performance of the algorithms and decreases their dependence on a good choice of the population size.
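The mechanism can be illustrated as a pseudocount correction on the marginal frequencies. The value r = 1.0 below (a Laplace-style correction) is a placeholder chosen for illustration, not the recommended prior derived in the paper.

```python
import numpy as np

def estimate_marginals(selected, r=1.0):
    """Estimate univariate marginals with a Bayesian (pseudocount) prior.

    Adding r 'virtual' ones and r 'virtual' zeros per position keeps every
    probability strictly inside (0, 1), so no allele is ever lost -- an
    effect resembling mutation. r=1.0 is an illustrative choice, not the
    estimate derived in the paper.
    """
    m = selected.shape[0]
    ones = selected.sum(axis=0)
    return (ones + r) / (m + 2 * r)

sel = np.array([[1, 1, 0],
                [1, 0, 0],
                [1, 1, 0]])
print(estimate_marginals(sel))   # no entry is exactly 0 or 1
```

Without the prior, the third position would be estimated as exactly 0 and could never be sampled again; the pseudocount keeps it reachable.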

(c) Springer-Verlag, LNCS

Evolutionary Computation and Beyond, Heinz Mühlenbein, Thilo Mahnig

This is an overview article of the theoretical results derived in the Real World Computing Partnership.

Evolutionary Synthesis of Bayesian Networks for Optimization, Heinz Mühlenbein, Thilo Mahnig

We briefly review our theoretical analysis of genetic algorithms and provide some new results. The theory has led to the design of three different algorithms, all based on probability distributions instead of recombination of strings. In order to be numerically tractable, the probability distribution has to be factored into a small number of factors, each of which should depend on only a small number of variables. For certain applications the factorization can be determined explicitly; in general it has to be determined from the search points used for the optimization. Computing the factorization from the data leads to learning Bayesian networks. The problem of finding a minimal structure which explains the data is discussed in detail. It is shown that the Bayesian Information Criterion is a good score for this problem. The algorithms are extended to probabilistic prototype trees used for synthesizing programs.

A new adaptive Boltzmann selection schedule SDS, Thilo Mahnig, Heinz Mühlenbein

FDA (the Factorized Distribution Algorithm) is an evolutionary algorithm that combines mutation and recombination by using a distribution. The distribution is estimated from a set of selected points and then used to generate new points for the next generation. In general, a distribution defined for n binary variables has 2^n parameters and is therefore too expensive to compute. For additively decomposed discrete functions (ADFs) there exists an algorithm that factors the distribution into conditional and marginal distributions, each of which can be computed in polynomial time. Previously, we have shown a convergence theorem for FDA, but it is only valid for Boltzmann selection. Boltzmann selection has not been used in practice because a good annealing schedule was lacking. Using a Taylor expansion of the average fitness of the Boltzmann distribution, we have developed an adaptive annealing schedule, introduced in this work, called SDS (standard deviation schedule): the inverse temperature beta is changed in inverse proportion to the standard deviation.
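The final sentence can be sketched as a single update step. The constant c below is an illustrative free parameter, not the value fixed by the Taylor-expansion argument in the paper.

```python
import math

def sds_update(beta, fitness_values, c=1.0):
    """One step of an SDS-style schedule (illustrative constant c):
    raise the inverse temperature by an amount inversely proportional
    to the fitness standard deviation of the current population."""
    m = len(fitness_values)
    mean = sum(fitness_values) / m
    var = sum((f - mean) ** 2 for f in fitness_values) / m
    sigma = math.sqrt(var)
    return beta + c / sigma if sigma > 0 else beta

print(sds_update(0.0, [1.0, 2.0, 3.0, 4.0]))
```

When the population fitness is spread out (large sigma), beta grows slowly and selection stays gentle; as the population converges (small sigma), beta grows quickly and selection sharpens.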

Mathematical Analysis of Evolutionary Algorithms for Optimization, Heinz Mühlenbein, Thilo Mahnig

Simulating evolution as seen in nature has been identified as one of the key computing paradigms for the new decade. Today evolutionary algorithms have been successfully used in a number of applications. These include discrete and continuous optimization problems, synthesis of neural networks, synthesis of computer programs from examples (also called genetic programming) and even evolvable hardware. But in all application areas problems have been encountered where evolutionary algorithms performed badly. In this survey we concentrate on the analysis of evolutionary algorithms for optimization. We present a mathematical theory based on probability distributions. It gives the reasons why evolutionary algorithms can solve many difficult multi-modal functions and why they fail on seemingly simple ones. The theory also leads to new sophisticated algorithms for which convergence is shown.

Comparing the adaptive Boltzmann selection schedule SDS to truncation selection, Thilo Mahnig, Heinz Mühlenbein

FDA (the Factorized Distribution Algorithm) is an evolutionary algorithm that combines mutation and recombination by using a distribution. The distribution is estimated from a set of selected points and then used to generate new points for the next generation. FDA uses a factorization to be able to compute the distribution in polynomial time. Previously, we have shown a convergence theorem for FDA, but it is only valid for Boltzmann selection. Boltzmann selection has not been used in practice because a good annealing schedule was lacking. Using a Taylor expansion of the average fitness of the Boltzmann distribution, we have developed an adaptive annealing schedule called SDS: the inverse temperature beta is changed in inverse proportion to the standard deviation. In this work, we compare the resulting scheme to truncation selection both theoretically and experimentally with a series of test functions. We find that it behaves similarly in terms of complexity, robustness and efficiency.

Wright's Equation and Evolutionary Computation, Heinz Mühlenbein, Thilo Mahnig

In this paper Wright's equation, formulated in 1931, is
proven and applied to evolutionary computation. Wright's equation shows
that evolution performs gradient ascent in the landscape defined by the
average fitness of the population. The average fitness W is defined in
terms of marginal gene frequencies p_{i}. Wright's equation is only
approximately valid in population genetics, but it exactly describes
the behavior of our Univariate Marginal Distribution Algorithm (UMDA).
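For reference, Wright's equation in a frequently quoted form (the factor 2 in the denominator comes from the diploid population-genetics setting; the exact constant depends on the model assumed):

```latex
\Delta p_i \;=\; \frac{p_i\,(1-p_i)}{2\,\overline{W}}\;
\frac{\partial \overline{W}}{\partial p_i},
\qquad \overline{W} = \overline{W}(p_1,\dots,p_n)
```

The factor p_i(1-p_i) vanishes at the boundary of the simplex, which is why the gene frequencies move by gradient ascent in the interior and come to rest only at fixation or at stationary points of the average fitness.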

Application of Wright's Equation in Evolutionary Computation, Heinz Mühlenbein, Thilo Mahnig

In this paper we apply Wright's equation to a specific fitness function defined by Wright himself. We introduce mutation into Wright's equation and into our Univariate Marginal Distribution Algorithm (UMDA). We show that mutation moves the stable attractors from the boundary into the interior. We compute optimal mutation rates for proportionate selection and truncation selection.

Evolutionary Algorithms: From Recombination to Search Distributions, Heinz Mühlenbein, Thilo Mahnig

First we show that all genetic algorithms can be
approximated by an algorithm which keeps the population in linkage
equilibrium. Here the genetic population is given as a product of
univariate marginal distributions. We describe a simple algorithm
which keeps the population in linkage equilibrium. It is called the
Univariate Marginal Distribution Algorithm (UMDA). Our main result
is that UMDA transforms the discrete optimization problem into a
continuous one defined by the average fitness *W(p_{1},...,p_{n})*
as a function of the univariate marginal distributions.
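Written out for binary variables, with p_i the probability that x_i = 1, this average fitness is the expectation of f under the product distribution (a standard expression, stated here under that binary-encoding assumption):

```latex
W(p_1,\dots,p_n) \;=\; \sum_{x \in \{0,1\}^n} f(x)\,
\prod_{i=1}^{n} p_i^{\,x_i}\,(1-p_i)^{\,1-x_i}
```

Because W is a multilinear polynomial in the p_i, the discrete problem of maximizing f over bit strings becomes the continuous problem of maximizing W over the unit cube, with the optima of f sitting at corners of the cube.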

Mathematical analysis of optimization methods using search distributions, Thilo Mahnig, Heinz Mühlenbein

We show that UMDA transforms the discrete optimization problem f(x) into a continuous one defined by the average fitness W(p). For proportionate selection, UMDA performs gradient ascent in this landscape. For functions with highly correlated variables, UMDA has to be extended to the algorithm FDA, which uses more complex search distributions. FDA also transforms the discrete optimization problem into a continuous one defined by W(p), where W(p) now depends on the factorization. The differences between UMDA and FDA are discussed for a deceptive function.

Evolutionary Optimization using Graphical Models, Heinz Mühlenbein, Thilo Mahnig

We have previously shown that a genetic algorithm can be approximated by an evolutionary algorithm using the product of univariate marginal distributions of selected points as search distribution. This algorithm (UMDA) successfully optimizes difficult multi-modal optimization problems. For correlated fitness landscapes more complex factorizations of the search distribution have to be used. These factorizations are used by the Factorized Distribution Algorithm FDA. In this paper we extend FDA to an algorithm which computes a factorization from the data. The factorization can be represented by a Bayes network. The Bayes network is used to generate the search points.

The Factorized Distribution Algorithm for Additively Decomposed Functions, Heinz Mühlenbein, Thilo Mahnig

FDA - the Factorized Distribution Algorithm - is an evolutionary algorithm that combines mutation and recombination by using a distribution. First the distribution is estimated from a set of selected points. It is then used to generate new points for the next generation. In general, a distribution defined for n binary variables has 2^n parameters and is therefore too expensive to compute. For additively decomposed discrete functions (ADFs) there exists an algorithm that factors the distribution into conditional and marginal distributions, each of which can be computed in polynomial time. The scaling of FDA is investigated theoretically and numerically. The scaling depends on the ADF structure and the specific assignment of function values. Difficult functions on a chain or a tree structure are optimized in about O(n sqrt(n)) function evaluations. Standard genetic algorithms are not able to optimize these functions. FDA is not restricted to exact factorizations; it also works for approximate factorizations.
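The factorization into conditional and marginal distributions can be illustrated by sampling from a chain-structured product p(x1) p(x2|x1) p(x3|x2). The probability tables below are made-up numbers chosen only to show the mechanism, not distributions estimated by FDA.

```python
import numpy as np

# Illustrative chain factorization for 3 binary variables -- the kind of
# conditional/marginal product FDA samples from. Tables are made-up numbers.
p1 = np.array([0.3, 0.7])              # p(x1)
p2_given_1 = np.array([[0.9, 0.1],     # p(x2 | x1=0)
                       [0.2, 0.8]])    # p(x2 | x1=1)
p3_given_2 = np.array([[0.6, 0.4],     # p(x3 | x2=0)
                       [0.1, 0.9]])    # p(x3 | x2=1)

def sample_chain(rng):
    """Draw one search point by sampling each factor given its predecessor."""
    x1 = rng.choice(2, p=p1)
    x2 = rng.choice(2, p=p2_given_1[x1])
    x3 = rng.choice(2, p=p3_given_2[x2])
    return x1, x2, x3

rng = np.random.default_rng(0)
samples = [sample_chain(rng) for _ in range(5)]
print(samples)
```

Each factor is a table over at most two variables, so the chain needs only 2 + 4 + 4 entries instead of the 2^3 entries of the full joint distribution; for a chain of n variables the saving is 2^n versus O(n).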

Convergence Theory and Applications of the Factorized Distribution Algorithm, Heinz Mühlenbein, Thilo Mahnig

The paper investigates the optimization of additively decomposable functions (ADFs) by a new evolutionary algorithm called the Factorized Distribution Algorithm (FDA). FDA is based on a factorization of the distribution to generate search points. First, separable ADFs are considered. These are mapped to generalized linear functions with metavariables defined for multiple alleles. The mapping transforms FDA into a Univariate Marginal Frequency Algorithm (UMDA). For UMDA the exact equation for the response to selection is computed under the assumption of proportionate selection. For truncation selection an approximate equation for the time to convergence is used, derived from an analysis of the OneMax function. FDA is also numerically investigated for non-separable functions. The time to convergence is very similar to that for separable ADFs. FDA outperforms the genetic algorithm with recombination of strings by far.

It is shown that additively decomposed functions with disjoint defining sets can be mapped to generalized linear functions with metavariables defined for multiple alleles. The mapping transforms the Factorized Distribution Algorithm in the original space into a Univariate Marginal Frequency Algorithm (UMDA) in the space of the metavariables. For UMDA with proportionate selection the exact equation for the response to selection is computed. For generalized linear functions the response is given by the additive genetic variance divided by the average fitness of the population. This was conjectured by Fisher in the Fundamental Theorem of Natural Selection.
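In symbols, the statement in the last two sentences reads (with R the response to selection, V_A the additive genetic variance, and W-bar the average fitness of the population):

```latex
R(t) \;=\; \frac{V_A(t)}{\overline{W}(t)}
```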

Schemata, Distributions and Graphical Models in Evolutionary Optimization, Heinz Mühlenbein, Thilo Mahnig, Alberto Ochoa Rodriguez

The paper deals with the optimization of additively decomposable discrete functions, where genetic algorithms have exhibited poor performance. First the schema theory of genetic algorithms is reformulated in terms of probability theory. A schema defines the structure of a marginal distribution. A new algorithm, FDA, is presented which is based on factorizing a probability model into marginal distributions. The probability model captures the structure of the given function. The theory of conditional independence graphs is used to classify the fitness functions which can be optimized in polynomial time. For the test functions considered, the performance of FDA - in number of generations until convergence - is similar to that of the simple genetic algorithm on the OneMax function. This result is theoretically explained.

FDA - A scalable evolutionary algorithm for the optimization of additively decomposed functions, Heinz Mühlenbein, Thilo Mahnig

FDA - the Factorized Distribution Algorithm - is an evolutionary algorithm which combines mutation and recombination by using a distribution. First the distribution is estimated from a set of selected points. Then it is used to generate the new points for the next generation. In general, a distribution defined for n binary variables has 2^n parameters and is therefore too expensive to compute.

The scaling of FDA is investigated theoretically and numerically. The
scaling depends on the ADF structure and the specific assignment of function
values. Difficult functions on a chain or a tree structure are solved in
about O(n sqrt(n)) function evaluations. Standard genetic algorithms are not able
to optimize these functions. FDA is not restricted to exact factorizations.
It also works for approximate factorizations as is shown for a circle
and a grid structure.

Papers describing the research at GMD on genetic algorithms can be found here.