Genetic Algorithms — Core Concepts

Selection, crossover, mutation, and fitness functions — how to implement and tune genetic algorithms in Python for real optimization problems.

How a genetic algorithm works

A genetic algorithm (GA) maintains a population of candidate solutions that evolves over generations. Each generation:

Evaluate fitness — score every individual in the population
Select parents — choose individuals to reproduce, biased toward higher fitness
Crossover — combine pairs of parents to create offspring
Mutate — randomly alter some offspring
Replace — form the next generation from offspring (and possibly some parents)

Encoding solutions

Every solution must be represented as a “chromosome” — a data structure the algorithm can manipulate. Common encodings:

Encoding	Example problem	Representation
Binary string	Feature selection	`[1, 0, 1, 1, 0]`
Real-valued	Function optimization	`[3.14, -2.7, 0.5]`
Permutation	Traveling salesman	`[3, 1, 4, 2, 0]`
Tree	Symbolic regression	Expression tree

The encoding choice determines which crossover and mutation operators make sense.

Fitness function

The fitness function is the most important component — it defines what “good” means. For minimization problems, lower fitness is better. For maximization, higher is better.

import numpy as np

# Example: minimize the Rastrigin function (many local minima)
def fitness(x):
    A = 10
    n = len(x)
    return -(A * n + sum(xi**2 - A * np.cos(2 * np.pi * xi) for xi in x))
    # Negated because we select for higher fitness

Selection methods

Tournament selection

Pick k random individuals, keep the best. Simple, tunable (larger k = more selective):

def tournament_select(population, fitnesses, k=3):
    indices = np.random.choice(len(population), size=k, replace=False)
    best = indices[np.argmax(fitnesses[indices])]
    return population[best]

Roulette wheel

Probability of selection proportional to fitness. Works poorly when fitness values are close together.

Rank-based

Selection probability based on rank rather than raw fitness. More robust to fitness scaling issues.

Crossover operators

Single-point crossover

Pick a random point and swap everything after it between two parents:

def crossover(parent1, parent2):
    point = np.random.randint(1, len(parent1))
    child1 = np.concatenate([parent1[:point], parent2[point:]])
    child2 = np.concatenate([parent2[:point], parent1[point:]])
    return child1, child2

Two-point crossover

Swap a random segment between two points. Preserves more structure from each parent.

Uniform crossover

Each gene comes from either parent with 50% probability. Maximum mixing.

Order crossover (for permutations)

Preserves relative ordering — essential for problems like the traveling salesman where position matters.

Mutation

Mutation introduces randomness to maintain diversity and prevent the population from converging prematurely to a local optimum.

def mutate(individual, mutation_rate=0.01, scale=0.5):
    mask = np.random.random(len(individual)) < mutation_rate
    noise = np.random.normal(0, scale, len(individual))
    individual[mask] += noise[mask]
    return individual

Mutation rate is a balancing act: too low and the population stagnates; too high and good solutions get destroyed.

A complete example

def genetic_algorithm(fitness_fn, n_dims, pop_size=100, generations=200,
                      mutation_rate=0.05, crossover_rate=0.8):
    # Initialize random population
    population = np.random.uniform(-5, 5, (pop_size, n_dims))
    
    for gen in range(generations):
        fitnesses = np.array([fitness_fn(ind) for ind in population])
        
        # Elitism: keep best individual
        best_idx = np.argmax(fitnesses)
        next_gen = [population[best_idx].copy()]
        
        while len(next_gen) < pop_size:
            # Select parents
            p1 = tournament_select(population, fitnesses)
            p2 = tournament_select(population, fitnesses)
            
            # Crossover
            if np.random.random() < crossover_rate:
                c1, c2 = crossover(p1, p2)
            else:
                c1, c2 = p1.copy(), p2.copy()
            
            # Mutate
            c1 = mutate(c1, mutation_rate)
            c2 = mutate(c2, mutation_rate)
            
            next_gen.extend([c1, c2])
        
        population = np.array(next_gen[:pop_size])
    
    fitnesses = np.array([fitness_fn(ind) for ind in population])
    return population[np.argmax(fitnesses)]

best = genetic_algorithm(fitness, n_dims=10)

When to use genetic algorithms

GAs work well when:

The search space is large and poorly understood
The fitness landscape has many local optima
No gradient information is available
You need “good enough” solutions, not provably optimal ones
The problem has discrete or mixed discrete-continuous variables

They work poorly when:

The problem is smooth and convex (gradient descent is faster)
Evaluating fitness is very expensive (GAs need thousands of evaluations)
You need a guaranteed optimal solution

Common misconception

Many people assume genetic algorithms always find the global optimum. They do not — they are heuristics with no convergence guarantee. A GA might get stuck in a good-but-not-best region if diversity is lost too early. Techniques like niching, crowding, and restart strategies help but do not eliminate this risk.

One thing to remember: A genetic algorithm trades mathematical guarantees for flexibility — it can optimize almost anything you can score, but tuning population size, selection pressure, and mutation rate is the real skill.

pythonalgorithmsoptimizationevolutionary-computing