In my previous post I tried to create a genetic algorithm (GA). The idea was that the GA would slap together some numbers and symbols to try to hit a target number. For example, some conceivable expressions would be 5 + 9
, 5 * 2
or even something halfway nonsensical like - - 3
.
I was able to write a program that would generate these random expressions and see if their results hit my target number. For example, if my target number was 15, an expression of 3 * 5
would be recognized as a success and my program would stop executing. This was mildly amusing but it was not yet a genetic algorithm. I hadn’t yet fully understood what my GA was supposed to be.
The crucial thing my GA was missing was some form of evolution via sexual reproduction and/or mutation. What I needed was a program that would create a generation of chromosomes, then let that generation create a new generation of fitter offspring, then let that new generation create yet another generation of even fitter offspring, and so on. In this particular case, fitness would be determined by how close each chromosome got to the target number.
After some fiddling I arrived at the following bit of code.
class Chromosome
GENE_MAP = {
'0000' => '0',
'0001' => '1',
'0010' => '2',
'0011' => '3',
'0100' => '4',
'0101' => '5',
'0110' => '6',
'0111' => '7',
'1000' => '8',
'1001' => '9',
'1010' => '+',
'1011' => '-',
'1100' => '*',
'1101' => '/',
}
GENE_LENGTH = 4
GENE_COUNT = 8
TARGET_NUMBER = 100
CHROMOSOME_LENGTH = GENE_COUNT * GENE_LENGTH
attr_accessor :encoding
def initialize(encoding = nil)
if encoding
@encoding = encoding
else
@encoding = CHROMOSOME_LENGTH.times.collect { rand(2) }.join
end
end
def expression
pattern = /.{#{GENE_LENGTH}}/
self.encoding.scan(pattern).collect { |gene|
GENE_MAP[gene]
}.join(' ')
end
def result
eval_expression(self.expression)
end
def fitness_score
return unless self.result
1 / ((TARGET_NUMBER - self.result).abs + 1).to_f
end
def first_part_of_encoding(grab_length)
@encoding[0..(grab_length - 1)]
end
def second_part_of_encoding(grab_length)
@encoding[grab_length..(CHROMOSOME_LENGTH - 1)]
end
def mate_with(chromosome)
grab_length = rand(1..(GENE_COUNT - 1)) * GENE_LENGTH
self.class.new(
first_part_of_encoding(grab_length) +
chromosome.second_part_of_encoding(grab_length)
)
end
end
class Generation
GENERATION_SIZE = 30
attr_accessor :chromosomes
def initialize(chromosomes = [])
number_of_attempts = 0
@chromosomes = chromosomes
if chromosomes.any?
return
end
while number_of_attempts <= GENERATION_SIZE
chromosome = Chromosome.new
if chromosome.result
number_of_attempts += 1
@chromosomes << chromosome
end
end
@chromosomes.sort_by!(&:fitness_score).reverse!
end
def fittest
@chromosomes.first
end
def reproduce
chromosomes = @chromosomes.drop(1).collect { |chromosome|
[fittest.mate_with(chromosome), chromosome.mate_with(fittest)]
}.flatten
.select { |c| c.result != nil }
.sort_by!(&:fitness_score)
.reverse![0..(GENERATION_SIZE - 1)]
Generation.new(chromosomes)
end
end
def print_generation(generation)
generation.chromosomes.each do |chromosome|
print "#{chromosome.encoding} | "
print "#{chromosome.expression.ljust(20, ' ')}| "
print "#{chromosome.result.to_s.rjust(4, ' ')} | "
print "#{chromosome.fitness_score.to_s.rjust(4, ' ')}"
puts
end
nil
end
alias :pg :print_generation
def eval_expression(expression)
begin
value = eval(expression)
if value.is_a? Numeric
value
else
nil
end
rescue SyntaxError, NoMethodError, ZeroDivisionError, TypeError
nil
end
end
Here’s the first generation of chromosomes my program created.
10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001100100010110011110101011111 | 8 * 8 - 3 / 5 | 64 | 0.02702702702702703 10101110010111101110110011110111 | + 5 * 7 | 35 | 0.015151515151515152 01001100001011111100101110110011 | 4 * 2 * - - 3 | 24 | 0.012987012987012988 01101110110000101111101010011111 | 6 * 2 + 9 | 21 | 0.0125 01011010101110111001101110110110 | 5 + - - 9 - - 6 | 20 | 0.012345679012345678 10001010010110111011011110110011 | 8 + 5 - - 7 - 3 | 17 | 0.011904761904761904 11101111101011101010100010101000 | + + 8 + 8 | 16 | 0.011764705882352941 00111101101001111010010010101001 | 3 / + 7 + 4 + 9 | 13 | 0.011363636363636364 01101110111011111111110010100010 | 6 * + 2 | 12 | 0.011235955056179775 10011010010110110101111110100011 | 9 + 5 - 5 + 3 | 12 | 0.011235955056179775 01111110101000011111111011000100 | 7 + 1 * 4 | 11 | 0.011111111111111112 01111011111010101010011010101001 | 7 - + + 6 + 9 | 10 | 0.01098901098901099 10001010111000001100111110011110 | 8 + 0 * 9 | 8 | 0.010752688172043012 01111010111000011110111011010100 | 7 + 1 / 4 | 7 | 0.010638297872340425 10001111101100011010001010110100 | 8 - 1 + 2 - 4 | 5 | 0.010416666666666666 00111010111111100111111011000000 | 3 + 7 * 0 | 3 | 0.01020408163265306 10111010001011010111101011110010 | - + 2 / 7 + 2 | 1 | 0.01 11111010000011011011001111010111 | + 0 / - 3 / 7 | 0 | 0.009900990099009901 11110001110101001101010011010110 | 1 / 4 / 4 / 6 | 0 | 0.009900990099009901 00111111111111010101110110110110 | 3 / 5 / - 6 | 0 | 0.009900990099009901 01101111110111101010111110001111 | 6 / + 8 | 0 | 0.009900990099009901 10110001110010110001111110110010 | - 1 * - 1 - 2 | -1 | 0.00980392156862745 00111010010010101011100110100000 | 3 + 4 + - 9 + 0 | -2 | 0.009708737864077669 00111101111000101101010010110100 | 3 / 2 / 4 - 4 | -4 | 0.009523809523809525 10100001101011100111110010110001 | + 1 + 7 * - 1 | -6 | 0.009345794392523364 10001101011010111010111010011110 | 8 / 6 - + 9 | -8 | 0.009174311926605505 01011100000111001001110110110011 | 5 * 1 * 9 / - 3 | -15 | 0.008620689655172414 01001010010010110100111011001001 | 4 + 4 - 4 * 9 | -28 | 0.007751937984496124 01101101001111101011011011000111 | 6 / 3 - 6 * 7 | -40 | 0.0070921985815602835 11111001110010110110111110110010 | 9 * - 6 - 2 | -56 | 0.006369426751592357
The columns are 1) the chromosome encoding, 2) the expression, 3) the expression’s result, and 4) the chromosome’s fitness score. The fitness score is determined by how close the result is to the target number. Since my target number in this case is 100 and the top chromosome spit out 68, the score in this case is 1 / ((100 – 68) + 1) = 0.0303. (You can see that a result of 100 would be 1 / ((100 – 100) + 1) = 1)
The way the next generation is created is to take the fittest chromosome (8 / 1 * 8 + 4
) and let it bang all the other chromosomes to produce offspring. Here’s the result from that:
10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010101000 | 8 / 1 * 8 + + 8 | 72 | 0.034482758620689655 10001101000111001000101011100111 | 8 / 1 * 8 + 7 | 71 | 0.03333333333333333 10001101000111001000101011100111 | 8 / 1 * 8 + 7 | 71 | 0.03333333333333333 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100010 | 8 / 1 * 8 + 2 | 66 | 0.02857142857142857 10001100100010110011101011100100 | 8 * 8 - 3 + 4 | 65 | 0.027777777777777776 10001101000111001000101010110011 | 8 / 1 * 8 + - 3 | 61 | 0.025 10001101000111001000101010110011 | 8 / 1 * 8 + - 3 | 61 | 0.025 10001101000111001010011010101001 | 8 / 1 * + 6 + 9 | 57 | 0.022727272727272728 01101101000111001000101011100100 | 6 / 1 * 8 + 4 | 52 | 0.02040816326530612 01101101000111001000101011100100 | 6 / 1 * 8 + 4 | 52 | 0.02040816326530612 01011100000111001001101011100100 | 5 * 1 * 9 + 4 | 49 | 0.019230769230769232 00111010010011001000101011100100 | 3 + 4 * 8 + 4 | 39 | 0.016129032258064516 00111010111111100111111011000100 | 3 + 7 * 4 | 31 | 0.014285714285714285 10100001101011100111110011100100 | + 1 + 7 * 4 | 29 | 0.013888888888888888 00111101000111001000101011100100 | 3 / 1 * 8 + 4 | 28 | 0.0136986301369863 00111111111111001000101011100100 | 3 * 8 + 4 | 28 | 0.0136986301369863 10001010010110111011011110110011 | 8 + 5 - - 7 - 3 | 17 | 0.011904761904761904 10001010010110111011011110110100 | 8 + 5 - - 7 - 4 | 16 | 0.011764705882352941 10001101000111111111110010100010 | 8 / 1 * + 2 | 16 | 0.011764705882352941 10001101101001111010010010101001 | 8 / + 7 + 4 + 9 | 14 | 0.011494252873563218 10011010010110110101101011100100 | 9 + 5 - 5 + 4 | 13 | 0.011363636363636364 10001110101000011111111011000100 | 8 + 1 * 4 | 12 | 0.011235955056179775 11101111000111001000101011100100 | 1 * 8 + 4 | 12 | 0.011235955056179775 01011010101110111001101111100100 | 5 + - - 9 - 4 | 10 | 0.01098901098901099 10101110010111101110101011100100 | + 5 + 4 | 9 | 0.010869565217391304 10001101000110110011110101011111 | 8 / 1 - 3 / 5 | 8 | 0.010752688172043012 01111010111000011110111011010100 | 7 + 1 / 4 | 7 | 0.010638297872340425
As you can see, the average fitness has gone up. The fittest member of this generation is also fitter than the fittest member of the previous generation. Let’s run it again.
10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010101000 | 8 / 1 * 8 + + 8 | 72 | 0.034482758620689655 10001101000111001000101011100111 | 8 / 1 * 8 + 7 | 71 | 0.03333333333333333 10001101000111001000101011100111 | 8 / 1 * 8 + 7 | 71 | 0.03333333333333333 10001100100010110011101010011110 | 8 * 8 - 3 + 9 | 70 | 0.03225806451612903 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100100 | 8 / 1 * 8 + 4 | 68 | 0.030303030303030304 10001101000111001000101011100010 | 8 / 1 * 8 + 2 | 66 | 0.02857142857142857 10001101000111001000101010110011 | 8 / 1 * 8 + - 3 | 61 | 0.025 10001101000111001000101010110100 | 8 / 1 * 8 + - 4 | 60 | 0.024390243902439025 10001101000111001000101111100100 | 8 / 1 * 8 - 4 | 60 | 0.024390243902439025 01101101000111001000101010011110 | 6 / 1 * 8 + 9 | 57 | 0.022727272727272728 10001101000111001010011010101001 | 8 / 1 * + 6 + 9 | 57 | 0.022727272727272728 10101110010111001000101010011110 | + 5 * 8 + 9 | 49 | 0.019230769230769232 01011100000111001000101010011110 | 5 * 1 * 8 + 9 | 49 | 0.019230769230769232 01011101000111001000101010011110 | 5 / 1 * 8 + 9 | 49 | 0.019230769230769232
This time the fittest member is no fitter than the fittest member of the previous generation but the average fitness has gone up. If we run the code again, all the chromosomes end up with equal fitness.
10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001100000111001000101010011110 | 8 * 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571 10001101000111001000101010011110 | 8 / 1 * 8 + 9 | 73 | 0.03571428571428571
At this point no more evolution is possible without mutation, and since my program doesn’t involve mutation, no more evolution is possible with my program. So 73 is as close to 100 as we’re going to get on this try.
What Next?
I originally embarked upon this exercise because I had been vaguely curious for years about how GAs work. This exercise has given me enough understanding to satisfy that curiosity.
I can imagine some interesting places to go next. First, it seems like a flaw that there’s a less-than-100% chance of my program generating a 100%-fit chromosome for this relatively simple case. Maxing out at 73 seems like a pretty weak result.
Second, it would be interesting to apply a GA to something more entertaining than finding a number close to 100. It would be cool to hook a GA up to some sort of physics engine and create organisms that walk or slither or swim.