brainhack April 11, 2016 at 21:31

What is grammatical evolution + easy implementation

Most recently, I wrote an article in which, without explanation, showed what the grammatical evolution method is capable of. I completely agree that this cannot be done, but I wanted to show the results of an interesting method. I thought "what will be better: translate the source or give your own explanation." Laziness prevailed.

If someone is interested in evolutionary methods and the task of symbolic regression (and not only), then please read.

Bakus-Naur Form

First, it must be said about what context-independent grammar is in the form of Bakus-Naur (abbreviated as BNF). About formal languages and grammars on Habré there is already quite an interesting article . Highly recommend reading. But our goal is to understand what BPF is and learn how to use this form.

Wikipedia gives a completely adequate definition:

BNF is a formal syntax description system in which some syntactic categories are sequentially defined through other categories. BNF is used to describe context-free formal grammars.

BNF has terminal and nonterminal symbols. Terminals are constants. We set them up and that's it, but with nonterminal symbols everything is much more interesting: they can be substituted into each other according to the rules.

Consider an example. We have the following formal system for describing the syntax of context-free grammar:

$<sntnce> :: = <sntnce> | <noun><verb> | <adverb><verb>$

$<noun> :: = Peter \, | \, ball$

$<verb> :: = ran \, | \, fell$

$<adverb> :: = quickly$

In our example, many nonterminal characters

$N = \ {<sntnce>, <noun>, <verb>, <adverb> \}.$

A lot of terminal symbols are represented as follows

$T = \ {Peter, \, ball, \, quickly, \, ran, \, fell \}$

The set S will contain one element.

$S = \ {<sntnce> \}$

This element will be the entry point for building and working with the rules.

Now, having one nonterminal element, we can make constructions that will correspond to our grammar.

We follow the chains

1) => => quickly => quickly ran

2) => => Peter => Peter fell

The question arises: on what basis do these substitutions occur? I will answer this question a little later.

Genetic algorithm

It seems to me that this algorithm is canonical in evolutionary circles. It is easy to implement and works well. Consideration of this algorithm is necessary in order to understand what kind of mechanism will be as an “engine” in the method of grammatical evolution. But (!!) in its place can be any other evolutionary algorithm convenient for you.

So GA uses the behavior of nature. In fact, nothing new was invented. This algorithm has been running for millions of years. Thanks to him. After all, if not for him, then we would not have been.

GA consists of several stages.

(1) Creation of an initial population (we create a chromosome for each individual)

(2) Calculation of fitness function for each individual (it shows exactly who is best suited in this population)

(3) Selection of the best representatives for the formation of further offspring

(4) Crossover

(5) Mutation

(6) After (4) we get children, some of which go through (5). At the output we get offspring

(7) Selection of fathers and children in a new generation

(8) return to step (2), if the values that children give us are not satisfied with the

Chromosome - an encoded representation of the information we need. The primary source uses a binary representation. Those. if we need to find 4 parameters (each of them is in the range from 0 to 15), then for each parameter we need 4 bits (zero or one). And the chromosome itself will be a length of 16. Everything is pretty primitive.

Important : you can use a decimal representation. I will do so for grammatical evolution.

Now a little about the operators in the GA and all sorts of fitness functions.

Fitness function - functionality that we must optimize. It varies from task to task. If there is a question of minimizing functionality, then parents who have as little fitness function as possible are needed for selection.

Crossover is a cool thing. In genetic programming, by the way, this operator is almost the only one for obtaining offspring with the best qualities. The bottom line is that we take two parents (or rather their genotype). Divide it in half. And swap. I will show you now.

There are two lists:

$first \, parent = [1,2,3,4, -5, -6, -7, -8,]$

$Second \, parent = [-1, -2, -3, -4,5,6,7,8]$

$Result:$

$Child \, 1 = [1,2,3,4,5,6,7,8]$

$Child \, 2 = [-1, -2, -3, -4, -5, -6, -7, -8]$

This was an example of a point crossover. There are other variations on this subject, but we will not talk about them.

Mutation is the process of accidentally replacing a particular gene.

$was = [1,2,3,4]$

$be = [1, -5,3,4]$

A frequently used method of selection for a new generation is elitist. We just take the n best individuals from the list of children + parents. And then we supplement the population at random to the right amount.

Important: the size of the chromosome can be either fixed or arbitrary. The same goes for population size.

Grammatical evolution

And now about the most important thing. What kind of method is this and what does it eat with.
The bottom line is that you have a problem to solve. You are building a grammar in the form of Bakus-Naur. Create an initial population in which each individual will have his own chromosome, which describes what rules when, where, substitute. The importance is that thanks to these rules you get a function that you can use for calculations. The value of this function with the parameters predefined (or not) substituted into it can act as a functional (fitness function). The better the functional, the better the function, and therefore the individual with his chromosome.

More about the chromosome.

Let us have the following grammar

:: = |

:: = x | 3 | 2

:: = + | - | * | /

Next we have such a chromosome

$chromo = [2,1,6,4,5,1]$

Initially, the symbolic representation of the function contains one nonterminal symbol: H = (usually).

We take the first element from chromo: 2. We consider how many rules are in the conversion to: 2. Divide 2% 2 (modulo !!) = 0. So instead of we substitute . Thus, H =. We throw out a deuce from chromo. She is no longer needed.

Next in line is one. and look again at. 1% 2 (number of substitution rules) = 1. So instead of we substitute . We get H =.

If you do these simple manipulations further, you get such a chain

$<val><op><e> (6 \% 3 = 0) -> x <op><e>$

$x <op><e> (4 \% 4 = 0) -> x + <e>$

$x + <e> (5 \% 2 = 1) -> x + <val>$

$x + <val> (1 \% 3 = 1) -> x + 3$

$H = x + 3.$

Then we use this function to calculate the functional and determine how much an individual suits us with a given phenotypic (as the function is called) and genotype (chromosome).

It's all. Yes, there are several options for substitution: in depth (considered), in width, the so-called $\ pi$ substitution. But this is material for a separate article.

And now an example.

There is a time interval $t \ in [-5,5]$ . There is a function $y (x) = 1 + x + x ^ 2 + x ^ 3$ . We need to find a function that would give the minimum quadratic error - the functional of this problem.

Let's look at the

main module code

# -*- coding: utf-8 -*-
import GE
def foo(x):
  return 1+x+x**2+x**3
interval = [-5, 5]
values =[foo(e) for e in range(interval[0], interval[1])]
if __name__ == "__main__":
  result = GE.GA(
    chromo_len=12,
    pop_len=100,
    count=20
  )[0]
  print(result)

The foo function generates values that we will compare with those that will be obtained from the functions created by each individual.

Chromosome Length = 15;

Population Length = 100;

The number of iterations (evolutions) = 20;

Parser module

# -*- coding: utf-8 -*-
import random
import re
def rand(num):
    return int(random.random()*num)
def parse(chromo):
  l = len(chromo)
  j = 0
  h = ""
  grammar = {
    "": ["", ""],
    "": ["+", "-", "*", "/"],
    "": ["x", "1"]
  }
  s = r"<+["+('|'.join(list(map(lambda x: x[1:-1], grammar.keys()))))+"]+>"
  pattern = re.compile(s)
  while j < l:
    elems = pattern.findall(h)
    if not elems:
      break
    el = elems[0]
    h = h.replace(el, grammar[el][(chromo[j] % int(len(grammar[el])))], 1)
    j += 1
  while True:
    elems = pattern.findall(h)
    if elems:
      for elem in elems:
        el = elem
        if elem == "":
          el = ""
        h = h.replace(elem, grammar[el][rand(len(grammar[el]))], 1)
    else:
      break
  return h

The grammar dictionary contains rules for grammar. Further, the substitution algorithm, which I described above. After the block While is needed to complete the function. There are times when the chromosome is finished, but nonterminal symbols remain. Here is the last cycle and is needed in order to replace all non-terminals with terminals.
Important : it is not a fact that the final function will be valid from the point of view of semantics (it can divide by zero and all that).

GE module

# -*- coding: utf-8 -*-
import random
import math
from parser import parse
def rand(num):
  return int(random.random() * num)
class Individ:
  def __init__(self, genotype):
    self.genotype = genotype
    self.phenotype = self.get_phenotype()
    self.fitness = self.get_mistake()
  def __eq__(self, other):
    return self.fitness == other.fitness
  def __ne__(self, other):
     return not self.__eq__(other)
  def __lt__(self, other):
    return self.fitness < other.fitness
  def __ge__(self, other):
    return not self.__lt__(other)
  def __str__(self):
    return "chromosome : {0}\nphenotype = {2}\nfitness = {1}\n".format(self.genotype, self.fitness, self.phenotype)
  def get_phenotype(self):
    return parse(self.genotype)
  def get_mistake(self):
    import main
    intr = main.interval
    vals = main.values
    f = eval("lambda x: {0}".format(self.phenotype))
    f_vals = []
    for i in range(intr[0], intr[1]):
      try:
        f_vals.append(f(i))
      except:
        return 10000
    return sum(list(map(lambda e: (e[0] - e[1]) ** 2, list(zip(vals, f_vals)))))
def GA(chromo_len, pop_len, count):
  population = get_population(pop_len, chromo_len)
  while count > 0:
    childrn_chromos = get_children_chromose(population)
    mutation(childrn_chromos, 0.8)
    children = get_children(childrn_chromos)
    population = get_new_population(population, children)
    count -= 1
  return population
def get_genotype(gen_length):
  return [rand(200) for _ in range(gen_length)]
def get_population(length, chromo_len):
  return [Individ(get_genotype(chromo_len)) for _ in range(length)]
def get_children_chromose(parents):
  def tournament():
    return [min([parents[rand(len(parents)-1)] for _1 in range(7)]) for _2 in range(2)]
  children_chromo = []
  for _3 in range(int(len(parents)/2)):
    children_chromo += crossover(tournament())
  return children_chromo
def get_children(childrn_chromos):
  return [Individ(childrn_chromos[i]) for i in range(len(childrn_chromos))]
def crossover(pair):
  p1 = pair[0]
  p2 = pair[1]
  d = random.randint(1, len(p1.genotype) - 2)
  return [p1.genotype[:d] + p2.genotype[d:], p2.genotype[:d] + p1.genotype[d:]]
def mutation(chldrn_chromo, p):
  for j in range(len(chldrn_chromo)):
    if p > random.random():
      i = random.randint(1, len(chldrn_chromo[j]) - 1)
      chldrn_chromo[j][i] = rand(200)
  return chldrn_chromo
def get_new_population(population, children):
  l = len(population)
  buf = population+children
  buf.sort()
  c = math.trunc(0.2*l)
  new_population = buf[:c]
  tmp = buf[c:]
  ll = len(tmp)
  new_population += [tmp[random.randint(1, ll - 1)] for _ in range(c, l)]
  return new_population

The usual implementation of a genetic algorithm.

The GA function is an entry point into the cycle of evolution.

In general, the functions speak for themselves, and the implementation of each function is not very long. I note that the selection for the crossover is not standard. I just bother my parents and choose the first half of the mixed pile. This is not very harmful for this task, but it is better not to do so. There are a dozen (or even more) methods that allow you to select the best candidates for the crossover.

Each individual has three fields: genotype, phenotype, fitness function value.

As a result,

I get the function

$x + x / 1 * x + x * x * 1 * x + 1/1$

which is identical

$1+ x + x ^ 2 + x ^ 3$

I note that creating an optimal grammar is a non-trivial and very important task.

I hope that now it has become clear what kind of “grammatical evolution” this method is and how to use it for solving problems. An interesting fact is that symbolic regression is applicable not only for functions. We can create instructions for robots, as well as build the structure of a neural network and configure parameters in it. We no longer need the method of back propagation of the error for training the network. Together with it, one can also abandon the activation functions with a continuous first derivative. This is the so-called "neuroevolutionary programming."

Once again, I give a link to the source . If anyone wants to understand the method more deeply.

Tags:

What is grammatical evolution + easy implementation

Bakus-Naur Form

Genetic algorithm

Grammatical evolution

Also popular now: