In order to evaluate the fitness function the candidate solution has to be simulated. This step involves the interpretation of a CGP genotype for each vector. Since the fitness evaluation time dominates the time of whole evolutionary process in vast majority of problems and moreover a large number of evaluations is usually needed to achieve satisfactory result, it is important to simulate candidate solutions effectively.
In order to maximize the overall performance, we have replaced the interpreter with corresponding optimized native machine code that directly calculates response for a single training vector. The machine code is in linear form, i.e., without conditional branches hurting performance. We have developed a method which allows the users to significantly accelerate the evaluation of candidate solutions without having hardly any knowledge of assembly language or target machine code. Moreover, the integration of the machine code compiler requires modifying only a few lines of code.
As it can be seen on a simple case study CGP can process several times higher number of evaluations on the same machine introducing native evaluation. Even if we used optimized C implementation of CGP, we were able to evaluate more than five times higher number of candidate solutions using the same machine. Even when we used a highly optimized C implementation of the interpreted CGP, we were still able to evaluate more than five times greater amount of candidate solutions by using the native code version.
Details can be found in our paper Efficient Phenotype Evaluation in Cartesian Genetic Programming presented at EuroGP 2012. This paper contains evaluation in fixed as well as floating-point domain where one can achieve even better results.
The process of fitness value calculation consists of two phases. In the first phase the CGP chromosome is compiled or more precisely translated to machine code that resides in the application's address space. The second phase involves the calculation of response for the training vectors and fitness value.
The whole process is designed to maximize the performance a minimize overhead caused by translation. Thus the evaluation procedure consists of a simple call only which executes the obtained machine code for each training vector. It means that the process of compilation is executed once per fitness evaluation.
To integrate the native evaluation one has to modify function that evaluates fitness value of the whole population. The modification consists of calling code compiler and replacing the common interpreter with simple call that invokes correct compiled machine code. Apart from that it is of course necessary to initialize the arrays utilised by the native evaluation.
The changes you need to make are highlighted in the following code snippet.
+ #include "native.h" ... + //code generated using this website + unsigned char* consts; + void cgp_init_consts() + { + ... + } + unsigned char* code[MAX_POPSIZE]; + #define MAXCODESIZE 20 + void cgp_compile(unsigned char* pcode, chromosome p_chrom, int* isused) + { + ... + } void calc_fitness(int parentidx) { //variable declarations ... //initialize for (int i=0; i < params.popsize; i++) { if (i == parentidx) continue; #ifdef DONOTEVALUATEUNUSEDNODES usednodes[i] = used_nodes((chromosome) population[i], isused[i]); #endif fitvalues[i] = 0; + cgp_compile(code[i], (chromosome) population[i], isused[i]); } for (int l=0; l < params.trainingvectors; l++) { //copy the first part of a training vector to the primary inputs memcpy(nodeoutput, ptraindata, params.inputs*sizeof(nodetype)); ptraindata += params.inputs; //determine and process the response of each candidate solution for (int i=0; i < params.popsize; i++) { if (i == parentidx) continue; - cgp_eval((chromosome) population[i], isused[i]); + ((evalfunc *)(code[i]))(); //compare the output values against training vector (calculate fit value) ... fitvalues[i] += fit; } //next training vector ptraindata += params.outputs; } } ... int main(int argc, char* argv[]) { ... //memory allocation for (int i=0; i < params.popsize; i++) { population[i] = new chromosome [params.geneoutidx + params.outputs]; isused[i] = new int [params.genes]; + code[i] = malloc_aligned(params.cols*params.rows*MAXCODESIZE + 64); } + cgp_init_consts(); ... }
If you don't believe that the proposed acceleration technique is really simple to integrate then just compare source codes which are available in the next section.
The impact of the proposed method on the performance is evaluated using one of common CGP applications - evolutionary design of combinational circuits. Even if software implementations of CGP (intended for evolution of logic circuits) can strongly benefit from the so-called parallel simulation we can easily demonstrate that the performance can be increased by introducing native evaluation.
We have implemented common CGP and accelerated version of CGP and evaluated the performance using common benchmark problem - evolutionary design of multipliers at gate-level. The tables below show the average number of evaluations evaluated per a second. The average has been calculated using twenty independent runs. Note that the flag DONOTEVALUATEUNUSEDNODES was not set to provide statistically significant results.
Hamming distance utilized in fitness function was calculated using either four lookups into 8-bit LUT table (LUT) or 32-bit popcnt instruction available in SSE4 (POPCNT)
benchmark | accelerated CGP implementation | common CGP implementation | ||
---|---|---|---|---|
LUT | POPCNT | LUT | POPCNT | |
multiplier 5x5 | 71 000(5.3x) | 78 568(5.7x) | 13 318 | 13 675 |
multiplier 6x6 | 22 432(6.8x) | 26 137(7.7x) | 3 290 | 3 391 |
multiplier 7x7 | 5 576(6.9x) | 6 657(7.9x) | 808 | 842 |
CGP parameters: C=80, R=1, L=40 Node functions: BUF, NOT, AND, OR, XOR, NAND, NOR, XNOR CPU: Intel Core 2 DUO P8400 @2.26 GHz OS: Win XP, 32-bit |
The speedup vary between 5x and 8x depending on the problem size. The larger number of training vectors, the higher speedup. This effect is primarily caused by the overhead introduced by the compilation of CGP phenotypes into machine code.
Hamming distance utilized in fitness function was calculated using either eight lookups into 8-bit LUT table (LUT) or 64-bit popcnt instruction available in SSE4 (POPCNTL)
benchmark | accelerated CGP implementation | common CGP implementation | ||
---|---|---|---|---|
LUT | POPCNT | LUT | POPCNT | |
multiplier 5x5 | 107 051(3.3x) | 136 449(3.8x) | 32 677 | 36 016 |
multiplier 6x6 | 40 318(5.0x) | 62 794(6.9x) | 8 126 | 9 123 |
multiplier 7x7 | 10 614(5.3x) | 19 137(8.5x) | 1 992 | 2 246 |
multiplier 8x8 | 2 442(5.0x) | 4 714(8.4x) | 492 | 560 |
CGP parameters: C=80, R=1, L=40 Node functions: BUF, NOT, AND, OR, XOR, NAND, NOR, XNOR CPU: Intel Xeon X5650 @2.66 GHz (12 cores, 15 MB cache) OS: CentOS Linux, 64-bit |
Similarily to the previous results the speedup vary between 3x and 8x depending on the problem size. If we compare the average number of evaluations evaluated per a second of the 32-bit and 64-bit common CGP implementation, the resulting ratio is approximately 2.4 which corresponds with our expectation as a) the 64-bit implementation evaluates 64 inputs vectors instead of 32 and b) the 64-bit CPU is running at approx. 1.17 times higher frequency.
Available code is licensed under BUT Open Source Licence
cgp.cpp | |
---|---|
Common CGP implementation (cgp.cpp)Cartesian genetic programming, introduced by Julian Miller and Peter Thomson in 2000, is a variant of genetic programming where the genotype is represented as a list of integers that are mapped to directed oriented graphs. In order to evaluate the fitness function the response for each training vector has to be calculated. This step involves the interpretation of a CGP genotype for each vector. In this implementation we are using linear interpreter that interprets CGP chromosome. | /*===============================================================================
cgp.cpp: Evolutionary design of combinational circuits using CGP
===============================================================================
Copyright (C) 2012 Brno University of Technology,
Faculty of Information Technology
Author(s): Zdenek Vasicek <vasicek AT fit.vutbr.cz>
=============================================================================*/
/// #define HAVE_POPCNT
/// #define DONOTEVALUATEUNUSEDNODES
#define MAX_POPSIZE 100
typedef int fitvaltype;
typedef int nodetype;
#include "cgp.h"
#include "circ.h"
pchromosome population[MAX_POPSIZE];
fitvaltype fitvalues[MAX_POPSIZE];
/// default parameters
tparams params = {5000000 /*generations*/, 5 /*pop.size*/, 1 /*mut.genes*/, 0, 0, 80 /*cols*/, 1 /*rows*/, 40 /*lback*/, 2, 1, 8 /*functions*/};
nodetype* nodeoutput; //array of node outputs used for CGP evaluation
int* isused[MAX_POPSIZE]; //array of marked used nodes for each individual
int usednodes[MAX_POPSIZE]; //number of used nodes for each individual
int* data; //training data
col_validvalues **col_values; //valid gene values for each column
|
CGP mutation operatorThis function modifies The implementation of the mutation operator has to ensure that the modifications are legal and lead to a valid phenotype.
This is done using |
inline void cgp_mutate(chromosome p_chrom)
{
int rnd = rand() % params.mutations;
int genes = rnd + 1;
for (int j = 0; j < genes; j++)
{
int i = rand() % (params.geneoutidx + params.outputs);
int col = (int) (i / params.genespercol);
if (i < params.geneoutidx)
{
if ((i % params.nodeios) < params.nodeinputs)
{ //mutate a gene that encodes connection
do { rnd = col_values[col]->values[(rand() % (col_values[col]->items))]; } while (rnd == p_chrom[i]);
p_chrom[i] = rnd;
}
else
{ //mutate a gene that encodes function
if (params.nodefuncs > 1) {
do { rnd = rand() % params.nodefuncs; } while (rnd == p_chrom[i]);
p_chrom[i] = rnd;
}
}
}
else
{ //mutate a primary output
do { rnd = rand() % params.lastnodeidx; } while (rnd == p_chrom[i]);
p_chrom[i] = rnd;
}
}
}
|
CGP evaluationThis function simulates the encoded candidate solution and calculates
response for single input vector. The response for each CGP node is
stored in array The interpreter consists of a loop that calculates the response for each CGP
node according to the genes of CGP chromosome. The execution of the encoded graph
starts from the first node and continues according to the increasing node index.
This scheme represents the most efficient
implementation as it does not introduce any overhead due to function calling that
have to manipulate with stack. In contrast with the recursive approach, all the
output values are calculated in one pass. In order to improve the performance,
a simple preprocessing step that marks the utilized nodes only can be introduced
(by enabling |
inline int cgp_eval(chromosome p_chrom, int* isused)
{
int in1,in2,fce,i,j;
int *pnodeout = nodeoutput + params.inputs;
#ifdef DONOTEVALUATEUNUSEDNODES
isused += params.inputs;
#endif
/// Evaluate the response of each node
for (i=0; i < params.cols; i++)
for (j=0; j < params.rows; j++)
{
#ifdef DONOTEVALUATEUNUSEDNODES
if (!*isused++)
{ //This node is not used, skip it
p_chrom += 3;
pnodeout++;
continue;
}
#endif
in1 = nodeoutput[*p_chrom++];
in2 = nodeoutput[*p_chrom++];
fce = *p_chrom++;
switch (fce)
{
case 0: *pnodeout++ = in1; break; //in1
case 1: *pnodeout++ = ~in1; break; //not in1
case 2: *pnodeout++ = in1 & in2; break; //and
case 3: *pnodeout++ = in1 | in2; break; //or
case 4: *pnodeout++ = in1 ^ in2; break; //xor
case 5: *pnodeout++ = ~(in1 & in2); break; //nand
case 6: *pnodeout++ = ~(in1 | in2); break; //nor
case 7: *pnodeout++ = ~(in1 ^ in2); break; //xnor
default: abort();
}
}
}
|
Calculate fitness valuesThis function determines the fitness value of each candidate solution except
parental solution In order to calculate the fitness value of each candidate solution the response for each training data has to be evaluated. In this implementation the fitness value correspons with the Hamming distance between the calculated and desired response. The fitness function calls |
inline void calc_fitness(int parentidx)
{
chromosome p_chrom;
register int fit, vysl;
int *ptraindata = data;
for (int i=0; i < params.popsize; i++)
{
if (i == parentidx) continue;
#ifdef DONOTEVALUATEUNUSEDNODES
usednodes[i] = used_nodes((chromosome) population[i], isused[i]);
#endif
fitvalues[i] = 0;
}
for (int l=0; l < params.trainingvectors; l++)
{
///copy the first part of a training vector to the primary inputs
memcpy(nodeoutput, ptraindata, params.inputs*sizeof(nodetype));
ptraindata += params.inputs;
///determine and check response of each candidate solution
for (int i=0; i < params.popsize; i++)
{
if (i == parentidx) continue;
cgp_eval((chromosome) population[i], isused[i]);
///compare the output values against training vector (specification)
fit = 0;
p_chrom = (chromosome) population[i] + params.geneoutidx;
for (int k=0; k < params.outputs; k++)
{
vysl = (nodeoutput[*p_chrom++] ^ *(ptraindata +k));
fit += zeroscount(vysl);
}
fitvalues[i] += fit;
}
///next training vector
ptraindata += params.outputs;
}
}
|
Number of utilized nodesThis function calculates the number of nodes that contribute to the resulting phenotype that is encoded by a given genotype. |
int used_nodes(chromosome p_chrom, int* isused)
{
int in, fce, idx, used = 0;
int *pchrom;
memset(isused, 0, params.lastnodeidx*sizeof(int));
|
mark nodes connected to the primary outputs |
pchrom = p_chrom + params.geneoutidx;
for (int i=0; i < params.outputs; i++)
isused[*pchrom++] = 1;
|
go throught the cgp array |
pchrom = p_chrom + params.geneoutidx - 1;
idx = params.lastnodeidx-1;
for (int i=params.cols; i > 0; i--)
{
for (int j=params.rows; j > 0; j--,idx--)
{
fce = *pchrom--; //fce
if (isused[idx] == 1)
{
|
the current node is marked, mark also the connected nodes |
in = *pchrom--; // in2
if (fce > 1) // 2-input functions
isused[in] = 1;
in = *pchrom--; // in1
isused[in] = 1;
used++;
} else {
|
the current node is not market, skip it |
pchrom -= params.nodeinputs;
}
}
}
return used;
}
|
Main applicationEvolutionary design of combinational circuits using Cartesian Genetic Programming. Usage: |
int main(int argc, char* argv[])
{
using namespace std;
int blk, bestblk, data_items, parentidx, fittest_idx;
unsigned long int generation;
fitvaltype bestfitval;
strcpy(params.datafname, "data.txt");
parse_options(argc, argv);
///load training data
if ((data_items = parsefile(params.datafname, NULL, NULL, NULL)) < 1)
{
printf("Invalid data\n"); return 0;
}
data = new int[data_items];
parsefile(params.datafname, data, ¶ms.inputs, ¶ms.outputs);
params.trainingvectors = data_items / (params.inputs+params.outputs); //Spocitani poctu pruchodu pro ohodnoceni
params.maxfitval = params.trainingvectors*params.outputs*32; //Vyitems max. fitness
printf("Training data:\n file:%s, items:%d, inputs:%d, outputs:%d\n", params.datafname, data_items, params.inputs, params.outputs);
init_paramsandluts();
///memory allocation
for (int i=0; i < params.popsize; i++)
{
population[i] = new chromosome [params.geneoutidx + params.outputs];
isused[i] = new int [params.genes];
}
nodeoutput = new nodetype [params.genes];
memset(nodeoutput, 0, sizeof(nodetype) * params.genes);
srand((unsigned) time(NULL));
|
Initial population and its evaluationThe initial population which consists of |
for (int i=0; i < params.popsize; i++)
{
chromosome p_chrom = (chromosome) population[i];
for (int j=0; j < params.cols*params.rows; j++)
{
int col = (int)(j / params.rows);
for (int k=0; k < params.nodeinputs; k++) // node inputs
*p_chrom++ = col_values[col]->values[(rand() % (col_values[col]->items))];
*p_chrom++ = rand() % params.nodefuncs; // node function
}
for (int j=0; j < params.outputs; j++) // primary outputs
*p_chrom++ = rand() % params.lastnodeidx;
}
|
Evaluate the initial population and find the fittest candidate solution that becomes parent |
calc_fitness(-1);
bestfitval = fitvalues[0]; fittest_idx = 0;
for (int i=1; i < params.popsize; i++)
if (fitvalues[i] > bestfitval)
{
bestfitval = fitvalues[i];
fittest_idx = i;
}
printf("CGP parameters:\n l-back=%d, rows=%d, cols=%d, functions=%d\n", params.lback, params.rows, params.cols, params.nodefuncs);
printf(" popsize=%d, mutgenes=%d, generations=%d\n", params.popsize, params.mutations, params.maxgenerations);
#ifdef DONOTEVALUATEUNUSEDNODES
printf(" evalunused=false\n");
#endif
printf("Evolutionary run:\n initial fitness=%d, best fitness=%d\n", bestfitval, params.maxfitval);
double time = cpuTime();
|
Evolutionary loop |
for (int generation = 0; generation < params.maxgenerations; generation++)
{
|
Step 1Generate offsprings of the fittest individual |
for (int i=0; i < params.popsize; i++)
{
if (fittest_idx == i) continue;
cgp_mutate(copy_chromozome(population[fittest_idx], population[i]));
}
|
Step 2Evaluate population and calculate fitness values |
calc_fitness(fittest_idx);
|
Step 3Check if there is an offspring that should replace parental solution |
int newparidx = fittest_idx;
for (int i=0; i < params.popsize; i++)
{
fitvaltype fit = fitvalues[i];
if ((i == fittest_idx) || (fit < bestfitval)) continue;
|
current individual is at least of the same quality as its parent |
if (fit < params.maxfitval)
{ //current individual does not
if (fit > bestfitval)
printf("Generation: %-8d\tFitness: %d/%d\n",generation, fit, params.maxfitval);
bestfitval = fit;
bestblk = params.nodes;
newparidx = i;
}
else
{ //second optimization criterion - the number of utilized nodes
#ifdef DONOTEVALUATEUNUSEDNODES
blk = usednodes[i];
#else
blk = used_nodes((chromosome) population[i], isused[i]);
#endif
if (blk <= bestblk)
{
if ((blk < bestblk) || (fit > bestfitval))
printf("Generation: %-8d\tFitness: %d/%d\tNodes: %d\n",generation, fit, params.maxfitval, blk);
bestfitval = fit;
bestblk = blk;
newparidx = i;
}
}
}
fittest_idx = newparidx;
}
|
End of evolution |
time = cpuTime() - time;
printf("Best fitness: %d/%d\n",bestfitval,params.maxfitval);
printf("Best individual: ");
print_chrom(stdout, (chromosome)population[fittest_idx]);
printf("Duration: %f Evaluations per sec: %f", time, params.maxgenerations*(params.popsize-1) / time);
if (bestfitval == params.maxfitval) {
|
save the solution |
FILE *chrfil = fopen("result.chr","wb");
print_chrom(chrfil, (chromosome)population[fittest_idx]);
fclose(chrfil);
}
return 0;
}
|
cgp_native.cpp | |
---|---|
Efficient CGP implementation (cgp_native.cpp)Cartesian genetic programming, introduced by Julian Miller and Peter Thomson in 2000, is a variant of genetic programming where the genotype is represented as a list of integers that are mapped to directed oriented graphs. In order to evaluate the fitness function, the response for each training vector has to be calculated. This step involves the interpretation of a CGP genotype for each vector. To maximize the overall performance, we introduced JIT compilation of genotype to binary machine code. | /*===============================================================================
cgp_native.cpp: Evolutionary design of combinational circuits using CGP
===============================================================================
Copyright (C) 2012 Brno University of Technology,
Faculty of Information Technology
Author(s): Zdenek Vasicek <vasicek AT fit.vutbr.cz>
=============================================================================*/
/// #define HAVE_POPCNT
/// #define DONOTEVALUATEUNUSEDNODES
#define MAX_POPSIZE 100
typedef int fitvaltype;
typedef int nodetype;
#include "cgp.h"
#include "circ.h"
#include "native.h"
pchromosome population[MAX_POPSIZE];
fitvaltype fitvalues[MAX_POPSIZE];
///default parameters
tparams params = {5000000 /*generations*/, 5 /*pop.size*/, 1 /*mut.genes*/, 0, 0, 80 /*cols*/, 1 /*rows*/, 40 /*lback*/, 2, 1, 8 /*functions*/};
nodetype* nodeoutput; //array of node outputs used for CGP evaluation
int* isused[MAX_POPSIZE]; //array of marked used nodes for each individual
int usednodes[MAX_POPSIZE]; //number of used nodes for each individual
int* data; //training data
col_validvalues **col_values; //valid gene values for each column
|
CGP mutation operatorThis function modifies The implementation of the mutation operator has to ensure that the modifications are legal and lead to a valid phenotype.
This is done using |
inline void cgp_mutate(chromosome p_chrom)
{
int rnd = rand() % params.mutations;
int genes = rnd + 1;
for (int j = 0; j < genes; j++)
{
int i = rand() % (params.geneoutidx + params.outputs);
int col = (int) (i / params.genespercol);
if (i < params.geneoutidx)
{
if ((i % params.nodeios) < params.nodeinputs)
{ //mutate a gene that encodes connection
do { rnd = col_values[col]->values[(rand() % (col_values[col]->items))]; } while (rnd == p_chrom[i]);
p_chrom[i] = rnd;
}
else
{ //mutate a gene that encodes function
if (params.nodefuncs > 1) {
do { rnd = rand() % params.nodefuncs; } while (rnd == p_chrom[i]);
p_chrom[i] = rnd;
}
}
}
else
{ //mutate a primary output
do { rnd = rand() % params.lastnodeidx; } while (rnd == p_chrom[i]);
p_chrom[i] = rnd;
}
}
}
unsigned char* consts;
void cgp_init_consts()
{
#define B(val) *(pconst++)=(val)
unsigned char* pconst;
pconst = consts = malloc_aligned(1024);
///Initialize constants
}
#define MAXCODESIZE 20
typedef void evalfunc(void);
unsigned char* code[MAX_POPSIZE];
|
Native code generationThe process of CGP genotype compilation contains three sections. |
int cgp_compile(unsigned char* pcode, chromosome p_chrom, int* isused)
{
#define C(val) *(pcode++)=(val)
#define CI(val) {*((uint32_t *)(pcode))= (uint32_t)(val); pcode += sizeof(uint32_t);}
int pnodeout = 0;
int in1,in2,fce;
int out = params.inputs - 1;
|
Initialize stackThe code that preserves the utilized registers and handles the stack is generated in the first section. |
C(0x53); //push ebx
/// Load chromosome ptr to ebx
C(0xbb);CI(nodeoutput);
|
Native code generationThen, the CGP chromosome is interpreted and corresponding machine code is generated. The procedure that translates CGP chromosome to the machine code looks nearly the same as the structure of linear CGP interpreter. Instead of evaluation of each node, corresponding machine code is generated. |
for (int i=0; i < params.cols; i++)
for (int j=0; j < params.rows; j++)
{
in1 = *p_chrom++; in2 = *p_chrom++; fce = *p_chrom++; out++;
#ifdef DONOTEVALUATEUNUSEDNODES
if (!isused[out]) continue;
#endif
|
The machine code for the given set of primitive functions was obtained using provided online generator.
Register |
switch (fce)
{
case 0: // BUFFER
/// data[out] = data[in1];
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 12, stack requirements: 0
///registers: %ebx, %eax
break;
case 1: // NOT
///data[out] = ~data[in1];
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0xf7);C(0xd0); //not %eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 14, stack requirements: 0
///registers: %ebx, %eax
break;
case 2: // AND
///data[out] = data[in1] & data[in2];
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0x23);C(0x83);CI(4*in2); //and in2(%ebx),%eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 18, stack requirements: 0
///registers: %ebx, %eax
break;
case 3: // OR
///data[out] = data[in1] | data[in2];
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0x0b);C(0x83);CI(4*in2); //or in2(%ebx),%eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 18, stack requirements: 0
///registers: %ebx, %eax
break;
case 4: // XOR
///data[out] = data[in1] ^ data[in2];
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0x33);C(0x83);CI(4*in2); //xor in2(%ebx),%eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 18, stack requirements: 0
///registers: %ebx, %eax
break;
case 5: // NAND
///data[out] = ~(data[in1] & data[in2]);
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0x23);C(0x83);CI(4*in2); //and in2(%ebx),%eax
C(0xf7);C(0xd0); //not %eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 20, stack requirements: 0
///registers: %ebx, %eax
break;
case 6: // NOR
///data[out] = ~(data[in1] | data[in2]);
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0x0b);C(0x83);CI(4*in2); //or in2(%ebx),%eax
C(0xf7);C(0xd0); //not %eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 20, stack requirements: 0
///registers: %ebx, %eax
break;
case 7: // XNOR
///data[out] = ~(data[in1] ^ data[in2]);
C(0x8b);C(0x83);CI(4*in1); //mov in1(%ebx),%eax
C(0x33);C(0x83);CI(4*in2); //xor in2(%ebx),%eax
C(0xf7);C(0xd0); //not %eax
C(0x89);C(0x83);CI(4*out); //mov %eax,out(%ebx)
///code size: 20, stack requirements: 0
///registers: %ebx, %eax
break;
default:
abort();
}
}
|
Finalize stack and returnFinally, the instructions restoring the values of previously stored registers are generated. |
C(0x5B); //pop ebx
C(0xc3); //ret
}
|
Calculate fitness valuesThis function determines the fitness value of each candidate solution except
parental solution The process of fitness function calculation consists of two phases.
In the first phase, the CGP chromosome is translated to machine code that resides
in the application's address space. The process of translation exhibits linear
time complexity with respect to the number of CGP nodes. The second phase involves
the calculation of response for the training vectors and fitness value.
This procedure executes the obtained machine code stored in array |
inline void calc_fitness(int parentidx)
{
chromosome p_chrom;
register int fit, vysl;
int *ptraindata = data;
///generate native code for each candidate solution
for (int i=0; i < params.popsize; i++)
{
if (i == parentidx) continue;
#ifdef DONOTEVALUATEUNUSEDNODES
usednodes[i] = used_nodes((chromosome) population[i], isused[i]);
#endif
cgp_compile(code[i], (chromosome) population[i], isused[i]);
fitvalues[i] = 0;
}
for (int l=0; l < params.trainingvectors; l++)
{
///copy the first part of a training vector to the primary inputs
memcpy(nodeoutput, ptraindata, params.inputs*sizeof(nodetype));
ptraindata += params.inputs;
///determine and check response of each candidate solution
for (int i=0; i < params.popsize; i++)
{
if (i == parentidx) continue;
/// cgp_eval((chromosome) population[i], isused[i]);
((evalfunc *)(code[i]))();
///compare the output values against training vector (specification)
fit = 0;
p_chrom = (chromosome) population[i] + params.geneoutidx;
for (int k=0; k < params.outputs; k++)
{
vysl = (nodeoutput[*p_chrom++] ^ *(ptraindata +k));
fit += zeroscount(vysl);
}
fitvalues[i] += fit;
}
///next training vector
ptraindata += params.outputs;
}
}
|
Number of utilized nodesThis function calculates the number of nodes that contribute to the resulting phenotype that is encoded by a given genotype. |
int used_nodes(chromosome p_chrom, int* isused)
{
int in, fce, idx, used = 0;
int *pchrom;
memset(isused, 0, params.lastnodeidx*sizeof(int));
|
mark nodes connected to the primary outputs |
pchrom = p_chrom + params.geneoutidx;
for (int i=0; i < params.outputs; i++)
isused[*pchrom++] = 1;
|
go throught the cgp array |
pchrom = p_chrom + params.geneoutidx - 1;
idx = params.lastnodeidx-1;
for (int i=params.cols; i > 0; i--)
{
for (int j=params.rows; j > 0; j--,idx--)
{
fce = *pchrom--; //fce
if (isused[idx] == 1)
{
|
the current node is marked, mark also the connected nodes |
in = *pchrom--; // in2
if (fce > 1) // 2-input functions
isused[in] = 1;
in = *pchrom--; // in1
isused[in] = 1;
used++;
} else {
|
the current node is not market, skip it |
pchrom -= params.nodeinputs;
}
}
}
return used;
}
|
Main applicationEvolutionary design of combinational circuits using Cartesian Genetic Programming (32-bit i686 implementation). Usage: |
int main(int argc, char* argv[])
{
using namespace std;
int blk, bestblk, data_items, parentidx, fittest_idx;
unsigned long int generation;
fitvaltype bestfitval;
strcpy(params.datafname, "data.txt");
parse_options(argc, argv);
///load training data
if ((data_items = parsefile(params.datafname, NULL, NULL, NULL)) < 1)
{
printf("Invalid data\n"); return 0;
}
data = new int[data_items];
parsefile(params.datafname, data, ¶ms.inputs, ¶ms.outputs);
params.trainingvectors = data_items / (params.inputs+params.outputs); //Spocitani poctu pruchodu pro ohodnoceni
params.maxfitval = params.trainingvectors*params.outputs*32; //Vyitems max. fitness
printf("Training data:\n file:%s, items:%d, inputs:%d, outputs:%d\n", params.datafname, data_items, params.inputs, params.outputs);
init_paramsandluts();
///memory allocation
for (int i=0; i < params.popsize; i++)
{
population[i] = new chromosome [params.geneoutidx + params.outputs];
isused[i] = new int [params.genes];
code[i] = malloc_aligned(params.cols*params.rows*MAXCODESIZE + 64);
}
nodeoutput = new nodetype [params.genes];
memset(nodeoutput, 0, sizeof(nodetype) * params.genes);
cgp_init_consts();
srand((unsigned) time(NULL));
|
Initial population and its evaluationThe initial population which consists of |
for (int i=0; i < params.popsize; i++)
{
chromosome p_chrom = (chromosome) population[i];
for (int j=0; j < params.cols*params.rows; j++)
{
int col = (int)(j / params.rows);
for (int k=0; k < params.nodeinputs; k++) // node inputs
*p_chrom++ = col_values[col]->values[(rand() % (col_values[col]->items))];
*p_chrom++ = rand() % params.nodefuncs; // node function
}
for (int j=0; j < params.outputs; j++) // primary outputs
*p_chrom++ = rand() % params.lastnodeidx;
}
|
Evaluate the initial population and find the fittest candidate solution that becomes parent |
calc_fitness(-1);
bestfitval = fitvalues[0]; fittest_idx = 0;
for (int i=1; i < params.popsize; i++)
if (fitvalues[i] > bestfitval)
{
bestfitval = fitvalues[i];
fittest_idx = i;
}
printf("CGP parameters:\n l-back=%d, rows=%d, cols=%d, functions=%d\n", params.lback, params.rows, params.cols, params.nodefuncs);
printf(" popsize=%d, mutgenes=%d, generations=%d\n", params.popsize, params.mutations, params.maxgenerations);
#ifdef DONOTEVALUATEUNUSEDNODES
printf(" evalunused=false\n");
#endif
printf("Evolutionary run:\n initial fitness=%d, best fitness=%d\n", bestfitval, params.maxfitval);
double time = cpuTime();
|
Evolutionary loop |
for (int generation = 0; generation < params.maxgenerations; generation++)
{
|
Step 1Generate offsprings of the fittest individual |
for (int i=0; i < params.popsize; i++)
{
if (fittest_idx == i) continue;
cgp_mutate(copy_chromozome(population[fittest_idx], population[i]));
}
|
Step 2Evaluate population and calculate fitness values |
calc_fitness(fittest_idx);
|
Step 3Check if there is an offspring that should replace parental solution |
int newparidx = fittest_idx;
for (int i=0; i < params.popsize; i++)
{
fitvaltype fit = fitvalues[i];
if ((i == fittest_idx) || (fit < bestfitval)) continue;
|
current individual is at least of the same quality as its parent |
if (fit < params.maxfitval)
{ //current individual does not
if (fit > bestfitval)
printf("Generation: %-8d\tFitness: %d/%d\n",generation, fit, params.maxfitval);
bestfitval = fit;
bestblk = params.nodes;
newparidx = i;
}
else
{ //second optimization criterion - the number of utilized nodes
#ifdef DONOTEVALUATEUNUSEDNODES
blk = usednodes[i];
#else
blk = used_nodes((chromosome) population[i], isused[i]);
#endif
if (blk <= bestblk)
{
if ((blk < bestblk) || (fit > bestfitval))
printf("Generation: %-8d\tFitness: %d/%d\tNodes: %d\n",generation, fit, params.maxfitval, blk);
bestfitval = fit;
bestblk = blk;
newparidx = i;
}
}
}
fittest_idx = newparidx;
}
|
End of evolution |
time = cpuTime() - time;
printf("Best fitness: %d/%d\n",bestfitval,params.maxfitval);
printf("Best individual: ");
print_chrom(stdout, (chromosome)population[fittest_idx]);
printf("Duration: %f Evaluations per sec: %f", time, params.maxgenerations*(params.popsize-1) / time);
if (bestfitval == params.maxfitval) {
|
save the solution |
FILE *chrfil = fopen("result.chr","wb");
print_chrom(chrfil, (chromosome)population[fittest_idx]);
fclose(chrfil);
}
return 0;
}
|