About the book

Robot with transparent background A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions.

The book is freely downloadable under a Creative Commons license as a PDF and low cost printed copies can be purchased from lulu.com. This web site will be used for information relating to the book, and other pertinent announcements. We also have a discussion group for questions and conversation related to the book.

Saturday 31 May 2008

10,000 downloads

Yesterday, the field guide reached 10,000 downloads, just over two months from going live, averaging 150 downloads a day. Of course, as the graph at the bottom of the blog shows, many downloads happened in the first few weeks. So, in the last few weeks we seem to be averaging something like 50 downloads a day. Pretty amazing nonetheless! Many thanks!

Thursday 8 May 2008

The Field Guide goes HTML

We have produced an HTML version of the book. This is available here. We used htlatex to do the main conversion from Latex to HTML and then htmldoc to split the resulting huge file into chuncks of more reasonable size. Many manual changes have then been necessary. The result is not perfect, but we think it may be good enough to be useful.

Happy browsing!

Tuesday 6 May 2008

The Field Guide is available on Amazon!

Over the weekend the Field Guide apparently completed its complex journey into the land of major on-line retailers and is now available via Amazon and several other on-line vendors.

Bug in TinyGP Java Code

Sander Land found a bug in the constructor "tiny_gp" of the TinyGP system described in appendix B: the "x" array is initialized (lines 345-346) after the population is (line 344), causing invalid fitness evaluations during the initialization of the population, thereby wasting the first generation. The mistakes has been fixed in the code available here. Many thanks Sander (and apologies to all users).

Friday 2 May 2008

Table of Contents

1 Introduction 1
1.1 Genetic Programming in a Nutshell . . . . . . . . . . . . . . . 2
1.2 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Overview of this Field Guide . . . . . . . . . . . . . . . . . . 4

Part I Basics 7

2 Representation, Initialisation and Operators in Tree-based GP 9
2.1 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Initialising the Population . . . . . . . . . . . . . . . . . . . . 11
2.3 Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Recombination and Mutation . . . . . . . . . . . . . . . . . . 15

3 Getting Ready to Run Genetic Programming 19
3.1 Step 1: Terminal Set . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Step 2: Function Set . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.1 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2.2 Sufficiency . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.3 Evolving Structures other than Programs . . . . . . . 23
3.3 Step 3: Fitness Function . . . . . . . . . . . . . . . . . . . . . 24
3.4 Step 4: GP Parameters . . . . . . . . . . . . . . . . . . . . . 26
3.5 Step 5: Termination and solution designation . . . . . . . . . 27

4 Example Genetic Programming Run 29
4.1 Preparatory Steps . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Step-by-Step Sample Run . . . . . . . . . . . . . . . . . . . . 31
4.2.1 Initialisation . . . . . . . . . . . . . . . . . . . . . . . 31
4.2.2 Fitness Evaluation . . . . . . . . . . . . . . . . . . . . 32
4.2.3 Selection, Crossover and Mutation . . . . . . . . . . . 32
4.2.4 Termination and Solution Designation . . . . . . . . . 35

Part II Advanced Genetic Programming 37

5 Alternative Initialisations and Operators in Tree-based GP 39
5.1 Constructing the Initial Population . . . . . . . . . . . . . . . 39
5.1.1 Uniform Initialisation . . . . . . . . . . . . . . . . . . 40
5.1.2 Initialisation may Affect Bloat . . . . . . . . . . . . . 40
5.1.3 Seeding . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.2 GP Mutation . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.2.1 Is Mutation Necessary? . . . . . . . . . . . . . . . . . 42
5.2.2 Mutation Cookbook . . . . . . . . . . . . . . . . . . . 42
5.3 GP Crossover . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.4 Other Techniques . . . . . . . . . . . . . . . . . . . . . . . . . 46

6 Modular, Grammatical and Developmental Tree-based GP 47
6.1 Evolving Modular and Hierarchical Structures . . . . . . . . . 47
6.1.1 Automatically Defined Functions . . . . . . . . . . . . 48
6.1.2 Program Architecture and Architecture-Altering . . . 50
6.2 Constraining Structures . . . . . . . . . . . . . . . . . . . . . 51
6.2.1 Enforcing Particular Structures . . . . . . . . . . . . . 52
6.2.2 Strongly Typed GP . . . . . . . . . . . . . . . . . . . 52
6.2.3 Grammar-based Constraints . . . . . . . . . . . . . . . 53
6.2.4 Constraints and Bias . . . . . . . . . . . . . . . . . . . 55
6.3 Developmental Genetic Programming . . . . . . . . . . . . . 57
6.4 Strongly Typed Autoconstructive GP with PushGP . . . . . 59

7 Linear and Graph Genetic Programming 61
7.1 Linear Genetic Programming . . . . . . . . . . . . . . . . . . 61
7.1.1 Motivations . . . . . . . . . . . . . . . . . . . . . . . . 61
7.1.2 Linear GP Representations . . . . . . . . . . . . . . . 62
7.1.3 Linear GP Operators . . . . . . . . . . . . . . . . . . . 64
7.2 Graph-Based Genetic Programming . . . . . . . . . . . . . . 65
7.2.1 Parallel Distributed GP (PDGP) . . . . . . . . . . . . 65
7.2.2 PADO . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
7.2.3 Cartesian GP . . . . . . . . . . . . . . . . . . . . . . . 67
7.2.4 Evolving Parallel Programs using Indirect Encodings . 68

8 Probabilistic Genetic Programming 69
8.1 Estimation of Distribution Algorithms . . . . . . . . . . . . . 69
8.2 Pure EDA GP . . . . . . . . . . . . . . . . . . . . . . . . . . 71
8.3 Mixing Grammars and Probabilities . . . . . . . . . . . . . . 74

9 Multi-objective Genetic Programming 75
9.1 Combining Multiple Objectives into a Scalar Fitness Function 75
9.2 Keeping the Objectives Separate . . . . . . . . . . . . . . . . 76
9.2.1 Multi-objective Bloat and Complexity Control . . . . 77
9.2.2 Other Objectives . . . . . . . . . . . . . . . . . . . . . 78
9.2.3 Non-Pareto Criteria . . . . . . . . . . . . . . . . . . . 80
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions 80
9.4 Multi-objective Optimisation via Operator Bias . . . . . . . . 81

10 Fast and Distributed Genetic Programming 83
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness . 83
10.2 Reducing Cost of Fitness with Caches . . . . . . . . . . . . . 86
10.3 Parallel and Distributed GP are Not Equivalent . . . . . . . . 88
10.4 Running GP on Parallel Hardware . . . . . . . . . . . . . . . 89
10.4.1 Masterslave GP . . . . . . . . . . . . . . . . . . . . . 89
10.4.2 GP Running on GPUs . . . . . . . . . . . . . . . . . . 90
10.4.3 GP on FPGAs . . . . . . . . . . . . . . . . . . . . . . 92
10.4.4 Sub-machine-code GP . . . . . . . . . . . . . . . . . . 93
10.5 Geographically Distributed GP . . . . . . . . . . . . . . . . . 93

11 GP Theory and its Applications 97
11.1 Mathematical Models . . . . . . . . . . . . . . . . . . . . . . 98
11.2 Search Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
11.3 Bloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
11.3.1 Bloat in Theory . . . . . . . . . . . . . . . . . . . . . 101
11.3.2 Bloat Control in Practice . . . . . . . . . . . . . . . . 104

Part III Practical Genetic Programming 109

12 Applications 111
12.1 Where GP has Done Well . . . . . . . . . . . . . . . . . . . . 111
12.2 Curve Fitting, Data Modelling and Symbolic Regression . . . 113
12.3 Human Competitive Results the Humies . . . . . . . . . . . 117
12.4 Image and Signal Processing . . . . . . . . . . . . . . . . . . . 121
12.5 Financial Trading, Time Series, and Economic Modelling . . 123
12.6 Industrial Process Control . . . . . . . . . . . . . . . . . . . . 124
12.7 Medicine, Biology and Bioinformatics . . . . . . . . . . . . . 125
12.8 GP to Create Searchers and Solvers Hyper-heuristics . . . . 126
12.9 Entertainment and Computer Games . . . . . . . . . . . . . . 127
12.10The Arts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
12.11Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

13 Troubleshooting GP 131
13.1 Is there a Bug in the Code? . . . . . . . . . . . . . . . . . . . 131
13.2 Can you Trust your Results? . . . . . . . . . . . . . . . . . . 132
13.3 There are No Silver Bullets . . . . . . . . . . . . . . . . . . . 132
13.4 Small Changes can have Big Effects . . . . . . . . . . . . . . 133
13.5 Big Changes can have No Effect . . . . . . . . . . . . . . . . 133
13.6 Study your Populations . . . . . . . . . . . . . . . . . . . . . 134
13.7 Encourage Diversity . . . . . . . . . . . . . . . . . . . . . . . 136
13.8 Embrace Approximation . . . . . . . . . . . . . . . . . . . . . 137
13.9 Control Bloat . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
13.10Checkpoint Results . . . . . . . . . . . . . . . . . . . . . . . . 139
13.11Report Well . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
13.12Convince your Customers . . . . . . . . . . . . . . . . . . . . 140

14 Conclusions 141

Part IV Tricks of the Trade 143

A Resources 145
A.1 Key Books . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
A.2 Key Journals . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
A.3 Key International Meetings . . . . . . . . . . . . . . . . . . . 147
A.4 GP Implementations . . . . . . . . . . . . . . . . . . . . . . . 147
A.5 On-Line Resources . . . . . . . . . . . . . . . . . . . . . . . . 148

B TinyGP 151
B.1 Overview of TinyGP . . . . . . . . . . . . . . . . . . . . . . . 151
B.2 Input Data Files for TinyGP . . . . . . . . . . . . . . . . . . 153
B.3 Source Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
B.4 Compiling and Running TinyGP . . . . . . . . . . . . . . . . 162

Bibliography 167

Index 225