Genetic representation

From HandWiki
Short description: Data structure and types for evolutionary computation

In computer programming, genetic representation is a way of presenting solutions/individuals in evolutionary computation methods. The term encompasses both the concrete data structures and data types used to realize the genetic material of the candidate solutions in the form of a genome, and the relationships between search space and problem space. In the simplest case, the search space corresponds to the problem space (direct representation).[1] The choice of problem representation is tied to the choice of genetic operators, both of which have a decisive effect on the efficiency of the optimization.[2][3] Genetic representation can encode appearance, behavior, physical qualities of individuals. Difference in genetic representations is one of the major criteria drawing a line between known classes of evolutionary computation.[4][5]

Terminology is often analogous with natural genetics. The block of computer memory that represents one candidate solution is called an individual. The data in that block is called a chromosome. Each chromosome consists of genes. The possible values of a particular gene are called alleles. A programmer may represent all the individuals of a population using binary encoding, permutational encoding, encoding by tree, or any one of several other representations.[6][7]

Representations in some popular evolutionary algorithms

Genetic algorithms (GAs) are typically linear representations;[8] these are often, but not always,[9][10][11] binary.[10] Holland's original description of GA used arrays of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size. This facilitates simple crossover operation. Depending on the application, variable-length representations have also been successfully used and tested in evolutionary algorithms (EA)[12][13] in general and genetic algorithms[14][15] in particular, although the implementation of crossover is more complex in this case.

Evolution strategy uses linear real-valued representations, e.g., an array of real values. It uses mostly gaussian mutation and blending/averaging crossover.[16]

Genetic programming (GP) pioneered tree-like representations and developed genetic operators suitable for such representations. Tree-like representations are used in GP to represent and evolve functional programs with desired properties.[17]

Human-based genetic algorithm (HBGA) offers a way to avoid solving hard representation problems by outsourcing all genetic operators to outside agents, in this case, humans. The algorithm has no need for knowledge of a particular fixed genetic representation as long as there are enough external agents capable of handling those representations, allowing for free-form and evolving genetic representations.

Common genetic representations

Distinction between search space and problem space

Analogous to biology, EAs distinguish between problem space (corresponds to phenotype) and search space (corresponds to genotype). The problem space contains concrete solutions to the problem being addressed, while the search space contains the encoded solutions. The mapping from search space to problem space is called genotype-phenotype mapping. The genetic operators are applied to elements of the search space, and for evaluation, elements of the search space are mapped to elements of the problem space via genotype-phenotype mapping.[18][19]

Relationships between search space and problem space

The importance of an appropriate choice of search space for the success of an EA application was recognized early on.[20][21][22] The following requirements can be placed on a suitable search space and thus on a suitable genotype-phenotype mapping:[23][24]

Completeness

All possible admissible solutions must be contained in the search space.

Redundancy

When more possible genotypes exist than phenotypes, the genetic representation of the EA is called redundant. In nature, this is termed a degenerate genetic code. In the case of a redundant representation, neutral mutations are possible. These are mutations that change the genotype but do not affect the phenotype. Thus, depending on the use of the genetic operators, there may be phenotypically unchanged offspring, which can lead to unnecessary fitness determinations, among other things. Since the evaluation in real-world applications usually accounts for the lion's share of the computation time, it can slow down the optimization process. In addition, this can cause the population to have higher genotypic diversity than phenotypic diversity, which can also hinder evolutionary progress.

In biology, the Neutral Theory of Molecular Evolution states that this effect plays a dominant role in natural evolution. This has motivated researchers in the EA community to examine whether neutral mutations can improve EA functioning[25] by giving populations that have converged to a local optimum a way to escape that local optimum through genetic drift. This is discussed controversially and there are no conclusive results on neutrality in EAs.[26][27] On the other hand, there are other proven measures to handle premature convergence.

Locality

The locality of a genetic representation corresponds to the degree to which distances in the search space are preserved in the problem space after genotype-phenotype mapping. That is, a representation has a high locality exactly when neighbors in the search space are also neighbors in the problem space. In order for successful schemata not to be destroyed by genotype-phenotype mapping after a minor mutation, the locality of a representation must be high.

Scaling

In genotype-phenotype mapping, the elements of the genotype can be scaled (weighted) differently. The simplest case is uniform scaling: all elements of the genotype are equally weighted in the phenotype. A common scaling is exponential. If integers are binary coded, the individual digits of the resulting binary number have exponentially different weights in representing the phenotype.

Example: The number 90 is written in binary (i.e., in base two) as 1011010. If now one of the front digits is changed in the binary notation, this has a significantly greater effect on the coded number than any changes at the rear digits (the selection pressure has an exponentially greater effect on the front digits).

For this reason, exponential scaling has the effect of randomly fixing the "posterior" locations in the genotype before the population gets close enough to the optimum to adjust for these subtleties.

Hybridization and repair in genotype-phenotype mapping

When mapping the genotype to the phenotype being evaluated, domain-specific knowledge can be used to improve the phenotype and/or ensure that constraints are met.[28][29] This is a commonly used method to improve EA performance in terms of runtime and solution quality. It is illustrated below by two of the three examples.

Examples

Example of a direct representation

An obvious and commonly used encoding for the traveling salesman problem and related tasks is to number the cities to be visited consecutively and store them as integers in the chromosome. The genetic operators must be suitably adapted so that they only change the order of the cities (genes) and do not cause deletions or duplications.[30][31] Thus, the gene order corresponds to the city order and there is a simple one-to-one mapping.

Example of a complex genotype-phenotype mapping.

In a scheduling task with heterogeneous and partially alternative resources to be assigned to a set of subtasks, the genome must contain all necessary information for the individual scheduling operations or it must be possible to derive them from it. In addition to the order of the subtasks to be executed, this includes information about the resource selection.[32] A phenotype then consists of a list of subtasks with their start times and assigned resources. In order to be able to create this, as many allocation matrices must be created as resources can be allocated to one subtask at most. In the simplest case this is one resource, e.g., one machine, which can perform the subtask. An allocation matrix is a two-dimensional matrix, with one dimension being the available time units and the other being the resources to be allocated. Empty matrix cells indicate availability, while an entry indicates the number of the assigned subtask. The creation of allocation matrices ensures firstly that there are no inadmissible multiple allocations. Secondly, the start times of the subtasks can be read from it as well as the assigned resources.[33]

A common constraint when scheduling resources to subtasks is that a resource can only be allocated once per time unit and that the reservation must be for a contiguous period of time.[34] To achieve this in a timely manner, which is a common optimization goal and not a constraint, a simple heuristic can be used: Allocate the required resource for the desired time period as early as possible, avoiding duplicate reservations. The advantage of this simple procedure is twofold: it avoids the constraint and helps the optimization.

If the scheduling problem is modified to the scheduling of workflows instead of independent subtasks, at least some of the work steps of a workflow have to be executed in a given order.[35] If the previously described scheduling heuristic now determines that the predecessor of a work step is not completed when it should be started itself, the following repair mechanism can help: Postpone the scheduling of this work step until all its predecessors are finished.[33] Since the genotype remains unchanged and repair is performed only at the phenotype level, it is also called phenotypic repair.

Example of a heuristic-based genotype-phenotype mapping

The following layout planning task[36] is intended to illustrate a different use of a heuristic in genotype-phenotype mapping: On a rectangular surface different geometric types of objects are to be arranged in such a way that as little area as possible remains unused. The objects can be rotated, must not overlap after placement, and must be positioned completely on the surface. A related application would be scrap minimization when cutting parts from a steel plate or fabric sheet.

The coordinates of the centers of the objects and a rotation angle reduced to possible isomorphisms of the geometry of the objects can be considered as variables to be determined. If this is done directly by an EA, there will probably be a lot of overlaps. To avoid this, only the angle and the coordinate of one side of the rectangle are determined by the EA. Each object is now rotated and positioned on the edge of that side, shifting it if necessary so that it is inside the rectangle when it is subsequently moved. Then it is moved parallel to the other side until it touches another object or reaches the opposite end of the rectangle. In this way, overlaps are avoided and the unused area is reduced per placement, but not in general, which is left to optimization.[37]

References

  1. Eiben, A.E.; Smith, J.E. (2015) (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 40. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. http://link.springer.com/10.1007/978-3-662-44874-8. 
  2. Rothlauf, Franz (2002) (in en). Representations for Genetic and Evolutionary Algorithms. Studies in Fuzziness and Soft Computing. 104. Heidelberg: Physica-Verlag HD. pp. 31. doi:10.1007/978-3-642-88094-0. ISBN 978-3-642-88096-4. http://link.springer.com/10.1007/978-3-642-88094-0. 
  3. Eiben, A.E.; Smith, J.E. (2015). "Representation and the Roles of Variation Operators" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 49–51. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. http://link.springer.com/10.1007/978-3-662-44874-8. 
  4. Eiben, A.E.; Smith, J.E. (2015). "Popular Evolutionary Algorithm Variants" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 99–118. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. http://link.springer.com/10.1007/978-3-662-44874-8. 
  5. Fogel, D.B. (1995). "Phenotypes, genotypes, and operators in evolutionary computation" (in en). Proceedings of 1995 IEEE International Conference on Evolutionary Computation. 1. Perth, WA, Australia: IEEE. pp. 193–198. doi:10.1109/ICEC.1995.489143. ISBN 978-0-7803-2759-7. https://ieeexplore.ieee.org/document/489143. 
  6. Tomáš Kuthan and Jan Lánský. "Genetic Algorithms in Syllable-Based Text Compression". 2007. p. 26.
  7. Eiben, A.E.; Smith, J.E. (2015). "Representation, Mutation, and Recombination" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 49–78. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. http://link.springer.com/10.1007/978-3-662-44874-8. 
  8. Goldberg, David E. (1989) (in en). Genetic algorithms in search, optimization, and machine learning. Reading, Mass.: Addison-Wesley. ISBN 0-201-15767-5. OCLC 17674450. 
  9. Michalewicz, Zbigniew (1996) (in en). Genetic Algorithms + Data Structures = Evolution Programs. 3rd, revised and extended edition. Berlin, Heidelberg: Springer. ISBN 978-3-662-03315-9. OCLC 851375253. 
  10. 10.0 10.1 Whitley, Darrell (1994). "A genetic algorithm tutorial" (in en). Statistics and Computing 4 (2). doi:10.1007/BF00175354. ISSN 0960-3174. http://link.springer.com/10.1007/BF00175354. 
  11. Herrera, F.; Lozano, M.; Verdegay, J.L. (1998). "Tackling Real-Coded Genetic Algorithms: Operators and Tools for Behavioural Analysis.". Artificial Intelligence Review 12 (4): 265–319. doi:10.1023/A:1006504901164. http://link.springer.com/10.1023/A:1006504901164. 
  12. Blume, Christian; Jakob, Wilfried (2002), "GLEAM - An Evolutionary Algorithm for Planning and Control Based on Evolution Strategy", Conf. Proc. of Genetic and Evolutionary Computation Conference (GECCO 2002) Late Breaking Papers: pp. 31–38, https://publikationen.bibliothek.kit.edu/170053025/3814288, retrieved 2023-01-01 
  13. Hitomi, Nozomi; Selva, Daniel (2018), "Constellation optimization using an evolutionary algorithm with a variable-length chromosome", 2018 IEEE Aerospace Conference (IEEE): pp. 1–12, doi:10.1109/AERO.2018.8396743, ISBN 978-1-5386-2014-4, https://ieeexplore.ieee.org/document/8396743 
  14. De Jong, Kenneth A. (2006). "Representation" (in en). Evolutionary computation : a unified approach. New Delhi: Prentice-Hall of India. pp. 72–75. ISBN 978-81-203-3002-3. OCLC 276452339. https://www.worldcat.org/oclc/276452339. 
  15. Pawar, Sunil Nilkanth; Bichkar, Rajankumar Sadashivrao (2015). "Genetic algorithm with variable length chromosomes for network intrusion detection" (in en). International Journal of Automation and Computing 12 (3): 337–342. doi:10.1007/s11633-014-0870-x. ISSN 1476-8186. 
  16. Schwefel, Hans-Paul (1995) (in en). Evolution and Optimum Seeking. New York: Wiley & Sons. ISBN 0-471-57148-2. OCLC 30701094. https://www.researchgate.net/publication/220690578_Evolution_and_Optimum_Seeking. 
  17. Koza, John R. (1989), Sridharan, N.S., ed., "Hierarchical genetic algorithms operating on populations of computer programs", Proceedings of the Eleventh International Joint Conference on Artificial Intelligence IJCAI-89 (San Mateo, CA, USA: Morgan Kaufmann) 1: pp. 768–774 
  18. Rothlauf, Franz (2002) (in en). Representations for Genetic and Evolutionary Algorithms. Studies in Fuzziness and Soft Computing. 104. Heidelberg: Physica-Verlag HD. doi:10.1007/978-3-642-88094-0. ISBN 978-3-642-88096-4. http://link.springer.com/10.1007/978-3-642-88094-0. 
  19. Whigham, Peter A.; Dick, Grant; Maclaurin, James (2017). "On the mapping of genotype to phenotype in evolutionary algorithms" (in en). Genetic Programming and Evolvable Machines 18 (3): 353–361. doi:10.1007/s10710-017-9288-x. ISSN 1389-2576. http://link.springer.com/10.1007/s10710-017-9288-x. 
  20. Caruana, Richard A.; Schaffer, J. David (1988), "Representation and Hidden Bias: Gray vs. Binary Coding for Genetic Algorithms" (in en), Machine Learning Proceedings 1988 (Elsevier): pp. 153–161, doi:10.1016/b978-0-934613-64-4.50021-9, ISBN 978-0-934613-64-4, https://linkinghub.elsevier.com/retrieve/pii/B9780934613644500219, retrieved 2023-01-19 
  21. Liepins, Gunar E.; Vose, Michael D. (1990). "Representational issues in genetic optimization" (in en). Journal of Experimental & Theoretical Artificial Intelligence 2 (2): 101–115. doi:10.1080/09528139008953717. ISSN 0952-813X. http://www.tandfonline.com/doi/abs/10.1080/09528139008953717. 
  22. Coli, M.; Palazzari, P. (1995), "Searching for the optimal coding in genetic algorithms", Proceedings of 1995 IEEE International Conference on Evolutionary Computation (IEEE), doi:10.1109/ICEC.1995, ISBN 978-0-7803-2759-7 
  23. Eiben, Agoston E. (2015). "Representation (Definition of Individuals)" (in en). Introduction to evolutionary computing. J. E. Smith (2nd ed.). Berlin, Heidelberg: Springer. pp. 28–30. ISBN 978-3-662-44874-8. OCLC 913232837. 
  24. Rothlauf, Franz (2006). "Three Elements of a Theory of Representations" (in en). Representations for genetic and evolutionary algorithms (2nd ed.). Heidelberg: Springer. pp. 33–96. ISBN 978-3-540-32444-7. OCLC 262692044. 
  25. Galván-López, Edgar; Dignum, Stephen; Poli, Riccardo (2008), O’Neill, Michael; Vanneschi, Leonardo; Gustafson, Steven et al., eds., "The Effects of Constant Neutrality on Performance and Problem Hardness in GP", Genetic Programming, Lecture Notes in Computer Science (Berlin, Heidelberg: Springer) 4971: pp. 312–324, doi:10.1007/978-3-540-78671-9_27, ISBN 978-3-540-78670-2, http://link.springer.com/10.1007/978-3-540-78671-9_27, retrieved 2023-01-21 
  26. Galván-López, Edgar; Poli, Riccardo; Kattan, Ahmed; O’Neill, Michael; Brabazon, Anthony (2011). "Neutrality in evolutionary algorithms… What do we know?" (in en). Evolving Systems 2 (3): 145–163. doi:10.1007/s12530-011-9030-5. ISSN 1868-6478. http://link.springer.com/10.1007/s12530-011-9030-5. 
  27. Knowles, Joshua D.; Watson, Richard A. (2002), Guervós, Juan Julián Merelo; Adamidis, Panagiotis; Beyer, Hans-Georg et al., eds., "On the Utility of Redundant Encodings in Mutation-Based Evolutionary Search", Parallel Problem Solving from Nature — PPSN VII (Berlin, Heidelberg: Springer) 2439: pp. 88–98, doi:10.1007/3-540-45712-7_9, ISBN 978-3-540-44139-7, http://link.springer.com/10.1007/3-540-45712-7_9, retrieved 2023-01-21 
  28. Eiben, A.E.; Smith, J.E. (2015). "Hybridisation During Genotype to Phenotype Mapping" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 177–178. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. http://link.springer.com/10.1007/978-3-662-44874-8. 
  29. Hart, Emma; Ross, Peter; Nelson, Jeremy (1998). "Solving a Real-World Problem Using an Evolving Heuristically Driven Schedule Builder" (in en). Evolutionary Computation 6 (1): 61–80. doi:10.1162/evco.1998.6.1.61. ISSN 1063-6560. PMID 10021741. https://direct.mit.edu/evco/article/6/1/61-80/816. 
  30. Eiben, A.E.; Smith, J.E. (2015). "Permutation Representation" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 67–74. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. http://link.springer.com/10.1007/978-3-662-44874-8. 
  31. Larrañaga, P.; Kuijpers, C.M.H.; Murga, R.H.; Inza, I.; Dizdarevic, S. (1999). "Genetic Algorithms for the Travelling Salesman Problem: A Review of Representations and Operators". Artificial Intelligence Review 13 (2): 129–170. doi:10.1023/A:1006529012972. http://link.springer.com/10.1023/A:1006529012972. 
  32. Bruns, Ralf (1997-01-01). "Evolutionary computation approaches for scheduling". in Baeck, Thomas (in en). Handbook of Evolutionary Computation. CRC Press. doi:10.1201/9780367802486. ISBN 978-0-367-80248-6. https://www.taylorfrancis.com/books/9781420050387. 
  33. 33.0 33.1 Jakob, Wilfried; Strack, Sylvia; Quinte, Alexander; Bengel, Günther; Stucky, Karl-Uwe; Süß, Wolfgang (2013-04-22). "Fast Rescheduling of Multiple Workflows to Constrained Heterogeneous Resources Using Multi-Criteria Memetic Computing" (in en). Algorithms 6 (2): 245–277. doi:10.3390/a6020245. ISSN 1999-4893. 
  34. Brucker, Peter (2007) (in en). Scheduling Algorithms. Berlin, Heidelberg: Springer. doi:10.1007/978-3-540-69516-5. ISBN 978-3-540-69515-8. http://link.springer.com/10.1007/978-3-540-69516-5. 
  35. Sakellariou, Rizos; Zhao, Henan; Tsiakkouri, Eleni; Dikaiakos, Marios D. (2007), Gorlatch, Sergei; Danelutto, Marco, eds., "Scheduling Workflows with Budget Constraints" (in en), Integrated Research in GRID Computing (Boston, MA: Springer US): pp. 189–202, doi:10.1007/978-0-387-47658-2_14, ISBN 978-0-387-47656-8, http://link.springer.com/10.1007/978-0-387-47658-2_14, retrieved 2023-01-20 
  36. Fujita, Kikuo; Akagi, Shinsuke; Hirokawa, Noriyasu (1993-09-19). "Hybrid Approach for Optimal Nesting Using a Genetic Algorithm and a Local Minimization Algorithm". Proceedings of the ASME 1993 Design Technical Conferences. 19th Design Automation Conference: Volume 1 — Mechanical System Dynamics; Concurrent and Robust Design; Design for Assembly and Manufacture; Genetic Algorithms in Design and Structural Optimization. Albuquerque, New Mexico, USA: American Society of Mechanical Engineers. pp. 477–484. doi:10.1115/DETC1993-0337. ISBN 978-0-7918-1181-8. https://asmedigitalcollection.asme.org/IDETC-CIE/proceedings/DETC93/11818/477/1104871. 
  37. Jakob, Wilfried (2021), "Layout Planning as an Example for Smart Handling of Complex Constraints", Applying Evolutionary Algorithms Successfully - A Guide Gained from Real-world Applications., KIT Scientific Working Papers, vol.170 (Karlsruhe: KIT Scientific Publishing): pp. 12–14, doi:10.5445/IR/1000135763, https://publikationen.bibliothek.kit.edu/1000135763/121278298