Chromosome (genetic algorithm)

Short description: Set of parameters for a genetic or evolutionary algorithm

In genetic algorithms (GA), or more generally, evolutionary algorithms (EA), a chromosome (also sometimes called a genotype) is a set of parameters that defines a proposed solution to the problem that the evolutionary algorithm is trying to solve. The set of all solutions, also called individuals according to the biological model, is known as the population.[1][2] The genome of an individual consists of one or, more rarely, several[3][4] chromosomes and corresponds to the genetic representation of the task to be solved. A chromosome is composed of a set of genes, where a gene consists of one or more semantically connected parameters, which are often also called decision variables. They determine one or more phenotypic characteristics of the individual or at least have an influence on them.[2] In the basic form of genetic algorithms, the chromosome is represented as a binary string,[5] while in later variants[6][7] and in EAs in general, a wide variety of other data structures are used.[8][9][10]

Chromosome design

When creating the genetic representation of a task, it is determined which decision variables and other degrees of freedom of the task should be improved by the EA and possible additional heuristics, and what the genotype-phenotype mapping should look like. The design of a chromosome translates these considerations into concrete data structures for which an EA then has to be selected, configured, extended, or, in the worst case, created. Finding a suitable representation of the problem domain for a chromosome is an important consideration, as a good representation will make the search easier by limiting the search space; conversely, a poorer representation will allow a larger search space.[11] In this context, suitable mutation and crossover operators[2] must also be found or newly defined to fit the chosen chromosome design. An important requirement for these operators is that they not only allow all points in the search space to be reached in principle, but also make this as easy as possible.[12][13]

The following requirements must be met by a well-suited chromosome:

  • It must allow all admissible points in the search space to be reached.
  • It should cover only the search space and no additional areas, so that there is no redundancy or as little redundancy as possible.
  • It should observe strong causality: small changes in the chromosome should only lead to small changes in the phenotype.[14] This is also called locality of the relationship between search and problem space.
  • It should exclude prohibited regions in the search space completely or as far as possible.

While the first requirement is indispensable, the remaining ones can usually only be fulfilled as far as possible, depending on the application and the EA used. It should be noted, however, that the more completely they are fulfilled, the better the evolutionary search is supported and the more it may be accelerated.
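The strong causality requirement can be made concrete with a small experiment. The following is a minimal sketch, assuming a plain unsigned binary coding with the most significant bit first; the function names are illustrative and not taken from any library. It flips each bit of a chromosome once and records how far the decoded value moves.

```python
def to_int(bits):
    """Decode a list of 0/1 values (most significant bit first) as an unsigned integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def single_bit_flip_effects(bits):
    """Return the absolute phenotype change caused by flipping each bit in turn."""
    base = to_int(bits)
    effects = []
    for i in range(len(bits)):
        flipped = bits[:i] + [1 - bits[i]] + bits[i + 1:]
        effects.append(abs(to_int(flipped) - base))
    return effects


# 0 1 0 1 1 0 encodes the value 22.
print(single_bit_flip_effects([0, 1, 0, 1, 1, 0]))  # → [32, 16, 8, 4, 2, 1]
```

Every mutation here is the smallest possible change to the genotype, yet the change in the phenotype ranges from 1 to 32. In this sense a plain binary coding only partially satisfies the locality requirement.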

Examples of chromosomes

Chromosomes for binary codings

In their classical form, GAs use bit strings and map the decision variables to be optimized onto them. An example for one boolean and three integer decision variables with the value ranges [math]\displaystyle{ 0 \leq D_1 \leq 60 }[/math], [math]\displaystyle{ 28 \leq D_2 \leq 30 }[/math] and [math]\displaystyle{ -12 \leq D_3 \leq 14 }[/math] may illustrate this:

Example representation of four decision variables in a bitstring
decision variable: | [math]\displaystyle{ D_1 = 22 }[/math] | [math]\displaystyle{ D_2 = 29 }[/math] | [math]\displaystyle{ D_3 = -4 }[/math] | [math]\displaystyle{ D_4 = 0 }[/math]
bits: | 0 1 0 1 1 0 | 1 1 1 0 1 | 1 1 1 0 0 | 0
position: | 17 16 15 14 13 12 | 11 10 9 8 7 | 6 5 4 3 2 | 1

Note that the negative number here is given in two's complement. This straightforward representation uses five bits to represent the three values of [math]\displaystyle{ D_2 }[/math], although two bits would suffice. This is a significant redundancy. An improved alternative, in which 28 is added during the genotype-phenotype mapping, could look like this:

Example of an improved representation of the four decision variables
decision variable: | [math]\displaystyle{ D_1 = 22 }[/math] | [math]\displaystyle{ D'_2 = 1 }[/math] | [math]\displaystyle{ D_3 = -4 }[/math] | [math]\displaystyle{ D_4 = 0 }[/math]
bits: | 0 1 0 1 1 0 | 0 1 | 1 1 1 0 0 | 0
position: | 14 13 12 11 10 9 | 8 7 | 6 5 4 3 2 | 1

with [math]\displaystyle{ D_2 = 28 + D'_2 = 29 }[/math].
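The genotype-phenotype mapping for the improved representation can be sketched in a few lines. This is only an illustration under stated assumptions: the layout table, the function names and the dictionary result are choices made for this example, with field widths of 6, 2, 5 and 1 bits read from the most significant position downwards.

```python
def bits_to_int(bits, signed=False):
    """Decode 0/1 values (most significant bit first); two's complement if signed."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    if signed and bits[0] == 1:
        value -= 1 << len(bits)  # subtract 2^width for negative numbers
    return value


# (name, width in bits, signed?, offset added during decoding)
LAYOUT = [
    ("D1", 6, False, 0),
    ("D2", 2, False, 28),  # stored as D'_2, decoded as 28 + D'_2
    ("D3", 5, True, 0),
    ("D4", 1, False, 0),
]


def decode(chromosome, layout=LAYOUT):
    """Map a bit-list chromosome to a dictionary of decision variable values."""
    phenotype, pos = {}, 0
    for name, width, signed, offset in layout:
        phenotype[name] = bits_to_int(chromosome[pos:pos + width], signed) + offset
        pos += width
    return phenotype


improved = [0, 1, 0, 1, 1, 0,  0, 1,  1, 1, 1, 0, 0,  0]
print(decode(improved))  # → {'D1': 22, 'D2': 29, 'D3': -4, 'D4': 0}
```

Two bits can encode four values, so one bit pattern of the second field (decoding to 31) still lies outside the permitted range of 28 to 30. The same applies to the other fields, whose bit widths also cover more values than their ranges allow. Such patterns have to be repaired or penalized by the EA.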

Chromosomes with real-valued or integer genes

For the processing of tasks with real-valued or mixed-integer decision variables, EAs such as the evolution strategy[15] or the real-coded GAs[16][17][18] are suited. In the case of mixed-integer values, rounding is often used, but this represents some violation of the redundancy requirement. If the necessary precisions of the real values can be reasonably narrowed down, this violation can be remedied by using integer-coded GAs.[19][20] For this purpose, the valid digits of real values are mapped to integers by multiplication with a suitable factor. For example, 12.380 becomes the integer 12380 by multiplying by 1000. This must of course be taken into account in genotype-phenotype mapping for evaluation and result presentation. A common form is a chromosome consisting of a list or an array of integer or real values.
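The scaling described above might look as follows. This is a minimal sketch, assuming three decimal places of precision for every gene; the constant and function names are hypothetical.

```python
SCALE = 1000  # assumed precision: three decimal places


def to_genes(values, scale=SCALE):
    """Encode real values as integer genes by scaling and rounding."""
    return [round(v * scale) for v in values]


def to_phenotype(genes, scale=SCALE):
    """Decode integer genes back to real values for evaluation and result presentation."""
    return [g / scale for g in genes]


genes = to_genes([12.380, -0.125])
print(genes)                # integer chromosome handled by the EA
print(to_phenotype(genes))  # real values seen by the fitness function
```

Because the EA only ever manipulates the integers, every distinct chromosome corresponds to a distinct value at the chosen precision, which avoids the redundancy introduced by rounding real-valued genes.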

Chromosomes for permutations

Combinatorial problems are mainly concerned with finding an optimal sequence of a set of elementary items. As an example, consider the problem of the traveling salesman who wants to visit a given number of cities exactly once on the shortest possible tour. The simplest and most obvious mapping onto a chromosome is to number the cities consecutively, to interpret a resulting sequence as permutation and to store it directly in a chromosome, where one gene corresponds to the ordinal number of a city.[21] Then, however, the variation operators may only change the gene order and not remove or duplicate any genes.[22] The chromosome thus contains the path of a possible tour to the cities. As an example the sequence [math]\displaystyle{ 3,5,7,1,4,2,9,6,8 }[/math] of nine cities may serve, to which the following chromosome corresponds:

3 5 7 1 4 2 9 6 8

In addition to this encoding, which is frequently called path representation, there are several other ways of representing a permutation, for example the ordinal representation or the matrix representation.[22][23]
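The restriction that variation operators may only reorder genes can be illustrated with a swap mutation on the path representation. The sketch below is one simple possibility among many; the function names are illustrative, and the validity check is included only to show what an unsuitable operator would break.

```python
import random


def is_valid_tour(chromosome, n_cities):
    """A valid tour contains every city number from 1 to n_cities exactly once."""
    return sorted(chromosome) == list(range(1, n_cities + 1))


def swap_mutation(chromosome, rng):
    """Exchange two randomly chosen genes; no city is removed or duplicated."""
    child = list(chromosome)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


tour = [3, 5, 7, 1, 4, 2, 9, 6, 8]
mutant = swap_mutation(tour, random.Random(0))
print(mutant, is_valid_tour(mutant, 9))

# Overwriting a single gene with another city number, as a value-based
# mutation would do, duplicates one city and drops another:
print(is_valid_tour([3, 3, 7, 1, 4, 2, 9, 6, 8], 9))
```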

Chromosomes for co-evolution

When a genetic representation contains, in addition to the decision variables, additional information that influences evolution and/or the mapping of the genotype to the phenotype and is itself subject to evolution, this is referred to as co-evolution. A typical example is the evolution strategy (ES), which includes one or more mutation step sizes as strategy parameters in each chromosome.[15] Another example is an additional gene to control a selection heuristic for resource allocation in a scheduling task.[24]

This approach is based on the assumption that good solutions are based on an appropriate selection of strategy parameters or on control genes that influence the genotype-phenotype mapping. The success of the ES supports this assumption.
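A chromosome that carries its own mutation step size could be sketched as follows. The example assumes a single step size per individual and a log-normal update, which is a commonly described scheme for evolution strategies; the class name, the function name and the learning rate of 1/sqrt(n) are assumptions made for illustration only.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Individual:
    x: List[float]  # decision variables
    sigma: float    # strategy parameter: mutation step size, evolved along with x


def mutate(parent, rng):
    """Mutate the step size first, then use the new step size to mutate x."""
    tau = 1.0 / math.sqrt(len(parent.x))  # assumed learning rate
    sigma = parent.sigma * math.exp(tau * rng.gauss(0.0, 1.0))
    x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in parent.x]
    return Individual(x, sigma)


child = mutate(Individual([1.0, 2.0, 3.0], sigma=0.5), random.Random(1))
print(child)
```

Since the multiplicative update keeps the step size positive and selection acts on the whole chromosome, step sizes that tend to produce good offspring are inherited together with the decision variables.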

Chromosomes for complex representations

The chromosomes presented above are well suited for processing tasks of continuous, mixed-integer, pure-integer or combinatorial optimization. For a combination of these optimization areas, on the other hand, it becomes increasingly difficult to map them to simple strings of values, depending on the task. The following extension of the gene concept is proposed by the EA GLEAM (General Learning Evolutionary Algorithm and Method) for this purpose:[25] A gene is considered to be the description of an element or elementary trait of the phenotype, which may have multiple parameters. For this purpose, gene types are defined that contain as many parameters of the appropriate data type as are required to describe the particular element of the phenotype. A chromosome now consists of genes as data objects of the gene types, whereby, depending on the application, each gene type occurs exactly once as a gene or can be contained in the chromosome any number of times. The latter leads to chromosomes of dynamic length, as they are required for some problems.[26][27] The gene type definitions also contain information on the permissible value ranges of the gene parameters, which are observed during chromosome generation and by corresponding mutations, so they cannot lead to lethal mutations. For tasks with a combinatorial part, there are suitable genetic operators that can move or reposition genes as a whole, i.e. with their parameters.

Three exemplary genes matching the adjacent gene type definitions in a chromosome organized as a list

A scheduling task is used as an illustration, in which workflows are to be scheduled that require different numbers of heterogeneous resources. A workflow specifies which work steps can be processed in parallel and which have to be executed one after the other. In this context, heterogeneous resources mean different processing times at different costs in addition to different processing capabilities.[24] Each scheduling operation therefore requires one or more parameters that determine the resource selection, where the value ranges of the parameters depend on the number of alternative resources available for each work step. A suitable chromosome provides one gene type per work step and in this case one corresponding gene, which has one parameter for each required resource. The order of genes determines the order of scheduling operations and, therefore, the precedence in case of allocation conflicts. The exemplary gene type definition of work step 15 with two resources, for which there are four and seven alternatives respectively, would then look as shown in the left image. Since the parameters represent indices in lists of available resources for the respective work step, their value range starts at 0. The right image shows an example of three genes of a chromosome belonging to the gene types in list representation.
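The gene type concept for this scheduling example can be sketched with simple data classes. This is not GLEAM's actual data model, only an illustration under stated assumptions: the class and function names are hypothetical, and the gene types for work steps 7 and 22 and their value ranges are invented. Only work step 15, with four and seven resource alternatives, is taken from the text.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GeneType:
    work_step: int
    ranges: Tuple[Tuple[int, int], ...]  # inclusive (low, high) per parameter


@dataclass
class Gene:
    gene_type: GeneType
    params: List[int]  # indices into the lists of alternative resources


def random_gene(gene_type, rng):
    """Create a gene whose parameters respect the permitted value ranges."""
    return Gene(gene_type, [rng.randint(lo, hi) for lo, hi in gene_type.ranges])


def mutate_param(gene, rng):
    """Redraw one parameter within its range, so the mutation cannot be lethal."""
    i = rng.randrange(len(gene.params))
    lo, hi = gene.gene_type.ranges[i]
    params = list(gene.params)
    params[i] = rng.randint(lo, hi)
    return Gene(gene.gene_type, params)


def move_gene(chromosome, src, dst):
    """Reposition a whole gene, including its parameters, within the list."""
    child = list(chromosome)
    child.insert(dst, child.pop(src))
    return child


step15 = GeneType(15, ((0, 3), (0, 6)))  # four and seven alternatives
gene_types = [GeneType(7, ((0, 2),)), step15, GeneType(22, ((0, 1), (0, 4), (0, 2)))]

rng = random.Random(42)
chromosome = [random_gene(gt, rng) for gt in gene_types]
chromosome = move_gene(chromosome, 0, 2)  # changes the scheduling order
print([(g.gene_type.work_step, g.params) for g in chromosome])
```

Keeping the value ranges in the gene type lets both chromosome generation and mutation stay within the permitted ranges, while moving genes as a whole changes only the order of the scheduling operations.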

Syntax tree of a formula example

Chromosomes for tree representations

Tree representations in a chromosome are used by genetic programming, an EA type for generating computer programs or circuits.[10] The trees correspond to the syntax trees generated by a compiler as internal representation when translating a computer program. The adjacent figure shows the syntax tree of a mathematical expression as an example. Mutation operators can rearrange, change or delete subtrees depending on the represented syntax structure. Recombination is performed by exchanging suitable subtrees.[28]
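Recombination by exchanging subtrees can be sketched with nested tuples as syntax trees. The encoding, the operator set and the two example expressions below are assumptions chosen for illustration; they do not reproduce the formula of the figure.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node, env):
    """Evaluate a tree: tuples are (op, left, right), strings are variables."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]
    return node  # numeric constant


def subtrees(node, path=()):
    """Yield (path, subtree) pairs; a path is a tuple of child indices."""
    yield path, node
    if isinstance(node, tuple):
        for i in (1, 2):
            yield from subtrees(node[i], path + (i,))


def replace(node, path, new):
    """Return a copy of the tree with the subtree at path replaced by new."""
    if not path:
        return new
    parts = list(node)
    parts[path[0]] = replace(node[path[0]], path[1:], new)
    return tuple(parts)


def crossover(a, b, rng):
    """Exchange one randomly chosen subtree between two parent trees."""
    path_a, sub_a = rng.choice(list(subtrees(a)))
    path_b, sub_b = rng.choice(list(subtrees(b)))
    return replace(a, path_a, sub_b), replace(b, path_b, sub_a)


parent_a = ("+", ("*", "x", 2), 3)        # x * 2 + 3
parent_b = ("-", "y", ("*", "x", "x"))    # y - x * x
child_a, child_b = crossover(parent_a, parent_b, random.Random(3))
print(child_a, child_b)
print(evaluate(child_a, {"x": 4, "y": 1}), evaluate(child_b, {"x": 4, "y": 1}))
```

Because every subtree of such an expression is itself a valid expression, any exchange yields syntactically correct offspring. Representations with several node types would need an additional check that the exchanged subtrees are compatible.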

References

  1. "Introduction to genetic algorithms: IV. Genetic Algorithm". http://www.obitko.com/tutorials/genetic-algorithms/ga-basic-description.php. Retrieved 12 August 2015. 
  2. Eiben, A.E.; Smith, J.E. (2015). "Components of Evolutionary Algorithms" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 28–34. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. https://link.springer.com/10.1007/978-3-662-44874-8.
  3. Baine, Nicholas (2008), "A simple multi-chromosome genetic algorithm optimization of a Proportional-plus-Derivative Fuzzy Logic Controller", NAFIPS 2008 - 2008 Annual Meeting of the North American Fuzzy Information Processing Society (IEEE): pp. 1–5, doi:10.1109/NAFIPS.2008.4531273, ISBN 978-1-4244-2351-4, https://ieeexplore.ieee.org/document/4531273 
  4. Peng, Jin; Chu, Zhang Shu (2010), "A Hybrid Multi-chromosome Genetic Algorithm for the Cutting Stock Problem", 3rd International Conference on Information Management, Innovation Management and Industrial Engineering (IEEE): pp. 508–511, doi:10.1109/ICIII.2010.128, ISBN 978-1-4244-8829-2, https://ieeexplore.ieee.org/document/5694457 
  5. Holland, John H. (1992) (in en). Adaptation in natural and artificial systems. Cambridge, Mass.: MIT Press. ISBN 0-585-03844-9. OCLC 42854623. 
  6. Janikow, C.Z.; Michalewicz, Z. (1991), Belew, Richard K.; Booker, Lashon B., eds., "An Experimental Comparison of Binary and Floating Point Representations in Genetic Algorithms", Proceedings of the Fourth International Conference on Genetic Algorithms (San Francisco, CA: Morgan Kaufmann Publishers): pp. 31–36, ISBN 1-55860-208-9, http://www.cs.umsl.edu/~janikow/publications/1991/GAbin/text.pdf 
  7. Whitley, Darrell (June 1994). "A genetic algorithm tutorial". Statistics and Computing 4 (2). doi:10.1007/BF00175354. 
  8. Whitley, Darrell (2001). "An overview of evolutionary algorithms: practical issues and common pitfalls" (in en). Information and Software Technology 43 (14): 817–831. doi:10.1016/S0950-5849(01)00188-4. https://linkinghub.elsevier.com/retrieve/pii/S0950584901001884. 
  9. Bäck, Thomas; Hoffmeister, Frank; Schwefel, Hans-Paul (1991), Belew, Richard K.; Booker, Lashon B., eds., "A Survey of Evolution Strategies", Proceedings of the Fourth International Conference on Genetic Algorithms (San Francisco, CA: Morgan Kaufmann Publishers): pp. 2–9, ISBN 1-55860-208-9, https://www.academia.edu/27025389 
  10. Koza, John R. (1992). Genetic programming : on the programming of computers by means of natural selection. Cambridge, Mass.: MIT Press. ISBN 0-262-11170-5. OCLC 26263956. https://www.worldcat.org/oclc/26263956.
  11. "Genetic algorithms". http://www.cse.unsw.edu.au/~billw/cs9414/notes/ml/05ga/05ga.html. Retrieved 12 August 2015. 
  12. Rothlauf, Franz (2002). Representations for Genetic and Evolutionary Algorithms. Studies in Fuzziness and Soft Computing. 104. Heidelberg: Physica-Verlag HD. p. 31. doi:10.1007/978-3-642-88094-0. ISBN 978-3-642-88096-4. http://link.springer.com/10.1007/978-3-642-88094-0.
  13. Eiben, A.E.; Smith, J.E. (2015). "Representation and the Roles of Variation Operators" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 49–51. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. https://link.springer.com/10.1007/978-3-662-44874-8. 
  14. Galván-López, Edgar; McDermott, James; O'Neill, Michael; Brabazon, Anthony (2010-07-07). "Towards an understanding of locality in genetic programming" (in en). Proceedings of the 12th annual conference on Genetic and evolutionary computation. Portland Oregon USA: ACM. pp. 901–908. doi:10.1145/1830483.1830646. ISBN 978-1-4503-0072-8. https://dl.acm.org/doi/10.1145/1830483.1830646. 
  15. Schwefel, Hans-Paul (1995). Evolution and optimum seeking. New York: John Wiley & Sons. ISBN 0-471-57148-2. OCLC 30701094. https://www.researchgate.net/publication/220690578_Evolution_and_Optimum_Seeking.
  16. Eshelman, Larry J.; Schaffer, J. David (1993), "Real-Coded Genetic Algorithms and Interval-Schemata" (in en), Foundations of Genetic Algorithms (Elsevier) 2: pp. 187–202, doi:10.1016/b978-0-08-094832-4.50018-0, ISBN 978-0-08-094832-4, https://linkinghub.elsevier.com/retrieve/pii/B9780080948324500180, retrieved 2023-01-26 
  17. Michalewicz, Zbigniew (1996) (in en). Genetic Algorithms + Data Structures = Evolution Programs. Third, revised and extended edition. Berlin, Heidelberg: Springer. ISBN 978-3-662-03315-9. OCLC 851375253. 
  18. Deep, Kusum; Singh, Krishna Pratap; Kansal, M.L.; Mohan, C. (June 2009). "A real coded genetic algorithm for solving integer and mixed integer optimization problems" (in en). Applied Mathematics and Computation 212 (2): 505–518. doi:10.1016/j.amc.2009.02.044. https://linkinghub.elsevier.com/retrieve/pii/S0096300309001830. 
  19. Wang, Fuchang; Cao, Huirong; Qian, Xiaoshi (2011), Liu, Baoxiang; Chai, Chunlai, eds., "Decimal-Integer-Coded Genetic Algorithm for Trimmed Estimator of the Multiple Linear Errors in Variables Model", Information Computing and Applications, LNCS 7030 (Berlin, Heidelberg: Springer): pp. 359–366, doi:10.1007/978-3-642-25255-6_46, ISBN 978-3-642-25254-9, http://link.springer.com/10.1007/978-3-642-25255-6_46, retrieved 2023-01-23 
  20. Cheng, Xueli; An, Linchao; Zhang, Zhenhua (2019). "Integer Encoding Genetic Algorithm for Optimizing Redundancy Allocation of Series-parallel Systems". Journal of Engineering Science and Technology Review 12 (1): 126–136. doi:10.25103/JESTR.121.15. 
  21. Eiben, A.E.; Smith, J.E. (2015). "Permutation Representation" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 67–74. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. https://link.springer.com/10.1007/978-3-662-44874-8. 
  22. Larrañaga, P.; Kuijpers, C.M.H.; Murga, R.H.; Inza, I.; Dizdarevic, S. (1999). "Genetic Algorithms for the Travelling Salesman Problem: A Review of Representations and Operators". Artificial Intelligence Review 13 (2): 129–170. doi:10.1023/A:1006529012972. http://link.springer.com/10.1023/A:1006529012972.
  23. Whitley, Darrell (2000). "Permutations". in Fogel, David B. (in en). Evolutionary computation. Vol. 1, Basic algorithms and operators. Bristol: Institute of Physics Pub. pp. 139–150. ISBN 0-585-30560-9. OCLC 45730387. 
  24. Jakob, Wilfried; Strack, Sylvia; Quinte, Alexander; Bengel, Günther; Stucky, Karl-Uwe; Süß, Wolfgang (2013-04-22). "Fast Rescheduling of Multiple Workflows to Constrained Heterogeneous Resources Using Multi-Criteria Memetic Computing" (in en). Algorithms 6 (2): 245–277. doi:10.3390/a6020245. ISSN 1999-4893.
  25. Blume, Christian; Jakob, Wilfried (2002), "GLEAM - An Evolutionary Algorithm for Planning and Control Based on Evolution Strategy", Conf. Proc. of Genetic and Evolutionary Computation Conference (GECCO 2002) Late Breaking Papers: pp. 31–38, https://publikationen.bibliothek.kit.edu/170053025/3814288, retrieved 2023-01-01 
  26. Pawar, Sunil Nilkanth; Bichkar, Rajankumar Sadashivrao (June 2015). "Genetic algorithm with variable length chromosomes for network intrusion detection" (in en). International Journal of Automation and Computing 12 (3): 337–342. doi:10.1007/s11633-014-0870-x. ISSN 1476-8186. 
  27. Blume, Christian (2000), Cagnoni, Stefano, ed., "Optimized Collision Free Robot Move Statement Generation by the Evolutionary Software GLEAM" (in en), Real-World Applications of Evolutionary Computing, Lecture Notes in Computer Science (Berlin, Heidelberg: Springer Berlin Heidelberg) 1803: pp. 330–341, doi:10.1007/3-540-45561-2_32, ISBN 978-3-540-67353-8, http://link.springer.com/10.1007/3-540-45561-2_32, retrieved 2023-06-25 
  28. Eiben, A.E.; Smith, J.E. (2015). "Tree Representation" (in en). Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 75–78. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. https://link.springer.com/10.1007/978-3-662-44874-8.