Information fluctuation complexity

From HandWiki
Revision as of 02:34, 27 September 2024 by Metamystical (talk | contribs) (Minor wording changes to improve clarity. Fixed a broken reference.)

Information fluctuation complexity is an information-theoretic quantity defined as the fluctuation of information about entropy. It is derivable from fluctuations in the predominance of order and chaos in a dynamic system and has been used as a measure of complexity in many diverse fields. It was introduced in a 1993 paper by Bates and Shepard.[1]

Definition

The information fluctuation complexity of a discrete dynamic system is a function of the probability distribution of its states when it is subject to random external input data. The purpose of driving the system with a rich information source such as a random number generator or a white noise signal is to probe the internal dynamics of the system in much the same way as a frequency-rich impulse is used in signal processing.

If a system has [math]\displaystyle{ N }[/math] possible states and the state probabilities [math]\displaystyle{ p_i }[/math] are known, then its information entropy is

[math]\displaystyle{ \Eta = \sum_{i=1}^N p_i I_i = - \sum_{i=1}^N p_i \log p_i, }[/math]

where [math]\displaystyle{ I_i = -\log p_i }[/math] is the information content of state [math]\displaystyle{ i }[/math].

The information fluctuation complexity of the system is defined as the standard deviation or fluctuation of [math]\displaystyle{ I }[/math] about its mean [math]\displaystyle{ \Eta }[/math]:

[math]\displaystyle{ \sigma_I = \sqrt{\sum_{i=1}^N p_i(I_i - \Eta)^2} = \sqrt{\sum_{i=1}^N p_iI_i^2 - \Eta^2} }[/math]

or

[math]\displaystyle{ \sigma_I = \sqrt{\sum_{i=1}^N p_i \log^2 p_i - \Biggl(\sum_{i=1}^N p_i \log p_i \Biggr)^2}. }[/math]

The fluctuation of state information [math]\displaystyle{ \sigma_I }[/math] is zero in a maximally disordered system with all [math]\displaystyle{ p_i = \frac{1}{N} }[/math]; the system simply mimics its random inputs. [math]\displaystyle{ \sigma_I }[/math] is also zero if the system is perfectly ordered with only one fixed state [math]\displaystyle{ (p_1 = 1) }[/math], regardless of the inputs. [math]\displaystyle{ \sigma_I }[/math] is non-zero between these two extremes with a mixture of higher-probability states and lower-probability states populating state space.
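The definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `entropy_and_fluctuation` and the three test distributions are assumptions made for the example, and logarithms are taken to base 2 so that results are in bits.

```python
import math

def entropy_and_fluctuation(probs):
    """Return (H, sigma_I) in bits; zero-probability states are skipped."""
    # Pair each nonzero probability with its information content I_i = -log2 p_i.
    pairs = [(p, -math.log2(p)) for p in probs if p > 0]
    h = sum(p * i for p, i in pairs)               # entropy: mean of I
    var = sum(p * (i - h) ** 2 for p, i in pairs)  # variance of I about H
    return h, math.sqrt(max(var, 0.0))

# Hypothetical example distributions (not from the source).
uniform = [0.25] * 4                 # maximally disordered
ordered = [1.0, 0.0, 0.0, 0.0]       # one fixed state
mixed = [0.5, 0.25, 0.125, 0.125]    # common and rare states together

for name, dist in [("uniform", uniform), ("ordered", ordered), ("mixed", mixed)]:
    h, sigma = entropy_and_fluctuation(dist)
    print(f"{name}: H = {h:.3f} bits, sigma_I = {sigma:.3f} bits")
```

The uniform and ordered cases give zero fluctuation, while the mixed case does not, in line with the two extremes described above.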

Fluctuation of information allows for memory and computation

As a complex dynamic system evolves over time, how it transitions between states depends on external stimuli in an irregular way. At times it may be more sensitive to external stimuli (unstable) and at other times less sensitive (stable). When a given state has multiple possible next-states, external information determines which one will be next and the system gains this information by following a particular trajectory in state space. However, if several different states all lead to the same next-state, then upon entering the next-state the system loses information about which state preceded it. Thus, a complex system exhibits alternating information gain and loss as it evolves over time. This alternation or fluctuation of information is equivalent to remembering and forgetting — temporary information storage or memory — an essential feature of non-trivial computation.

The gain or loss of information associated with transitions between states can be related to state information. The net information gain of a transition from state [math]\displaystyle{ i }[/math] to state [math]\displaystyle{ j }[/math] is the information gained when leaving state [math]\displaystyle{ i }[/math] less the information lost when entering state [math]\displaystyle{ j }[/math]:

[math]\displaystyle{ \Gamma_{ij} = -\log p_{i \rightarrow j} + \log p_{i \leftarrow j}. }[/math]

Here [math]\displaystyle{ p_{i \rightarrow j} }[/math] is the forward conditional probability that if the present state is [math]\displaystyle{ i }[/math] then the next state will be [math]\displaystyle{ j }[/math] and [math]\displaystyle{ p_{i \leftarrow j} }[/math] is the reverse conditional probability that if the present state is [math]\displaystyle{ j }[/math] then the previous state was [math]\displaystyle{ i }[/math]. The conditional probabilities are related to the transition probability [math]\displaystyle{ p_{ij} }[/math], the probability that a transition from state [math]\displaystyle{ i }[/math] to state [math]\displaystyle{ j }[/math] occurs, by:

[math]\displaystyle{ p_{ij} = p_i p_{i \rightarrow j} = p_{i \leftarrow j} p_j. }[/math]

Eliminating the conditional probabilities:

[math]\displaystyle{ \Gamma_{ij} = -\log (p_{ij}/p_i) + \log (p_{ij}/p_j) = \log p_i - \log p_j = I_j - I_i. }[/math]

Therefore the net information gained by the system as a result of the transition depends only on the increase in state information from the initial to the final state. It can be shown that this is true even for multiple consecutive transitions.[1]
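As a numerical check of this identity, the sketch below evaluates the net information gain directly from forward and reverse conditional probabilities and compares it with the difference in state information. The two-state chain and its probabilities are hypothetical values chosen for illustration, and the helper names `net_gain` and `info` are assumptions.

```python
import math

# Hypothetical two-state system: forward[i][j] is the forward conditional
# probability p_{i->j}. These numbers are illustrative, not from the source.
forward = [[0.9, 0.1],
           [0.5, 0.5]]
# State probabilities consistent with this chain (they satisfy p = p * forward).
p = [5 / 6, 1 / 6]

def info(i):
    """State information I_i = -log2 p_i, in bits."""
    return -math.log2(p[i])

def net_gain(i, j):
    """Gamma_ij computed from the forward and reverse conditional probabilities."""
    p_ij = p[i] * forward[i][j]   # transition probability p_ij
    reverse = p_ij / p[j]         # reverse conditional probability p_{i<-j}
    return -math.log2(forward[i][j]) + math.log2(reverse)

# Largest discrepancy between Gamma_ij and I_j - I_i over all transitions.
max_gap = max(abs(net_gain(i, j) - (info(j) - info(i)))
              for i in range(2) for j in range(2))
print(f"largest |Gamma_ij - (I_j - I_i)| = {max_gap:.2e}")
```

The discrepancy should vanish up to floating-point rounding, since the transition probability cancels out of the expression.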

[math]\displaystyle{ \Gamma = \Delta I }[/math] is reminiscent of the relation between force and potential energy. [math]\displaystyle{ I }[/math] is like potential [math]\displaystyle{ \Phi }[/math] and [math]\displaystyle{ \Gamma }[/math] is like the applied force [math]\displaystyle{ \mathbf{F} }[/math] in [math]\displaystyle{ \mathbf{F}={\nabla \Phi} }[/math]. External information "pushes" a system "uphill" to a state of higher information potential to accomplish information storage, much like pushing a mass uphill to a state of higher gravitational potential stores energy. The amount of energy stored depends only on the final height, not on the path up the hill. Similarly, the amount of information storage does not depend on the transition path between an initial common state and a final rare state. Once a system reaches a rare state with high information potential, it may then "fall" back to a common state, losing previously stored information.

It may be useful to compute the standard deviation of [math]\displaystyle{ \Gamma }[/math] about its mean (which is zero), namely the fluctuation of net information gain [math]\displaystyle{ \sigma_\Gamma }[/math],[1] but [math]\displaystyle{ \sigma_I }[/math] takes into account multi-transition memory loops in state space and therefore should be more indicative of the computational power of a system. Moreover, [math]\displaystyle{ \sigma_I }[/math] is easier to apply because there can be many more transitions than states.
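One way to see why the mean of [math]\displaystyle{ \Gamma }[/math] is zero is sketched here. It assumes the state probabilities are stationary, so that [math]\displaystyle{ \sum_i p_{ij} = p_j }[/math] and [math]\displaystyle{ \sum_j p_{ij} = p_i }[/math]; the angle-bracket notation for the mean is introduced only for this sketch.

[math]\displaystyle{ \langle \Gamma \rangle = \sum_{i,j} p_{ij} (I_j - I_i) = \sum_j p_j I_j - \sum_i p_i I_i = \Eta - \Eta = 0. }[/math]

Under the same assumption, the fluctuation of net information gain reduces to

[math]\displaystyle{ \sigma_\Gamma = \sqrt{\sum_{i,j} p_{ij} (I_j - I_i)^2}, }[/math]

where the sum runs over transitions rather than states.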

Chaos and order

A dynamic system that is sensitive to external information (unstable) exhibits chaotic behavior whereas one that is insensitive to external information (stable) exhibits orderly behavior. A complex system exhibits both behaviors, fluctuating between them in dynamic balance when subject to a rich information source. The degree of fluctuation is quantified by [math]\displaystyle{ \sigma_I }[/math]; it captures the alternation in the predominance of chaos and order in a complex system as it evolves over time.

Example: rule 110 variant of the elementary cellular automaton[2]

The rule 110 variant of the elementary cellular automaton has been proven to be capable of universal computation. The proof is based on the existence and interactions of cohesive and self-perpetuating cell patterns known as gliders, which are examples of emergent phenomena associated with complex systems and which imply the capability of groups of automaton cells to remember that a glider is passing through them. It is therefore to be expected that there will be memory loops in state space resulting from alternations of information gain and loss, instability and stability, chaos and order.

Consider a 3-cell group of adjacent automaton cells that obey rule 110: end-center-end. The next state of the center cell depends on the present state of itself and the end cells as specified by the rule:

Elementary cellular automaton rule 110
| 3-cell group | 1-1-1 | 1-1-0 | 1-0-1 | 1-0-0 | 0-1-1 | 0-1-0 | 0-0-1 | 0-0-0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| next center cell | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 |

To compute the information fluctuation complexity of this system, attach a driver cell to each end of the 3-cell group to provide random external stimuli like so, driver→end-center-end←driver, such that the rule can be applied to the two end cells. Next determine what the next state will be for each possible present state and for each possible combination of driver cell contents, in order to determine the forward conditional probabilities.

The state diagram of this system is depicted below, with circles representing states and arrows representing transitions between states. The eight possible states of this system, 1-1-1 to 0-0-0 are labeled with the octal equivalent of the 3-bit contents of the 3-cell group: 7 to 0. The transition arrows are labeled with forward conditional probabilities. Notice that there is variability in the divergence and convergence of arrows corresponding to a variability in gain and loss of information originating from the driver cells.

The 3-cell state diagram for the rule 110 elementary cellular automaton showing forward conditional transition probabilities with random stimulation.

The forward conditional probabilities are determined by the proportion of possible driver cell contents that drive a particular transition. For example, for the four possible combinations of two driver cell contents, state 7 leads to states 5, 4, 1 and 0 and therefore [math]\displaystyle{ p_{7 \rightarrow 5} }[/math], [math]\displaystyle{ p_{7 \rightarrow 4} }[/math], [math]\displaystyle{ p_{7 \rightarrow 1} }[/math], and [math]\displaystyle{ p_{7 \rightarrow 0} }[/math] are each ¼ or 25%. Similarly, state 0 leads to states 0, 1, 0 and 1 and therefore [math]\displaystyle{ p_{0 \rightarrow 1} }[/math] and [math]\displaystyle{ p_{0 \rightarrow 0} }[/math] are each ½ or 50%. And so forth.
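The counting procedure described above can be sketched as follows. The function names `next_cell` and `forward_probabilities` are assumptions for this illustration, and states use the same octal labels as the text (7 for 1-1-1 down to 0 for 0-0-0).

```python
from collections import Counter
from itertools import product

RULE = 110

def next_cell(left, center, right):
    """Rule lookup: the neighborhood, read as a 3-bit number, selects a bit of RULE."""
    return (RULE >> (left << 2 | center << 1 | right)) & 1

def forward_probabilities(state):
    """Map next-state -> p_{state -> next} for a 3-cell group with random driver cells."""
    cells = [(state >> 2) & 1, (state >> 1) & 1, state & 1]
    counts = Counter()
    # Enumerate the four equally likely combinations of the two driver cells.
    for drv_left, drv_right in product((0, 1), repeat=2):
        padded = [drv_left, *cells, drv_right]
        nxt = [next_cell(*padded[k:k + 3]) for k in range(3)]
        counts[nxt[0] << 2 | nxt[1] << 1 | nxt[2]] += 1
    return {s: c / 4 for s, c in counts.items()}

for state in range(7, -1, -1):
    print(state, forward_probabilities(state))
```

For states 7 and 0 the output reproduces the probabilities quoted in the text: ¼ each for next-states 5, 4, 1 and 0, and ½ each for next-states 0 and 1.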

The state probabilities are related by

[math]\displaystyle{ p_j = \sum_{i=0}^7 p_i p_{i \rightarrow j} }[/math] and [math]\displaystyle{ \sum_{i=0}^7 p_i = 1. }[/math]

These linear algebraic equations can be solved for the state probabilities, with the following results:[2]

Rule 110 state probabilities for 3-cell automaton with random stimulation
| p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2/17 | 2/17 | 1/34 | 5/34 | 2/17 | 2/17 | 2/17 | 4/17 |
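A minimal sketch of one way to solve these equations exactly is given below. It uses Gauss-Jordan elimination over Python's `fractions.Fraction`, with one balance equation replaced by the normalization condition. The helper names and the choice of solver are assumptions for illustration, not the method used in the cited tutorial.

```python
from fractions import Fraction
from itertools import product

def next_cell(left, center, right, rule=110):
    """Rule lookup: the neighborhood, read as a 3-bit number, selects a bit of the rule."""
    return (rule >> (left << 2 | center << 1 | right)) & 1

def transition_matrix():
    """T[i][j] = p_{i -> j} for the 3-cell group with two random driver cells."""
    T = [[Fraction(0)] * 8 for _ in range(8)]
    for i in range(8):
        cells = [(i >> 2) & 1, (i >> 1) & 1, i & 1]
        for drv_left, drv_right in product((0, 1), repeat=2):
            padded = [drv_left, *cells, drv_right]
            nxt = [next_cell(*padded[k:k + 3]) for k in range(3)]
            T[i][nxt[0] << 2 | nxt[1] << 1 | nxt[2]] += Fraction(1, 4)
    return T

def stationary(T):
    """Solve p_j = sum_i p_i T[i][j] with sum_i p_i = 1, using exact fractions."""
    n = len(T)
    # Augmented rows for j < n-1: sum_i p_i (T[i][j] - delta_ij) = 0.
    A = [[T[i][j] - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
         for j in range(n - 1)]
    # The last balance equation is redundant; use normalization instead.
    A.append([Fraction(1)] * n + [Fraction(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]

p = stationary(transition_matrix())
print([str(x) for x in p])
```

Exact fractions are used so that the result can be compared with the tabulated values without rounding error.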

The information entropy and the complexity can then be computed from the state probabilities:

[math]\displaystyle{ \Eta = - \sum_{i=0}^7 p_i \log_2 p_i = 2.86 \text{ bits}, }[/math]
[math]\displaystyle{ \sigma_I = \sqrt{\sum_{i=0}^7 p_i \log_2^2 p_i - \Eta^2} = 0.56 \text{ bits}. }[/math]
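The two values can be reproduced from the tabulated state probabilities with a short script. The variable names are illustrative assumptions.

```python
import math
from fractions import Fraction

# State probabilities p0..p7 for the 3-cell rule 110 group, as tabulated in the text.
p = [Fraction(2, 17), Fraction(2, 17), Fraction(1, 34), Fraction(5, 34),
     Fraction(2, 17), Fraction(2, 17), Fraction(2, 17), Fraction(4, 17)]

probs = [float(x) for x in p]
info = [-math.log2(x) for x in probs]                   # I_i in bits
H = sum(x * i for x, i in zip(probs, info))             # entropy
second_moment = sum(x * i * i for x, i in zip(probs, info))
sigma_I = math.sqrt(second_moment - H * H)              # fluctuation about H

print(f"H = {H:.2f} bits, sigma_I = {sigma_I:.2f} bits")
```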

Note that the maximum possible entropy for eight states is [math]\displaystyle{ 3 \text{ bits} }[/math], which is the case when all [math]\displaystyle{ p_i = \frac{1}{8} }[/math]. Thus, rule 110 has a relatively high entropy or state utilization of [math]\displaystyle{ 2.86 \text{ bits} }[/math]. However, this does not preclude a considerable fluctuation of state information about entropy and thus a considerable value of the complexity. Maximum entropy, by contrast, would preclude complexity, since [math]\displaystyle{ \sigma_I }[/math] is zero when all states are equally probable.

An alternative method can be used to obtain the state probabilities when the analytical method used above is unfeasible. Simply drive the system at its inputs (the driver cells) with a random source for many generations and observe the state probabilities empirically. When this is done via computer simulation for 10 million generations the results are as follows:[2]

Information variables for the rule 110 elementary cellular automaton
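| number of cells | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [math]\displaystyle{ \Eta }[/math] (bits) | 2.86 | 3.81 | 4.73 | 5.66 | 6.56 | 7.47 | 8.34 | 9.25 | 10.09 | 10.97 | 11.78 |
| [math]\displaystyle{ \sigma_I }[/math] (bits) | 0.56 | 0.65 | 0.72 | 0.73 | 0.79 | 0.81 | 0.89 | 0.90 | 1.00 | 1.01 | 1.15 |
| [math]\displaystyle{ \sigma_I / \Eta }[/math] | 0.20 | 0.17 | 0.15 | 0.13 | 0.12 | 0.11 | 0.11 | 0.10 | 0.10 | 0.09 | 0.10 |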

Since both [math]\displaystyle{ \Eta }[/math] and [math]\displaystyle{ \sigma_I }[/math] increase with system size, their dimensionless ratio [math]\displaystyle{ \sigma_I/\Eta }[/math], the relative information fluctuation complexity, is included to compare systems of different sizes. Notice that the empirical and analytical results agree for the 3-cell automaton and that the relative complexity levels off to about [math]\displaystyle{ 0.10 }[/math] by 10 cells.
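The empirical method can be sketched as a short simulation. This is an illustrative sketch rather than the code behind the table: the function name `simulate`, the seed, and the much smaller number of generations (200,000 instead of 10 million) are assumptions, so the estimates will only approximate the tabulated values.

```python
import math
import random
from collections import Counter

RULE = 110

def next_cell(left, center, right):
    """Rule lookup: the neighborhood, read as a 3-bit number, selects a bit of RULE."""
    return (RULE >> (left << 2 | center << 1 | right)) & 1

def simulate(n_cells, generations, seed=0):
    """Estimate (H, sigma_I) in bits for an n-cell group with random driver cells."""
    rng = random.Random(seed)
    cells = [rng.getrandbits(1) for _ in range(n_cells)]
    counts = Counter()
    for _ in range(generations):
        # Fresh random driver cells on both ends at every generation.
        padded = [rng.getrandbits(1), *cells, rng.getrandbits(1)]
        cells = [next_cell(*padded[k:k + 3]) for k in range(n_cells)]
        counts[tuple(cells)] += 1
    probs = [c / generations for c in counts.values()]
    info = [-math.log2(p) for p in probs]
    h = sum(p * i for p, i in zip(probs, info))
    var = sum(p * (i - h) ** 2 for p, i in zip(probs, info))
    return h, math.sqrt(var)

h, sigma = simulate(n_cells=3, generations=200_000)
print(f"3 cells: H = {h:.2f} bits, sigma_I = {sigma:.2f} bits, ratio = {sigma / h:.2f}")
```

Larger groups can be probed by raising `n_cells`, although the number of generations needed to sample the state space adequately grows quickly with the number of cells.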

In the paper by Bates and Shepard,[1] [math]\displaystyle{ \sigma_I }[/math] is computed for all elementary cellular automaton rules, and the rules that exhibit slow-moving gliders and possibly stationary objects, as rule 110 does, were observed to be highly correlated with large values of [math]\displaystyle{ \sigma_I }[/math]. [math]\displaystyle{ \sigma_I }[/math] can therefore be used as a filter to select candidate rules for universal computation, which is challenging to prove.

Applications

Although the derivation of the information fluctuation complexity formula is based on information fluctuations in dynamic systems, the formula depends only on state probabilities and therefore is also applicable to any probability distribution, including those derived from static images or text.
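As an illustration of applying the formula to a static source, the sketch below estimates state probabilities from the character frequencies of a string. Treating single characters as the states, as well as the function name and the sample string, are assumptions made for this example. Other choices of state, such as words or pixel values, are equally possible.

```python
import math
from collections import Counter

def fluctuation_complexity(symbols):
    """Return (H, sigma_I) in bits from the empirical symbol frequencies of a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    info = [-math.log2(p) for p in probs]
    h = sum(p * i for p, i in zip(probs, info))
    var = sum(p * (i - h) ** 2 for p, i in zip(probs, info))
    return h, math.sqrt(max(var, 0.0))

# Hypothetical sample text; any sequence of hashable symbols works.
h, sigma = fluctuation_complexity("information fluctuation complexity")
print(f"H = {h:.2f} bits, sigma_I = {sigma:.2f} bits")
```

A string whose symbols are all equally frequent, such as "abab", gives zero fluctuation, just as the uniform distribution does in the definition.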

Over the years the original paper[1] has been referred to by researchers in many diverse fields: complexity theory,[3] complex systems science,[4] complex networks,[5] chaotic dynamics,[6] many-body localization entanglement,[7] environmental engineering,[8] ecological complexity,[9] ecological time-series analysis,[10] ecosystem sustainability,[11] air[12] and water[13] pollution, hydrological wavelet analysis,[14] soil water flow,[15] soil moisture,[16] headwater runoff,[17] groundwater depth,[18] air traffic control,[19] flow patterns[20] and flood events,[21] topology,[22] economics,[23] market forecasting of metal[24] and electricity[25] prices, health informatics,[26] human cognition,[27] human gait kinematics,[28] neurology,[29] EEG analysis,[30] education,[31] investing,[32] artificial life[33] and aesthetics.[34]

References

  1. Bates, John E.; Shepard, Harvey K. (1993-01-18). "Measuring complexity using information fluctuation" (in en). Physics Letters A 172 (6): 416–425. doi:10.1016/0375-9601(93)90232-O. ISSN 0375-9601. Bibcode: 1993PhLA..172..416B.
  2. Bates, John E. (2020-03-30). "Measuring complexity using information fluctuation: a tutorial". https://www.researchgate.net/publication/340284677.
  3. Atmanspacher, Harald (September 1997). "Cartesian cut, Heisenberg cut, and the concept of complexity" (in en). World Futures 49 (3–4): 333–355. doi:10.1080/02604027.1997.9972639. ISSN 0260-4027. 
  4. Shalizi, Cosma Rohilla (2006), Deisboeck, Thomas S.; Kresh, J. Yasha, eds., "Methods and Techniques of Complex Systems Science: An Overview" (in en), Complex Systems Science in Biomedicine, Topics in Biomedical Engineering International Book Series (Springer US): pp. 33–114, doi:10.1007/978-0-387-33532-2_2, ISBN 978-0-387-33532-2 
  5. Huang, Min; Sun, Zhongkui; Donner, Reik V.; Zhang, Jie; Gua, Shuguang; Zou, Yong (2021-03-09). "Characterizing dynamical transitions by statistical complexity measures based on ordinal pattern transition networks". Chaos: An Interdisciplinary Journal of Nonlinear Science 31 (3): 033127. doi:10.1063/5.0038876. PMID 33810737. https://pubs.aip.org/aip/cha/article/31/3/033127/342127. 
  6. Wackerbauer, Renate (1995-11-01). "Noise-induced stabilization of the Lorenz system". Physical Review E 52 (5): 4745–4749. doi:10.1103/PhysRevE.52.4745. PMID 9963970. Bibcode: 1995PhRvE..52.4745W.
  7. Hamilton, Gregory A.; Clark, Bryan K. (2023-02-14). "Quantifying unitary flow efficiency and entanglement for many-body localization". Physical Review B 107 (6): 064203. doi:10.1103/PhysRevB.107.064203. https://link.aps.org/doi/10.1103/PhysRevB.107.064203. 
  8. Singh, Vijay P. (2013-01-10) (in en). Entropy Theory and its Application in Environmental and Water Engineering. John Wiley & Sons. ISBN 978-1-118-42860-3. https://books.google.com/books?id=A8_-5-VWJiAC&pg=PT9. 
  9. Parrott, Lael (2010-11-01). "Measuring ecological complexity" (in en). Ecological Indicators 10 (6): 1069–1076. doi:10.1016/j.ecolind.2010.03.014. ISSN 1470-160X. http://www.sciencedirect.com/science/article/pii/S1470160X10000567. 
  10. Lange, Holger (2006), "Time-series Analysis in Ecology" (in en), eLS (American Cancer Society), doi:10.1038/npg.els.0003276, ISBN 978-0-470-01590-2 
  11. Wang, Chaojun; Zhao, Hongrui (2019-04-18). "Analysis of remote sensing time-series data to foster ecosystem sustainability: use of temporal information entropy". International Journal of Remote Sensing 40 (8): 2880–2894. doi:10.1080/01431161.2018.1533661. ISSN 0143-1161. Bibcode: 2019IJRS...40.2880W.
  12. Klemm, Otto; Lange, Holger (1999-12-01). "Trends of air pollution in the Fichtelgebirge Mountains, Bavaria" (in en). Environmental Science and Pollution Research 6 (4): 193–199. doi:10.1007/BF02987325. ISSN 1614-7499. PMID 19005662. 
  13. Wang, Kang; Lin, Zhongbing (2018). "Characterization of the nonpoint source pollution into river at different spatial scales" (in en). Water and Environment Journal 32 (3): 453–465. doi:10.1111/wej.12345. ISSN 1747-6593. 
  14. Labat, David (2005-11-25). "Recent advances in wavelet analyses: Part 1. A review of concepts" (in en). Journal of Hydrology 314 (1): 275–288. doi:10.1016/j.jhydrol.2005.04.003. ISSN 0022-1694. Bibcode: 2005JHyd..314..275L. http://www.sciencedirect.com/science/article/pii/S0022169405001769.
  15. Pachepsky, Yakov; Guber, Andrey; Jacques, Diederik; Simunek, Jiri; Van Genuchten, Marthinus Th.; Nicholson, Thomas; Cady, Ralph (2006-10-01). "Information content and complexity of simulated soil water fluxes" (in en). Geoderma. Fractal Geometry Applied to Soil and Related Hierarchical Systems - Fractals, Complexity and Heterogeneity 134 (3): 253–266. doi:10.1016/j.geoderma.2006.03.003. ISSN 0016-7061. Bibcode: 2006Geode.134..253P. http://www.sciencedirect.com/science/article/pii/S0016706106000504.
  16. Kumar, Sujay V.; Dirmeyer, Paul A.; Peters-Lidard, Christa D.; Bindlish, Rajat; Bolten, John (2018-01-01). "Information theoretic evaluation of satellite soil moisture retrievals" (in en). Remote Sensing of Environment 204: 392–400. doi:10.1016/j.rse.2017.10.016. ISSN 0034-4257. PMID 32636571. Bibcode: 2018RSEnv.204..392K.
  17. Hauhs, Michael; Lange, Holger (2008). "Classification of Runoff in Headwater Catchments: A Physical Problem?" (in en). Geography Compass 2 (1): 235–254. doi:10.1111/j.1749-8198.2007.00075.x. ISSN 1749-8198. 
  18. Liu, Meng; Liu, Dong; Liu, Le (2013-09-01). "Complexity research of regional groundwater depth series based on multiscale entropy: a case study of Jiangsanjiang Branch Bureau in China" (in en). Environmental Earth Sciences 70 (1): 353–361. doi:10.1007/s12665-012-2132-y. ISSN 1866-6299. Bibcode: 2013EES....70..353L.
  19. Xing, Jing; Manning, Carol A. (April 2005). "Complexity and Automation Displays of Air Traffic Control: Literature Review and Analysis". https://www.researchgate.net/publication/235016868. 
  20. Wang, Kang; Li, Li (November 2008). "Characterizing Heterogeneous Flow Patterns Using Information Measurements". 2008 First International Conference on Intelligent Networks and Intelligent Systems. pp. 654–657. doi:10.1109/ICINIS.2008.110. 
  21. Al Sawaf, Mohamad Basel; Kawanisi, Kiyosi (2020-11-01). "Assessment of mountain river streamflow patterns and flood events using information and complexity measures" (in en). Journal of Hydrology 590: 125508. doi:10.1016/j.jhydrol.2020.125508. ISSN 0022-1694. Bibcode: 2020JHyd..59025508A. https://www.sciencedirect.com/science/article/pii/S0022169420309689.
  22. Javaheri Javid, Mohammad Ali; Alghamdi, Wajdi; Zimmer, Robert; al-Rifaie, Mohammad Majid (2016), Bi, Yaxin; Kapoor, Supriya; Bhatia, Rahul, eds., "A Comparative Analysis of Detecting Symmetries in Toroidal Topology" (in en), Intelligent Systems and Applications: Extended and Selected Results from the SAI Intelligent Systems Conference (IntelliSys) 2015, Studies in Computational Intelligence (Springer International Publishing): pp. 323–344, doi:10.1007/978-3-319-33386-1_16, ISBN 978-3-319-33386-1, https://research.gold.ac.uk/17245/1/2016_Computational_Intelligence_partial_symmetry.pdf 
  23. Jurado-González, Javier; Gómez-Barroso, José Luis (2022-11-28). "Economic complexity and Information Society paradigms: a hybrid contribution to explain economic growth" (in en). Technological and Economic Development of Economy 28 (6): 1871–1896. doi:10.3846/tede.2022.17104. ISSN 2029-4921. https://jau.vgtu.lt/index.php/TEDE/article/view/17104. 
  24. He, Kaijian; Lu, Xingjing; Zou, Yingchao; Keung Lai, Kin (2015-09-01). "Forecasting metal prices with a curvelet based multiscale methodology" (in en). Resources Policy 45: 144–150. doi:10.1016/j.resourpol.2015.03.011. ISSN 0301-4207. Bibcode: 2015RePol..45..144H. http://www.sciencedirect.com/science/article/pii/S0301420715000367.
  25. He, Kaijian; Xu, Yang; Zou, Yingchao; Tang, Ling (2015-05-01). "Electricity price forecasts using a Curvelet denoising based approach" (in en). Physica A: Statistical Mechanics and Its Applications 425: 1–9. doi:10.1016/j.physa.2015.01.012. ISSN 0378-4371. http://www.sciencedirect.com/science/article/pii/S037843711500014X. 
  26. Ahmed, Mosabber Uddin (2021), Ahad, Md Atiqur Rahman; Ahmed, Mosabber Uddin, eds., "Complexity Analysis in Health Informatics" (in en), Signal Processing Techniques for Computational Health Informatics, Intelligent Systems Reference Library (Cham: Springer International Publishing) 192: pp. 103–121, doi:10.1007/978-3-030-54932-9_4, ISBN 978-3-030-54932-9, https://doi.org/10.1007/978-3-030-54932-9_4, retrieved 2021-02-01 
  27. Shi Xiujian; Sun Zhiqiang; Li Long; Xie Hongwei (2009). "Human Cognitive Complexity Analysis in Transportation Systems". Logistics. Proceedings: 4361–4368. doi:10.1061/40996(330)637. ISBN 9780784409961. 
  28. Zhang, Shutao; Qian, Jinwu; Shen, Linyong; Wu, Xi; Hu, Xiaowu (October 2015). "Gait complexity and frequency content analyses of patients with Parkinson's disease". 2015 International Symposium on Bioelectronics and Bioinformatics (ISBB): 87–90. doi:10.1109/ISBB.2015.7344930. ISBN 978-1-4673-6609-0. 
  29. Wang, Jisung; Noh, Gyu-Jeong; Choi, Byung-Moon; Ku, Seung-Woo; Joo, Pangyu; Jung, Woo-Sung; Kim, Seunghwan; Lee, Heonsoo (2017-07-13). "Suppressed neural complexity during ketamine- and propofol-induced unconsciousness" (in en). Neuroscience Letters 653: 320–325. doi:10.1016/j.neulet.2017.05.045. ISSN 0304-3940. PMID 28572032. http://www.sciencedirect.com/science/article/pii/S030439401730441X. 
  30. Bola, Michał; Orłowski, Paweł; Płomecka, Martyna; Marchewka, Artur (2019-01-30). "EEG signal diversity during propofol sedation: an increase in sedated but responsive, a decrease in sedated and unresponsive subjects" (in en). bioRxiv: 444281. doi:10.1101/444281. https://www.biorxiv.org/content/10.1101/444281v2. 
  31. Dilger, Alexander (2012-01-01). "Endogenous complexity, specialisation and general education". On the Horizon 20 (1): 49–53. doi:10.1108/10748121211202062. ISSN 1074-8121. 
  32. Ivanyuk, Vera Alekseevna (2015). "Dynamic strategic investment portfolio management model". https://www.elibrary.ru/item.asp?id=24017528. 
  33. Peña, Eric; Sayama, Hiroki (2021-05-02). "Life Worth Mentioning: Complexity in Life-Like Cellular Automata". Artificial Life 27 (2): 105–112. doi:10.1162/artl_a_00348. PMID 34727158. https://direct.mit.edu/artl/article/27/2/105/107883/Life-Worth-Mentioning-Complexity-in-Life-Like. 
  34. Javaheri Javid, Mohammad Ali (2019-11-30). Aesthetic Automata: Synthesis and Simulation of Aesthetic Behaviour in Cellular Automata (doctoral thesis). Goldsmiths, University of London. doi:10.25602/gold.00027681.