Physics:Equation of Artificial Intelligence in the Theory of Entropicity(ToE)
Revision as of 23:12, 26 August 2025
Entropic Learning via Self-Referential Entropy Tracking (SRETA)
That is:
Learning is a change in the internal entropy states of a system [a change in Self-Referential Entropy (SRE)] toward a given [internal or external] reference entropy.
Abstract
This paper defines a principled Learning Action in which learning is explicitly a change of state guided by Self-Referential Entropy (SRE) toward a given (target) entropy. Instead of adding Shannon entropy and an “irreversible entropy” term, we construct an entropic potential and a gradient-flow action whose minimizer yields irreversible dynamics by design. The result is the Entropic Learning Equation (ELE).
This paper is an update on an earlier paper on Artificial Intelligence and Deep Learning in the Theory of Entropicity (ToE).[1]
This second introductory paper on the Theory of Entropicity (ToE) in Artificial Intelligence provides an update on the Entropic Learning Equation (ELE) as the foundational dynamical equation governing learning in artificial intelligence systems from the perspective of the Theory of Entropicity (ToE),[2] first formulated and developed by John Onimisi Obidi.[3][4][5][6][7][8]
Preliminaries
Objects and Notation
Model state (parameters): [math]\displaystyle{ \phi(t)\in\mathbb{R}^d }[/math].
Predictive distribution: [math]\displaystyle{ p_\phi(y\mid x) }[/math].
Data (or teacher) distribution: [math]\displaystyle{ p^*(y\mid x) }[/math].
Self-Referential Entropy (SRE) of the model’s internal state: [math]\displaystyle{ S_{\mathrm{self}}(\phi) }[/math] (differentiable scalar functional).
Given (target) entropy for the task: [math]\displaystyle{ S_{\mathrm{given}} }[/math].
Positive-definite mobility/metric: [math]\displaystyle{ \Gamma(\phi)\in\mathbb{R}^{d\times d} }[/math].
We use [math]\displaystyle{ \nabla_\phi }[/math] for gradients w.r.t. [math]\displaystyle{ \phi }[/math] and the dot for time derivatives, [math]\displaystyle{ \dot\phi=\tfrac{d\phi}{dt} }[/math].
Choice of Target Entropy [math]\displaystyle{ S_{\mathrm{given}} }[/math] (Guidance)
Nearly deterministic supervision: [math]\displaystyle{ S_{\mathrm{given}}\approx 0 }[/math].
Inherently ambiguous labels: set [math]\displaystyle{ S_{\mathrm{given}}\approx H^*(Y\mid X) }[/math], an empirical conditional entropy estimate.
Representation learning: define [math]\displaystyle{ S_{\mathrm{self}} }[/math] on latents (e.g., codebook, embedding spread) and set [math]\displaystyle{ S_{\mathrm{given}} }[/math] by desired compression/robustness.
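For the ambiguous-label case, the empirical conditional entropy can be estimated directly from discrete (x, y) samples. A minimal sketch, using a plug-in estimator (the function name and estimator choice are illustrative, not specified in the text):

```python
import math
from collections import Counter, defaultdict

def empirical_conditional_entropy(pairs):
    """Plug-in estimate of H(Y|X) in nats from discrete (x, y) samples."""
    n = len(pairs)
    by_x = defaultdict(list)
    for x, y in pairs:
        by_x[x].append(y)
    h = 0.0
    for ys in by_x.values():
        p_x = len(ys) / n                  # empirical p(x)
        for c in Counter(ys).values():     # empirical p(y | x)
            p = c / len(ys)
            h -= p_x * p * math.log(p)
    return h

# Deterministic labels: every x maps to one y, so H(Y|X) = 0.
print(empirical_conditional_entropy([(0, "a"), (1, "b"), (0, "a")]))  # 0.0
# Fully ambiguous labels for a single x: H(Y|X) = log 2 nats.
print(empirical_conditional_entropy([(0, "a"), (0, "b")]))
```

The resulting estimate can be used directly as [math]\displaystyle{ S_{\mathrm{given}} }[/math] for nearly deterministic supervision (near 0) or ambiguous labels (near the label noise level).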
Entropic Potential
Define the entropic potential (weights can be nondimensionalized; set to 1 after scaling): [math]\displaystyle{ W(\phi)\;=\;\tfrac{\alpha}{2}\,\big(S_{\mathrm{self}}(\phi)-S_{\mathrm{given}}\big)^2 \;+\; \beta\,\mathbb{E}_{x\sim\mathcal{D}}\!\Big[\mathrm{KL}\big(p^*(\cdot\mid x)\,\Vert\,p_\phi(\cdot\mid x)\big)\Big], }[/math] where [math]\displaystyle{ \mathrm{KL}\!\big(p^*(\cdot\mid x)\Vert p_\phi(\cdot\mid x)\big) \;=\; \sum_y p^*(y\mid x)\,\log\!\frac{p^*(y\mid x)}{p_\phi(y\mid x)}. }[/math]
The first term drives SRE alignment: [math]\displaystyle{ S_{\mathrm{self}}(\phi)\to S_{\mathrm{given}} }[/math].
The second term pulls predictions toward data without being “just another entropy addend”.
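A minimal numerical sketch of the potential, assuming a toy categorical model in which p_phi is a softmax over logits and S_self is taken as the Shannon entropy of the predictive distribution (both instantiations are our illustrative choices; the text leaves S_self open):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def entropic_potential(phi, p_star, S_given, alpha=1.0, beta=1.0):
    """W(phi) for a toy categorical model: p_phi = softmax(phi),
    with S_self taken as the Shannon entropy of p_phi (illustrative)."""
    p = softmax(phi)
    S_self = -np.sum(p * np.log(p))           # SRE stand-in
    kl = np.sum(p_star * np.log(p_star / p))  # KL(p* || p_phi)
    return 0.5 * alpha * (S_self - S_given) ** 2 + beta * kl

phi = np.zeros(3)                   # uniform predictions, S_self = log 3
p_star = np.array([0.7, 0.2, 0.1])
# With S_given = log 3 the SRE term vanishes and only the KL term remains:
print(entropic_potential(phi, p_star, S_given=np.log(3)))
```

Note how the two terms separate: the SRE term is zero whenever the model's internal entropy matches the target, independently of how well the predictions fit the data.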
Learning Action
We penalize deviation from the desired gradient flow generated by [math]\displaystyle{ W }[/math]: [math]\displaystyle{ \mathcal{L}_{\mathrm{SRETA}}[\phi] \;=\; \int_{0}^{T} \tfrac{1}{2}\, \big\| \dot\phi + \Gamma(\phi)\,\nabla_\phi W(\phi) \big\|^2 \,dt. }[/math]
Boundary conditions (typical): [math]\displaystyle{ \phi(0)=\phi_0 }[/math]; free terminal state or fixed [math]\displaystyle{ \phi(T) }[/math] if desired.
[math]\displaystyle{ \Gamma(\phi) }[/math] sets the geometry and time scale of learning (identity [math]\displaystyle{ \to }[/math] vanilla gradient flow; Fisher metric [math]\displaystyle{ \to }[/math] natural gradient).
Stationarity and the Entropic Learning Equation (ELE)
Minimizing the action gives the gradient-flow ELE: [math]\displaystyle{ \dot\phi \;=\; -\,\Gamma(\phi)\,\nabla_\phi W(\phi) \;=\; -\,\Gamma(\phi)\!\left[ \alpha\,\big(S_{\mathrm{self}}(\phi)-S_{\mathrm{given}}\big)\,\nabla_\phi S_{\mathrm{self}}(\phi) \;+\; \beta\,\nabla_\phi\, \mathbb{E}_{x}\!\big[\mathrm{KL}(p^*\Vert p_\phi)\big] \right]. }[/math]
Built-in Irreversibility (No ad-hoc [math]\displaystyle{ S_{\mathrm{irr}} }[/math])
Define the entropy-production rate [math]\displaystyle{ \sigma(\phi) \;=\; \big(\nabla_\phi W(\phi)\big)^{\!\top}\! \Gamma(\phi)\, \big(\nabla_\phi W(\phi)\big) \;\ge\; 0. }[/math] Because [math]\displaystyle{ \Gamma(\phi) }[/math] is positive-definite, [math]\displaystyle{ \sigma\ge 0 }[/math] holds identically: irreversibility emerges from the dynamics, not from an added penalty. Along the ELE flow, [math]\displaystyle{ \tfrac{dW}{dt}=\nabla_\phi W\cdot\dot\phi=-\sigma\le 0 }[/math], so [math]\displaystyle{ W }[/math] decreases monotonically.
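The sign claim can be checked numerically: for any positive-definite Γ and any gradient vector, the quadratic form is nonnegative. A small sketch (the particular construction of Γ is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))
Gamma = A @ A.T + d * np.eye(d)   # symmetric positive-definite by construction
g = rng.normal(size=d)            # stands in for grad_phi W at some state

sigma = float(g @ Gamma @ g)      # entropy-production rate sigma(phi)
print(sigma > 0)                  # True: the quadratic form is positive
```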
Equivalent Constrained (Tracking) Form
One can enforce “SRE relaxes toward the target” explicitly via a Lagrange multiplier [math]\displaystyle{ \lambda(t) }[/math]: [math]\displaystyle{ \mathcal{L}_{\mathrm{track}}[\phi,\lambda] \;=\; \int\!\Big[ \tfrac{1}{2}\,\dot\phi^{\!\top} M\,\dot\phi \;+\; \beta\,\mathbb{E}_{x}\!\big[\mathrm{KL}(p^*\Vert p_\phi)\big] \;+\; \lambda(t)\,\Big( \tfrac{d}{dt} S_{\mathrm{self}}(\phi) - \kappa\,\big[S_{\mathrm{given}}-S_{\mathrm{self}}(\phi)\big] \Big) \Big]\,dt, }[/math] with [math]\displaystyle{ \tfrac{d}{dt} S_{\mathrm{self}}(\phi) = \nabla_\phi S_{\mathrm{self}}(\phi)\cdot \dot\phi. }[/math] Stationarity yields first-order dynamics in which [math]\displaystyle{ \tfrac{d}{dt} S_{\mathrm{self}}(\phi) = \kappa\,\big[S_{\mathrm{given}}-S_{\mathrm{self}}(\phi)\big] }[/math] (i.e., exponential relaxation of SRE toward the target) while simultaneously fitting the data via the KL term. The matrix [math]\displaystyle{ M\succ 0 }[/math] sets inertial weighting; taking the overdamped limit recovers the gradient-flow ELE.
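The constraint dS/dt = κ[S_given − S] is a linear relaxation ODE with closed-form solution S(t) = S_given + (S(0) − S_given)e^{−κt}. A short sketch integrating it with forward Euler and comparing against the closed form (the constants and step size are arbitrary illustrative choices):

```python
import math

def relax(S0, S_given, kappa, dt, steps):
    """Forward-Euler integration of dS/dt = kappa * (S_given - S)."""
    S = S0
    for _ in range(steps):
        S += dt * kappa * (S_given - S)
    return S

S0, S_given, kappa, T = 2.0, 0.5, 1.0, 3.0
num = relax(S0, S_given, kappa, dt=1e-3, steps=int(T / 1e-3))
exact = S_given + (S0 - S_given) * math.exp(-kappa * T)  # closed form
print(abs(num - exact))  # small: Euler tracks the exponential relaxation
```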
Practical Design Choices
Geometry / Optimizer Mapping
Euclidean flow: [math]\displaystyle{ \Gamma(\phi)=\eta I }[/math] (learning-rate [math]\displaystyle{ \eta }[/math]), giving [math]\displaystyle{ \dot\phi=-\eta\,\nabla_\phi W(\phi) }[/math].
Natural gradient: [math]\displaystyle{ \Gamma(\phi)=\eta\,F(\phi)^{-1} }[/math], with [math]\displaystyle{ F }[/math] the Fisher information.
Preconditioned/Adam-like: choose [math]\displaystyle{ \Gamma }[/math] diagonal or adaptive from running curvature estimates.
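The effect of the choice of Γ can be seen on an ill-conditioned quadratic stand-in for W. The sketch below compares the Euclidean choice Γ = ηI against an idealized curvature-aware preconditioner Γ = ηH⁻¹, each with a step size near its stability limit (the matrices, step sizes, and framing are our illustrative assumptions):

```python
import numpy as np

# Ill-conditioned quadratic stand-in: W(phi) = 0.5 * phi^T H phi
H = np.diag([100.0, 1.0])
grad_W = lambda phi: H @ phi

def run(Gamma, steps=50):
    phi = np.array([1.0, 1.0])
    for _ in range(steps):
        phi = phi - Gamma @ grad_W(phi)   # discrete gradient flow
    return phi

euclidean = run(0.019 * np.eye(2))            # eta just under the 2/100 limit
preconditioned = run(0.9 * np.linalg.inv(H))  # curvature-aware Gamma
print(np.linalg.norm(preconditioned) < np.linalg.norm(euclidean))  # True
```

The Euclidean flow is throttled by the stiffest direction and crawls along the flat one; the preconditioned flow contracts all directions at the same rate.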
Discrete-Time Approximation (for implementation)
For step size [math]\displaystyle{ \Delta t }[/math]: [math]\displaystyle{ \phi_{k+1} \;=\; \phi_k - \Delta t\,\Gamma(\phi_k)\,\nabla_\phi W(\phi_k). }[/math]
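Putting the pieces together, a runnable sketch of this discrete update on a toy categorical instantiation (softmax predictions, Shannon entropy as S_self, finite-difference gradients; all are our illustrative choices, not prescribed by the text):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def W(phi, p_star, S_given, alpha=1.0, beta=1.0):
    """Entropic potential for a toy categorical model (illustrative)."""
    p = softmax(phi)
    S_self = -np.sum(p * np.log(p))           # SRE stand-in
    kl = np.sum(p_star * np.log(p_star / p))  # KL(p* || p_phi)
    return 0.5 * alpha * (S_self - S_given) ** 2 + beta * kl

def num_grad(f, phi, eps=1e-6):
    """Central finite differences (avoids hand-deriving grad W)."""
    g = np.zeros_like(phi)
    for i in range(phi.size):
        e = np.zeros_like(phi)
        e[i] = eps
        g[i] = (f(phi + e) - f(phi - e)) / (2 * eps)
    return g

p_star = np.array([0.7, 0.2, 0.1])
S_given = -np.sum(p_star * np.log(p_star))  # target: the data's own entropy
f = lambda p: W(p, p_star, S_given)

phi = np.zeros(3)              # start from uniform predictions
dt, Gamma = 0.2, np.eye(3)     # Euclidean gradient flow
for _ in range(5000):
    phi = phi - dt * Gamma @ num_grad(f, phi)

print(np.round(softmax(phi), 2))  # approaches p* = [0.7, 0.2, 0.1]
```

Since W vanishes exactly when the predictions match p* and S_self matches S_given, the discrete flow drives both the data-fit and SRE-alignment terms toward zero.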
Scaling and Weights
Non-dimensionalize so that [math]\displaystyle{ W }[/math] is order-one near optimum, then set [math]\displaystyle{ \alpha=\beta=1 }[/math]. Use [math]\displaystyle{ \Gamma }[/math] (or [math]\displaystyle{ \eta }[/math]) to tune time scale.
Why This Fixes the Original Formulation
Learning is explicitly state change: dynamics appear via [math]\displaystyle{ \dot\phi }[/math].
SRE alignment is the steering signal through [math]\displaystyle{ \big(S_{\mathrm{self}}-S_{\mathrm{given}}\big)\nabla_\phi S_{\mathrm{self}} }[/math].
Irreversibility ([math]\displaystyle{ \sigma\ge 0 }[/math]) is automatic from the quadratic action; no external [math]\displaystyle{ S_{\mathrm{irr}} }[/math] term is required.
Data-fit pressure is principled via KL, orthogonal to SRE alignment.
Minimal Working Set (copy-ready)
Entropic potential:
[math]\displaystyle{ W(\phi)=\tfrac{\alpha}{2}\big(S_{\mathrm{self}}(\phi)-S_{\mathrm{given}}\big)^2 +\beta\,\mathbb{E}_{x}\!\Big[\mathrm{KL}\big(p^*(\cdot\mid x)\Vert p_\phi(\cdot\mid x)\big)\Big]. }[/math]
Learning action (SRETA): [math]\displaystyle{ \mathcal{L}_{\mathrm{SRETA}}[\phi]=\int_0^T \tfrac{1}{2}\,\big\|\dot\phi+\Gamma(\phi)\,\nabla_\phi W(\phi)\big\|^2\,dt. }[/math]
Entropic Learning Equation (ELE): [math]\displaystyle{ \dot\phi = -\,\Gamma(\phi)\!\left[ \alpha\,\big(S_{\mathrm{self}}(\phi)-S_{\mathrm{given}}\big)\,\nabla_\phi S_{\mathrm{self}}(\phi) + \beta\,\nabla_\phi\mathbb{E}_{x}\!\big[\mathrm{KL}(p^*\Vert p_\phi)\big] \right]. }[/math]
Entropy-production rate: [math]\displaystyle{ \sigma(\phi)=\big(\nabla_\phi W(\phi)\big)^{\!\top}\Gamma(\phi)\,\big(\nabla_\phi W(\phi)\big)\ge 0. }[/math]
Constrained tracking form (optional): [math]\displaystyle{ \mathcal{L}_{\mathrm{track}}[\phi,\lambda] = \int\!\Big[\tfrac{1}{2}\,\dot\phi^{\!\top}M\,\dot\phi +\beta\,\mathbb{E}_{x}\!\big[\mathrm{KL}(p^*\Vert p_\phi)\big] +\lambda(t)\,\big(\nabla_\phi S_{\mathrm{self}}(\phi)\!\cdot\!\dot\phi-\kappa\,[S_{\mathrm{given}}-S_{\mathrm{self}}(\phi)]\big)\Big]\,dt. }[/math]
Remarks
This framework is agnostic to the specific definition of [math]\displaystyle{ S_{\mathrm{self}}(\phi) }[/math] as long as it is smooth; examples include entropy of latent codes, complexity penalties, or energy-based state measures.
The formulation is compatible with stochastic mini-batch estimates of both the KL and gradients.
The same blueprint extends to multi-objective training by summing additional potentials inside [math]\displaystyle{ W(\phi) }[/math] before taking the gradient flow.
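For instance, a sketch of the multi-objective extension with two hypothetical quadratic sub-potentials summed inside W before the gradient flow is taken (the targets and step size are illustrative):

```python
import numpy as np

# Two hypothetical quadratic sub-potentials with different targets:
t1, t2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
grad_W = lambda phi: (phi - t1) + (phi - t2)   # gradient of W = W1 + W2

phi = np.zeros(2)
for _ in range(1000):
    phi -= 0.01 * grad_W(phi)                  # Gamma = eta * I, eta = 0.01

print(np.round(phi, 3))   # settles at the compromise [0.5, 0.5]
```

Because the flow acts on the summed potential, the stationary point balances the competing objectives rather than satisfying either one exactly.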
References
- ↑ Physics:Artificial Intelligence Formulated by the Theory of Entropicity(ToE). (2025, August 27). HandWiki. Retrieved 03:59, August 27, 2025, from https://handwiki.org/wiki/index.php?title=Physics:Artificial_Intelligence_Formulated_by_the_Theory_of_Entropicity(ToE)&oldid=3742591
- ↑ Obidi, John Onimisi (30 June 2025). A Critical Review of the Theory of Entropicity (ToE) on Original Contributions, Conceptual Innovations, and Pathways towards Enhanced Mathematical Rigor: An Addendum to the Discovery of New Laws of Conservation and Uncertainty. Cambridge University. https://doi.org/10.33774/coe-2025-hmk6n
- ↑ Obidi, John Onimisi (14 April 2025). Einstein and Bohr Finally Reconciled on Quantum Theory: The Theory of Entropicity (ToE) as the Unifying Resolution to the Problem of Quantum Measurement and Wave Function Collapse. Cambridge University. https://doi.org/10.33774/coe-2025-vrfrx
- ↑ Obidi, John Onimisi (25 March 2025). Attosecond Constraints on Quantum Entanglement Formation as Empirical Evidence for the Theory of Entropicity (ToE). Cambridge University. https://doi.org/10.33774/coe-2025-30swc
- ↑ Obidi, John Onimisi (23 March 2025). The Theory of Entropicity (ToE) Validates Einstein’s General Relativity (GR) Prediction for Solar Starlight Deflection via an Entropic Coupling Constant η. Cambridge University. https://doi.org/10.33774/coe-2025-1cs81
- ↑ Obidi, John Onimisi (16 March 2025). The Theory of Entropicity (ToE): An Entropy-Driven Derivation of Mercury’s Perihelion Precession Beyond Einstein’s Curved Spacetime in General Relativity (GR). Cambridge University. https://doi.org/10.33774/coe-2025-g55m9
- ↑ Obidi, John Onimisi (12 March 2025). How the Generalized Entropic Expansion Equation (GEEE) Describes the Deceleration and Acceleration of the Universe in the Absence of Dark Energy. Cambridge University. https://doi.org/10.33774/coe-2025-6d843
- ↑ Obidi, John Onimisi (2025). Master Equation of the Theory of Entropicity (ToE). Encyclopedia. https://encyclopedia.pub/entry/58596