Quasi-arithmetic mean

From HandWiki
Short description: Generalization of means

In mathematics and statistics, the quasi-arithmetic mean or generalised f-mean or Kolmogorov-Nagumo-de Finetti mean[1] is one generalisation of the more familiar means such as the arithmetic mean and the geometric mean, using a function f. It is also called Kolmogorov mean after Soviet mathematician Andrey Kolmogorov. It is a broader generalization than the regular generalized mean.

Definition

Let $f$ be a function that maps some continuous interval $I$ of the real line to some other continuous subset $J \equiv f(I)$ of the real numbers, and suppose that $f$ is both continuous and injective (one-to-one).

(We require $f$ to be injective on $I$ in order for an inverse function $f^{-1}$ to exist. We require $I$ and $J$ to both be continuous intervals in order to ensure that an average of any finite (or infinite) subset of values within $J$ will always correspond to a value in $I$.)

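Subject to those requirements, the $f$-mean of $n$ numbers $x_1, \dots, x_n \in I$ is defined to be

$$M_f(x_1, \dots, x_n) \equiv f^{-1}\left( \frac{1}{n}\bigl( f(x_1) + \cdots + f(x_n) \bigr) \right),$$

or equivalently

$$M_f(\vec{x}) = f^{-1}\left( \frac{1}{n} \sum_{k=1}^{n} f(x_k) \right).$$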

A consequence of $f$ being defined over some selected interval $I$, mapping to yet another interval $J$, is that $\frac{1}{n}\bigl( f(x_1) + \cdots + f(x_n) \bigr)$ must also lie within $J$. And because $J$ is the domain of $f^{-1}$, $f^{-1}$ must in turn produce a value inside the same domain the values originally came from, $I$.

Because $f$ is injective and continuous, it necessarily follows that $f$ is a strictly monotonic function, and therefore that the $f$-mean is neither larger than the largest number of the tuple $x_1, \dots, x_n \equiv X$ nor smaller than the smallest number contained in $X$; hence it lies somewhere between the extreme values of the original sample.
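The definition can be sketched in a few lines of Python. This is an illustrative sketch only: the name `quasi_arithmetic_mean` and its signature are assumptions made here, and the caller is trusted to supply a continuous, injective `f` together with its actual inverse `f_inv`.

```python
import math
from typing import Callable, Sequence


def quasi_arithmetic_mean(
    xs: Sequence[float],
    f: Callable[[float], float],
    f_inv: Callable[[float], float],
) -> float:
    """Return f_inv applied to the arithmetic mean of f(x) over xs.

    Hypothetical helper: nothing here checks that f is continuous and
    injective, or that f_inv really is its inverse.
    """
    if not xs:
        raise ValueError("the mean of an empty tuple is undefined")
    return f_inv(sum(f(x) for x in xs) / len(xs))


sample = [1.0, 4.0, 16.0]

# f = log on the positive reals, with inverse exp.
m = quasi_arithmetic_mean(sample, math.log, math.exp)

# The result lies between the smallest and the largest sample value.
assert min(sample) <= m <= max(sample)
```

Passing the inverse explicitly keeps the sketch free of numerical root finding; a version that inverts `f` numerically would also be possible but is not shown.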

Examples

  • If $I = \mathbb{R}$, the real line, and $f(x) = x$ (or indeed any linear function $x \mapsto a \cdot x + b$ with $a \neq 0$ and $b$ arbitrary), then the $f$-mean corresponds to the arithmetic mean.
  • If $I = \mathbb{R}^+$, the strictly positive real numbers, and $f(x) = \log(x)$, then the $f$-mean corresponds to the geometric mean. (The result is the same for any logarithm; it does not depend on the base of the logarithm, as long as that base is strictly positive but not 1.)
  • If $I = \mathbb{R}^+$ and $f(x) = \frac{1}{x}$, then the $f$-mean corresponds to the harmonic mean.
  • If $I = \mathbb{R}^+$ and $f(x) = x^p$, then the $f$-mean corresponds to the power mean with exponent $p$ (e.g., for $p = 2$ one gets the root mean square (RMS)).
  • If $I = \mathbb{R}$ and $f(x) = \exp(x)$, then the $f$-mean is the mean in the log semiring, which is a constant-shifted version of the LogSumExp (LSE) function (which is the logarithmic sum), $M_f(x_1, \dots, x_n) = \mathrm{LSE}(x_1, \dots, x_n) - \log(n)$. (The $-\log(n)$ in the expression corresponds to dividing by $n$, since logarithmic division is linear subtraction.) The LogSumExp function is a smooth maximum: it is a smooth approximation to the maximum function.
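The examples in the list above can be checked numerically against their usual closed forms. The following is a minimal sketch; the helper `f_mean` and the sample values are assumptions chosen for illustration.

```python
import math


def f_mean(xs, f, f_inv):
    """Quasi-arithmetic mean: f_inv of the arithmetic mean of f(x)."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


xs = [1.0, 2.0, 4.0]
n = len(xs)

arithmetic = f_mean(xs, lambda x: x, lambda y: y)
geometric = f_mean(xs, math.log, math.exp)
harmonic = f_mean(xs, lambda x: 1.0 / x, lambda y: 1.0 / y)
rms = f_mean(xs, lambda x: x ** 2, math.sqrt)  # power mean with p = 2
log_semiring = f_mean(xs, math.exp, math.log)

# Closed forms for comparison.
lse = math.log(sum(math.exp(x) for x in xs))
assert math.isclose(arithmetic, sum(xs) / n)
assert math.isclose(geometric, math.prod(xs) ** (1.0 / n))
assert math.isclose(harmonic, n / sum(1.0 / x for x in xs))
assert math.isclose(rms, math.sqrt(sum(x * x for x in xs) / n))
assert math.isclose(log_semiring, lse - math.log(n))
```

Only the pair `(f, f_inv)` changes between the five means; the averaging step is identical in every case.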

Properties

The following properties hold for $M_f$ for any single function $f$:

Symmetry: The value of $M_f$ is unchanged if its arguments are permuted.

Idempotency: for all $x$, the repeated average $M_f(x, \dots, x) = x$.

Monotonicity: $M_f$ is monotonic in each of its arguments (since $f$ is monotonic).

Continuity: $M_f$ is continuous in each of its arguments (since $f$ is continuous).

Replacement: Subsets of elements can be averaged a priori, without altering the mean, given that the multiplicity of elements is maintained. With $m \equiv M_f(x_1, \dots, x_k)$ it holds:

$$M_f(x_1, \dots, x_k, x_{k+1}, \dots, x_n) = M_f(\underbrace{m, \dots, m}_{k \text{ times}}, x_{k+1}, \dots, x_n).$$

Partitioning: The computation of the mean can be split into computations of equal-sized sub-blocks:

$$M_f(x_1, \dots, x_{n \cdot k}) = M_f\bigl( M_f(x_1, \dots, x_k), M_f(x_{k+1}, \dots, x_{2 \cdot k}), \dots, M_f(x_{(n-1) \cdot k + 1}, \dots, x_{n \cdot k}) \bigr).$$
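The replacement and partitioning properties can be illustrated numerically. The sketch below uses the harmonic mean as one arbitrary choice of quasi-arithmetic mean; the helper names and sample values are assumptions made for this example.

```python
import math


def f_mean(xs, f, f_inv):
    return f_inv(sum(f(x) for x in xs) / len(xs))


def harmonic(xs):
    # f(x) = 1/x is its own inverse on the positive reals.
    return f_mean(xs, lambda x: 1.0 / x, lambda y: 1.0 / y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Replacement: swap the first k values for k copies of their own mean.
k = 2
m = harmonic(xs[:k])
replaced = [m] * k + xs[k:]
assert math.isclose(harmonic(xs), harmonic(replaced))

# Partitioning: average equal-sized blocks, then average the block means.
block = 3
block_means = [harmonic(xs[i:i + block]) for i in range(0, len(xs), block)]
assert math.isclose(harmonic(xs), harmonic(block_means))
```

The blocks must all have the same size; with unequal blocks the block means would need weights, which the unweighted definition does not provide.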

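Self-distributivity: For any quasi-arithmetic (q.a.) mean $M_{\mathrm{qa}}$ of two variables:

$$M_{\mathrm{qa}}\bigl( x, M_{\mathrm{qa}}(y, z) \bigr) = M_{\mathrm{qa}}\bigl( M_{\mathrm{qa}}(x, y), M_{\mathrm{qa}}(x, z) \bigr).$$

Mediality: For any quasi-arithmetic mean $M_{\mathrm{qa}}$ of two variables:

$$M_{\mathrm{qa}}\bigl( M_{\mathrm{qa}}(x, y), M_{\mathrm{qa}}(z, w) \bigr) = M_{\mathrm{qa}}\bigl( M_{\mathrm{qa}}(x, z), M_{\mathrm{qa}}(y, w) \bigr).$$

Balancing: For any quasi-arithmetic mean $M_{\mathrm{qa}}$ of two variables:

$$M_{\mathrm{qa}}\Bigl( M_{\mathrm{qa}}\bigl( x, M_{\mathrm{qa}}(x, y) \bigr), M_{\mathrm{qa}}\bigl( y, M_{\mathrm{qa}}(x, y) \bigr) \Bigr) = M_{\mathrm{qa}}(x, y).$$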
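The three two-variable identities above can be spot-checked for a concrete mean. This sketch uses the geometric mean (the case $f = \log$) and arbitrary positive test values; both choices are assumptions made for illustration, and a numerical check at one point is of course not a proof.

```python
import math


def M(x, y):
    """Two-variable geometric mean, i.e. the f-mean with f = log."""
    return math.exp((math.log(x) + math.log(y)) / 2.0)


x, y, z, w = 1.5, 2.0, 7.0, 0.25

# Self-distributivity
assert math.isclose(M(x, M(y, z)), M(M(x, y), M(x, z)))

# Mediality
assert math.isclose(M(M(x, y), M(z, w)), M(M(x, z), M(y, w)))

# Balancing
assert math.isclose(M(M(x, M(x, y)), M(y, M(x, y))), M(x, y))
```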

Scale-invariance: The quasi-arithmetic mean is invariant with respect to offsets and non-trivial scaling of the function $f$: for any $p(t) \equiv a + b \cdot q(t)$, with $a$ and $b \neq 0$ constants and $q$ a quasi-arithmetic function, $M_p(x)$ and $M_q(x)$ are always the same. In mathematical notation:

$$\text{Given } q \text{ quasi-arithmetic, and } p : \bigl( p(t) = a + b \cdot q(t) \ \ \forall t \bigr) \ \ \forall a \ \ \forall b \neq 0 \ \Rightarrow \ M_p(x) = M_q(x) \ \ \forall x.$$
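The invariance follows directly from the definition. A short derivation, writing $s$ for a generic element of the range of $p$ (a symbol introduced here only for the inverse):

```latex
\begin{align*}
p(t) = a + b\,q(t),\ b \neq 0
  &\;\Longrightarrow\; p^{-1}(s) = q^{-1}\!\left(\frac{s - a}{b}\right), \\
M_p(x)
  &= p^{-1}\!\left(\frac{1}{n}\sum_{k=1}^{n}\bigl(a + b\,q(x_k)\bigr)\right)
   = p^{-1}\!\left(a + \frac{b}{n}\sum_{k=1}^{n} q(x_k)\right) \\
  &= q^{-1}\!\left(\frac{1}{n}\sum_{k=1}^{n} q(x_k)\right)
   = M_q(x).
\end{align*}
```

The offset $a$ and the scale $b$ pass through the arithmetic average unchanged and are then undone by $p^{-1}$, which is why $b \neq 0$ is the only restriction needed.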

Central limit theorem: Under certain regularity conditions, and for a sufficiently large sample,

$$z \equiv \sqrt{n} \Bigl[ M_f(X_1, \dots, X_n) - \mathbb{E}_X\bigl( M_f(X_1, \dots, X_n) \bigr) \Bigr]$$

is approximately normally distributed.[2] A similar result is available for Bajraktarević means and deviation means, which are generalizations of quasi-arithmetic means.[3][4]
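A small simulation can make the statement plausible for one particular case. The sketch below is only an illustration under assumptions chosen here: the geometric mean as the quasi-arithmetic mean, independent Uniform(1, 2) draws, and the exact expectation replaced by the average over simulated samples. It checks a single symptom of normality, namely the share of values within one standard deviation.

```python
import math
import random
import statistics


def geometric_mean(xs):
    # f-mean with f = log and inverse exp
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200                 # sample size
reps = 4000             # number of simulated samples

means = [
    geometric_mean([rng.uniform(1.0, 2.0) for _ in range(n)])
    for _ in range(reps)
]

# The exact expectation is approximated by the average over replications.
center = statistics.fmean(means)
z = [math.sqrt(n) * (m - center) for m in means]

# For a normal distribution roughly 68% of the mass lies within one
# standard deviation of the mean.
sd = statistics.pstdev(z)
share_within_one_sd = sum(abs(v) <= sd for v in z) / reps
print(f"share within one standard deviation: {share_within_one_sd:.3f}")
```

A histogram or a formal normality test would give a fuller picture; the one-number check is kept only to keep the sketch short.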

Characterization

There are several different sets of properties that characterize the quasi-arithmetic mean (i.e., each function that satisfies these properties is an f-mean for some function f).

  • Mediality is essentially sufficient to characterize quasi-arithmetic means.[5]: chapter 17 
  • Self-distributivity is essentially sufficient to characterize quasi-arithmetic means.[5]: chapter 17 
  • Replacement: Kolmogorov proved that the five properties of symmetry, fixed-point, monotonicity, continuity, and replacement fully characterize the quasi-arithmetic means.[6]
  • Continuity is superfluous in the characterization of two-variable quasi-arithmetic means. See [10] for the details.
  • Balancing: An interesting problem is whether this condition (together with symmetry, fixed-point, monotonicity and continuity properties) implies that the mean is quasi-arithmetic. Georg Aumann showed in the 1930s that the answer is no in general,[7] but that if one additionally assumes M to be an analytic function then the answer is positive.[8]

Homogeneity

Means are usually homogeneous, but for most functions f, the f-mean is not. Indeed, the only homogeneous quasi-arithmetic means are the power means (including the geometric mean); see Hardy–Littlewood–Pólya, page 68.

The homogeneity property can be achieved by normalizing the input values by some (homogeneous) mean C.

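$$M_{f,C}\, x = C x \cdot f^{-1}\left( \frac{f\left(\frac{x_1}{C x}\right) + \cdots + f\left(\frac{x_n}{C x}\right)}{n} \right)$$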

However this modification may violate monotonicity and the partitioning property of the mean.
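The effect of the normalization can be seen numerically. The following sketch assumes $f = \exp$ as a non-homogeneous example and takes the arithmetic mean as the homogeneous mean $C$; the function names are hypothetical and the inputs are chosen so that $C x$ is nonzero.

```python
import math


def f_mean(xs, f, f_inv):
    return f_inv(sum(f(x) for x in xs) / len(xs))


def arithmetic(xs):
    return sum(xs) / len(xs)


def exp_mean(xs):
    # f = exp: a quasi-arithmetic mean that is not homogeneous.
    return f_mean(xs, math.exp, math.log)


def normalized_exp_mean(xs, C=arithmetic):
    # M_{f,C} x = C(x) * f_inv(mean of f(x_i / C(x)))
    c = C(xs)
    return c * exp_mean([x / c for x in xs])


xs = [1.0, 2.0, 4.0]
t = 3.0
scaled = [t * x for x in xs]

# The plain exp-mean does not scale with its inputs ...
assert not math.isclose(exp_mean(scaled), t * exp_mean(xs))
# ... while the normalized version does.
assert math.isclose(normalized_exp_mean(scaled), t * normalized_exp_mean(xs))
```

Homogeneity holds because the ratios `x / c` are unchanged when every input is multiplied by the same factor, so only the leading factor `c` scales.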

Generalizations

Consider a Legendre-type strictly convex function $F$. Then the gradient map $\nabla F$ is globally invertible and the weighted multivariate quasi-arithmetic mean[9] is defined by $M_{\nabla F}(\theta_1, \ldots, \theta_n; w) = {\nabla F}^{-1}\left( \sum_{i=1}^{n} w_i \nabla F(\theta_i) \right)$, where $w$ is a normalized weight vector ($w_i = \frac{1}{n}$ by default for a balanced average). From the convex duality, we get a dual quasi-arithmetic mean $M_{\nabla F^*}$ associated to the quasi-arithmetic mean $M_{\nabla F}$. For example, take $F(X) = -\log\det(X)$ for $X$ a symmetric positive-definite matrix. The pair of matrix quasi-arithmetic means yields the matrix harmonic mean: $M_{\nabla F}(\theta_1, \theta_2) = 2(\theta_1^{-1} + \theta_2^{-1})^{-1}$.
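The matrix harmonic mean in the example can be recovered in a few steps. The derivation below assumes the standard identity $\nabla \log\det(X) = X^{-1}$ for symmetric positive-definite $X$ and uses the balanced weights $w_1 = w_2 = \tfrac{1}{2}$:

```latex
\begin{align*}
F(X) = -\log\det(X)
  &\;\Longrightarrow\; \nabla F(X) = -X^{-1},
  \qquad (\nabla F)^{-1}(Y) = -Y^{-1}, \\
M_{\nabla F}(\theta_1, \theta_2)
  &= (\nabla F)^{-1}\!\left(\tfrac{1}{2}\nabla F(\theta_1) + \tfrac{1}{2}\nabla F(\theta_2)\right) \\
  &= -\left(-\tfrac{1}{2}\bigl(\theta_1^{-1} + \theta_2^{-1}\bigr)\right)^{-1}
   = 2\bigl(\theta_1^{-1} + \theta_2^{-1}\bigr)^{-1}.
\end{align*}
```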

References

  • Andrey Kolmogorov (1930) "On the Notion of Mean", in "Mathematics and Mechanics" (Kluwer 1991) — pp. 144–146.
  • Andrey Kolmogorov (1930) Sur la notion de la moyenne. Atti Accad. Naz. Lincei 12, pp. 388–391.
  • John Bibby (1974) "Axiomatisations of the average and a further generalisation of monotonic sequences," Glasgow Mathematical Journal, vol. 15, pp. 63–65.
  • Hardy, G. H.; Littlewood, J. E.; Pólya, G. (1952) Inequalities. 2nd ed. Cambridge Univ. Press, Cambridge, 1952.
  • B. De Finetti (1931) "Sul concetto di media", Istituto Italiano degli Attuari, vol. 3, pp. 369–396.
  1. Nielsen, Frank; Nock, Richard (June 2017). "Generalizing skew Jensen divergences and Bregman divergences with comparative convexity". IEEE Signal Processing Letters 24 (8): 2. doi:10.1109/LSP.2017.2712195. Bibcode: 2017ISPL...24.1123N.
  2. de Carvalho, Miguel (2016). "Mean, what do you mean?". The American Statistician 70 (3): 764‒776. doi:10.1080/00031305.2016.1148632. https://zenodo.org/record/895400. 
  3. Barczy, Mátyás; Burai, Pál (April 2022). "Limit theorems for Bajraktarević and Cauchy quotient means of independent identically distributed random variables". Aequationes Mathematicae 96 (2): 279–305. doi:10.1007/s00010-021-00813-x. ISSN 1420-8903. https://link.springer.com/article/10.1007/s00010-021-00813-x. 
  4. Barczy, Mátyás; Páles, Zsolt (September 2023). "Limit theorems for deviation means of independent and identically distributed random variables". Journal of Theoretical Probability 36 (3): 1626–1666. doi:10.1007/s10959-022-01225-6. ISSN 1572-9230. https://link.springer.com/article/10.1007/s10959-022-01225-6. 
  5. Aczél, J.; Dhombres, J. G. (1989). Functional equations in several variables. With applications to mathematics, information theory and to the natural and social sciences. Encyclopedia of Mathematics and its Applications, 31. Cambridge: Cambridge Univ. Press.
  6. Grudkin, Anton (2019). "Characterization of the quasi-arithmetic mean". https://math.stackexchange.com/a/3261514/29780. 
  7. Aumann, Georg (1937). "Vollkommene Funktionalmittel und gewisse Kegelschnitteigenschaften". Journal für die reine und angewandte Mathematik 1937 (176): 49–55. doi:10.1515/crll.1937.176.49. 
  8. Aumann, Georg (1934). "Grundlegung der Theorie der analytischen Mittelwerte". Sitzungsberichte der Bayerischen Akademie der Wissenschaften: 45–81.
  9. Nielsen, Frank (2023). "Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry". arXiv:2301.10980 [cs.IT].