Minkowski distance

From HandWiki
Short description: Vector distance using pth powers

The Minkowski distance or Minkowski metric is a metric in a normed vector space which can be considered as a generalization of both the Euclidean distance and the Manhattan distance. It is named after the mathematician Hermann Minkowski.

Comparison of Chebyshev, Euclidean and taxicab/Manhattan distances for the hypotenuse of a 3-4-5 triangle on a chessboard (represented by a king, an ant, and a wazir). The wazir moves like a rook but only one square at a time.

Definition

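The Minkowski distance of order $p$ (where $p$ is an integer) between two points $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n) \in \mathbb{R}^n$ is defined as: $$D(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{\frac{1}{p}}.$$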
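The definition translates directly into code. The following is a minimal sketch in plain Python; the function name `minkowski_distance` and the sample points are illustrative assumptions, not part of any particular library.

```python
def minkowski_distance(x, y, p):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of components")
    # Sum the p-th powers of the absolute component-wise differences,
    # then take the p-th root of the total.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


# Illustrative points: the endpoints of the hypotenuse of a 3-4-5 triangle.
origin, corner = (0, 0), (3, 4)
print(minkowski_distance(origin, corner, 1))  # 7.0 (Manhattan)
print(minkowski_distance(origin, corner, 2))  # 5.0 (Euclidean)
```

Passing `p` as an argument keeps one function usable for every order, at the cost of a floating-point root even when `p` is 1.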

For $p \geq 1$, the Minkowski distance is a metric as a result of the Minkowski inequality.[1] When $p < 1$, the distance between $(0,0)$ and $(1,1)$ is $2^{1/p} > 2$, but the point $(0,1)$ is at a distance 1 from both of these points. Since this violates the triangle inequality, for $p < 1$ it is not a metric. However, a metric can be obtained for these values by simply removing the exponent of $1/p$. The resulting metric is also an F-norm.
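The counterexample can be checked numerically. The sketch below assumes $p = 0.5$ as a representative value below 1; the helper name `minkowski_form` and its `take_root` flag are hypothetical and exist only for this illustration.

```python
def minkowski_form(x, y, p, take_root=True):
    """Sum of |x_i - y_i|^p, optionally raised to the power 1/p."""
    total = sum(abs(a - b) ** p for a, b in zip(x, y))
    return total ** (1 / p) if take_root else total


a, b, c = (0, 0), (0, 1), (1, 1)
p = 0.5  # assumed example value with p < 1

# With the 1/p exponent: going directly from a to c is longer than
# the detour through b, so the triangle inequality fails.
direct = minkowski_form(a, c, p)                            # 2 ** (1 / 0.5) = 4.0
detour = minkowski_form(a, b, p) + minkowski_form(b, c, p)  # 1.0 + 1.0 = 2.0
print(direct > detour)  # True

# Without the 1/p exponent the same three points satisfy it.
direct_no_root = minkowski_form(a, c, p, take_root=False)
detour_no_root = (minkowski_form(a, b, p, take_root=False)
                  + minkowski_form(b, c, p, take_root=False))
print(direct_no_root <= detour_no_root)  # True
```

A single triple of points can disprove the triangle inequality but cannot prove it, so the second check only illustrates the repaired version on this example.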

Minkowski distance is typically used with $p$ being 1 or 2, which correspond to the Manhattan distance and the Euclidean distance, respectively.[2] In the limiting case of $p$ reaching infinity, we obtain the Chebyshev distance: $$\lim_{p \to \infty} \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{\frac{1}{p}} = \max_{i=1}^{n} |x_i - y_i|.$$

Similarly, for $p$ reaching negative infinity, we have: $$\lim_{p \to -\infty} \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{\frac{1}{p}} = \min_{i=1}^{n} |x_i - y_i|.$$
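Both limits can be observed numerically by evaluating the formula for increasingly large positive and negative orders. This is a rough sketch: the helper name `minkowski_form` and the sample differences are assumptions, and the differences are taken to be nonzero because a zero difference raised to a negative power is undefined.

```python
def minkowski_form(diffs, p):
    """Apply the Minkowski formula to a list of nonzero absolute differences."""
    return sum(d ** p for d in diffs) ** (1 / p)


diffs = [3, 4]  # assumed |x_i - y_i| values, as in the 3-4-5 triangle

# Increasing positive orders: the values decrease toward max(diffs) = 4.
for p in (1, 2, 10, 100):
    print(p, minkowski_form(diffs, p))

# Increasingly negative orders: the values increase toward min(diffs) = 3.
for p in (-1, -2, -10, -100):
    print(p, minkowski_form(diffs, p))
```

For very large $|p|$ the intermediate powers can overflow or underflow in floating point, so practical code usually computes the maximum or minimum directly instead of relying on the limit.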

The Minkowski distance can also be viewed as a multiple of the power mean of the component-wise differences between $X$ and $Y$.
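The multiple in question is $n^{1/p}$: the power mean divides the sum by $n$ before taking the root, so multiplying by $n^{1/p}$ recovers the distance. The check below is a small sketch with assumed sample vectors; `power_mean` and `minkowski_distance` are illustrative helper names.

```python
import math


def power_mean(values, p):
    """Power (generalized) mean of order p of non-negative values."""
    return (sum(v ** p for v in values) / len(values)) ** (1 / p)


def minkowski_distance(x, y, p):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


# Assumed sample data with n = 3 components.
x, y, p = (1.0, 5.0, 2.0), (4.0, 1.0, 2.5), 3
diffs = [abs(a - b) for a, b in zip(x, y)]  # [3.0, 4.0, 0.5]

distance = minkowski_distance(x, y, p)
scaled_mean = len(diffs) ** (1 / p) * power_mean(diffs, p)
print(math.isclose(distance, scaled_mean))  # True
```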

The following figure shows unit circles (the level set of the distance function where all points are at the unit distance from the center) with various values of p:

Unit circles using different Minkowski distance metrics.

Applications

The Minkowski metric is useful in machine learning and AI.[citation needed] Many popular machine learning algorithms use a specific distance metric, such as those described above, to measure how similar two data points are. Which metric is appropriate depends on the nature of the data being analyzed. The Minkowski metric is most useful for numerical datasets, where each data point is a vector and similarity is judged by the size of the differences between corresponding components.
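As a toy illustration of why the choice of order matters, the sketch below runs a one-nearest-neighbour lookup under two different values of $p$. The points, the labels "A" and "B", and the function names are all invented for this example and do not come from any library.

```python
def minkowski_distance(x, y, p):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def nearest_neighbor(query, points, labels, p):
    """Return the label of the point closest to query under order p."""
    best = min(range(len(points)),
               key=lambda i: minkowski_distance(query, points[i], p))
    return labels[best]


# Hypothetical labelled points.
points = [(3.0, 3.0), (0.0, 5.0)]
labels = ["A", "B"]
query = (0.0, 0.0)

# Manhattan distance: A is 6 away and B is 5 away, so B is nearest.
print(nearest_neighbor(query, points, labels, 1))  # B
# Euclidean distance: A is about 4.24 away and B is 5 away, so A is nearest.
print(nearest_neighbor(query, points, labels, 2))  # A
```

The same query is assigned to a different neighbour depending on $p$, which is why the order is usually treated as a parameter to be chosen for the data at hand.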

See also

  • Generalized mean – N-th root of the arithmetic mean of the given numbers raised to the power n
  • Lp space – Function spaces generalizing finite-dimensional p norm spaces
  • Norm (mathematics) – Length in a vector space

References

  1. Şuhubi, Erdoğan S. (2003), "Chapter V: Metric Spaces", Functional Analysis, Springer Netherlands, pp. 261–356, doi:10.1007/978-94-017-0141-9_5, ISBN 9789401701419 
  2. Zezula, Pavel; Amato, Giuseppe; Dohnal, Vlastislav; Batko, Michal (2006), "Chapter 1, Foundations of Metric Space Searching, Section 3.1, Minkowski Distances", Similarity Search: The Metric Space Approach, Advances in Database Systems, Springer, p. 10, doi:10.1007/0-387-29151-2, ISBN 9780387291512 
