Nearest neighbour distribution

In probability and statistics, a nearest neighbor function, nearest neighbor distance distribution,[1] nearest-neighbor distribution function[2] or nearest neighbor distribution[3] is a mathematical function defined in relation to mathematical objects known as point processes, which are often used as mathematical models of physical phenomena representable as randomly positioned points in time, space or both.[4][5] More specifically, a nearest neighbor function is defined with respect to some point of the point process as the probability distribution of the distance from this point to its nearest neighboring point in the same point process. It therefore describes the probability that another point exists within some distance of a given point. A nearest neighbor function can be contrasted with a spherical contact distribution function, which is not defined in reference to some initial point of the process but rather as the probability distribution of the radius of a sphere when it first encounters or makes contact with a point of the point process.

Nearest neighbor functions are used in the study of point processes[1][5][6] as well as in the related fields of stochastic geometry[4] and spatial statistics,[1][7] which are applied in various scientific and engineering disciplines such as biology, geology, physics, and telecommunications.[4][5][8][9]

Point process notation

Point processes are mathematical objects that are defined on some underlying mathematical space. Since these processes are often used to represent collections of points randomly scattered in space, time or both, the underlying space is usually d-dimensional Euclidean space denoted here by [math]\displaystyle{ \textstyle \textbf{R}^{ d} }[/math], but they can be defined on more abstract mathematical spaces.[6]

Point processes have a number of interpretations, which is reflected by the various types of point process notation.[4][9] For example, if a point [math]\displaystyle{ \textstyle x }[/math] belongs to or is a member of a point process, denoted by [math]\displaystyle{ \textstyle {N} }[/math], then this can be written as:[4]

[math]\displaystyle{ \textstyle x\in {N}, }[/math]

and represents the point process being interpreted as a random set. Alternatively, the number of points of [math]\displaystyle{ \textstyle {N} }[/math] located in some Borel set [math]\displaystyle{ \textstyle B }[/math] is often written as:[8][4][7]

[math]\displaystyle{ \textstyle {N}(B), }[/math]

which reflects a random measure interpretation for point processes. These two notations are often used in parallel or interchangeably.[4][7][8]
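
The two notations can be illustrated on a single fixed realization of a point process. The following is a minimal sketch in Python, assuming a planar pattern stored as a set of coordinate tuples and a Borel set B given by a half-open rectangle; the helper name `count_in_box` is hypothetical and introduced only for illustration.

```python
def count_in_box(points, lower, upper):
    """Return N(B) for the half-open box B = [lower[0], upper[0]) x [lower[1], upper[1])."""
    return sum(
        all(lo <= c < hi for c, lo, hi in zip(p, lower, upper))
        for p in points
    )

# One fixed realization of a point pattern in the unit square (illustrative data).
N = {(0.1, 0.2), (0.4, 0.9), (0.7, 0.3), (0.8, 0.8)}

x = (0.4, 0.9)
print(x in N)                                   # random-set view, x in N: True
print(count_in_box(N, (0.0, 0.0), (0.5, 1.0)))  # random-measure view, N(B): 2
```

The same object `N` is read once as a set (membership) and once as a counting measure (number of points in B), which is why the two notations can be used side by side.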

Definitions

Nearest neighbor function

The nearest neighbor function, as opposed to the spherical contact distribution function, is defined in relation to some point of a point process already existing in some region of space. More precisely, for some point in the point process [math]\displaystyle{ \textstyle {N} }[/math], the nearest neighbor function is the probability distribution of the distance from that point to the nearest or closest neighboring point.

To define this function for a point located in [math]\displaystyle{ \textstyle \textbf{R}^d }[/math] at, for example, the origin [math]\displaystyle{ \textstyle o }[/math], consider the [math]\displaystyle{ \textstyle d }[/math]-dimensional ball [math]\displaystyle{ \textstyle b(o,r) }[/math] of radius [math]\displaystyle{ \textstyle r }[/math] centered at [math]\displaystyle{ \textstyle o }[/math]. Given that a point of [math]\displaystyle{ \textstyle {N} }[/math] exists at [math]\displaystyle{ \textstyle o }[/math], the nearest neighbor function is defined as:[4]

[math]\displaystyle{ D_o(r)=1-P({N}(b(o,r))=1\mid o), }[/math]

where [math]\displaystyle{ \textstyle P({N}(b(o,r))=1\mid o) }[/math] denotes the conditional probability that there is exactly one point of [math]\displaystyle{ \textstyle {N} }[/math] located in [math]\displaystyle{ \textstyle b(o,r) }[/math] (namely the point at [math]\displaystyle{ \textstyle o }[/math] itself, so that no neighboring point lies within distance [math]\displaystyle{ \textstyle r }[/math]), given that there is a point of [math]\displaystyle{ \textstyle {N} }[/math] located at [math]\displaystyle{ \textstyle o }[/math].

The reference point need not be at the origin, and can be located at an arbitrary point [math]\displaystyle{ \textstyle x\in\textbf{R}^d }[/math]. Given that a point of [math]\displaystyle{ \textstyle {N} }[/math] exists at [math]\displaystyle{ \textstyle x }[/math], the nearest neighbor function is defined as:

[math]\displaystyle{ D_x(r)=1-P({N}(b(x,r))=1\mid x). }[/math]
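
For an observed point pattern, the definition suggests a simple empirical counterpart: the fraction of points whose nearest neighbor lies within distance r. The following is a minimal sketch in Python, assuming the pattern comes from a stationary process (so that averaging over all points is meaningful) and ignoring edge effects, which practical estimators normally correct for. The function names `nearest_neighbor_distances` and `empirical_D` are hypothetical and not taken from any library.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each point to its nearest other point (brute force, O(n^2))."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

def empirical_D(points, r):
    """Fraction of points that have another point within distance r (no edge correction)."""
    nn = nearest_neighbor_distances(points)
    return sum(d <= r for d in nn) / len(nn)

# Illustrative pattern on a line embedded in the plane: nearest-neighbor distances are 1, 1, 3.
pattern = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
print(nearest_neighbor_distances(pattern))
print(empirical_D(pattern, 0.5), empirical_D(pattern, 1.0), empirical_D(pattern, 3.0))
```

The brute-force search is chosen for readability; for large patterns a spatial index would usually replace the inner loop.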

Examples

Closed-form mathematical expressions for the nearest neighbor distribution are known for only a few point processes.

Poisson point process

For a Poisson point process [math]\displaystyle{ \textstyle {N} }[/math] on [math]\displaystyle{ \textstyle \textbf{R}^d }[/math] with intensity measure [math]\displaystyle{ \textstyle \Lambda }[/math], the nearest neighbor function is:

[math]\displaystyle{ D_x(r)=1-e^{-\Lambda(b(x,r))}, }[/math]

which for the homogeneous case becomes

[math]\displaystyle{ D_x(r)=1-e^{-\lambda |b(x,r)|}, }[/math]

where [math]\displaystyle{ \textstyle |b(x,r)| }[/math] denotes the volume (or more specifically, the Lebesgue measure) of the (hyper) ball of radius [math]\displaystyle{ \textstyle r }[/math]. In the plane [math]\displaystyle{ \textstyle \textbf{R}^2 }[/math] with the reference point located at the origin, this becomes

[math]\displaystyle{ D_o(r)=1-e^{-\lambda \pi r^2}. }[/math]
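
The homogeneous formula is straightforward to evaluate numerically in any dimension. The following is a minimal sketch in Python, assuming the standard expression for the volume of a d-dimensional ball, |b(x,r)| = π^(d/2) r^d / Γ(d/2 + 1), which is not stated in this article. The function names `ball_volume` and `poisson_nn_cdf` are hypothetical.

```python
import math

def ball_volume(r, d):
    """Lebesgue measure of a d-dimensional ball of radius r."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d

def poisson_nn_cdf(r, lam, d=2):
    """Nearest neighbor function 1 - exp(-lam * |b(x, r)|) of a homogeneous Poisson process."""
    return 1.0 - math.exp(-lam * ball_volume(r, d))

lam = 2.0  # illustrative intensity (mean number of points per unit volume)
for r in (0.1, 0.5, 1.0):
    # In the plane the general formula reduces to 1 - exp(-lam * pi * r^2).
    planar = 1.0 - math.exp(-lam * math.pi * r ** 2)
    print(r, poisson_nn_cdf(r, lam, d=2), planar)
```

Because the process is homogeneous, the result does not depend on the location of the reference point, only on r, the intensity and the dimension.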

Relationship to other functions

Spherical contact distribution function

In general, the spherical contact distribution function and the corresponding nearest neighbor function are not equal. However, these two functions are identical for Poisson point processes.[4] This is a consequence of a characteristic property of Poisson processes and their Palm distributions, which forms part of the result known as the Slivnyak–Mecke theorem[8] or Slivnyak's theorem.[1]
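
The equality for Poisson point processes can be written out in a few steps. The following is a sketch in the notation of this article, assuming the spherical contact distribution function at the origin is given by H_s(r) = 1 − P(N(b(o,r)) = 0), a definition that is not stated in this article, and using Slivnyak's theorem in the form "conditional on a point at o, the remaining points again form a Poisson process with the same intensity measure".

```latex
\begin{align*}
H_s(r) &= 1 - P\bigl(N(b(o,r)) = 0\bigr) = 1 - e^{-\Lambda(b(o,r))}, \\
D_o(r) &= 1 - P\bigl(N(b(o,r)) = 1 \mid o\bigr) \\
       &= 1 - P\bigl((N \setminus \{o\})(b(o,r)) = 0 \mid o\bigr) \\
       &= 1 - P\bigl(N(b(o,r)) = 0\bigr) \qquad \text{(Slivnyak's theorem)} \\
       &= H_s(r).
\end{align*}
```

The second line of the computation for D_o(r) only removes the conditioning point itself from the count; the third line is where the Poisson assumption enters, and it is this step that fails for general point processes.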

J-function

The fact that the spherical contact distribution function [math]\displaystyle{ \textstyle H_s(r) }[/math] and the nearest neighbor function [math]\displaystyle{ \textstyle D_o(r) }[/math] are identical for the Poisson point process can be used to statistically test whether point process data appears to be that of a Poisson point process. For example, in spatial statistics the J-function is defined for all [math]\displaystyle{ \textstyle r\geq 0 }[/math] as:[4]

[math]\displaystyle{ J(r)=\frac{1-D_o(r)}{1-H_s(r)} }[/math]

For a Poisson point process, the J-function is simply J(r) = 1, which is why it is used as a non-parametric test for whether data behaves as though it were from a Poisson process. It is, however, thought possible to construct non-Poisson point processes for which J(r) = 1,[10] but such counterexamples are viewed as somewhat 'artificial' by some, and similar counterexamples exist for other statistical tests.[11]

More generally, the J-function serves as one way (others include using factorial moment measures[1]) to measure the interaction between points in a point process.[4]
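
A naive plug-in estimate of the J-function can be obtained by estimating both functions from data and taking the ratio. The following is a minimal sketch in Python, under several assumptions: the pattern is treated as stationary, edge effects are ignored, and the spherical contact distribution is estimated from distances between user-supplied test locations (for example, a regular grid) and their nearest data point. The function names `ecdf` and `estimate_J` are hypothetical and do not correspond to any library API.

```python
import math

def ecdf(values, r):
    """Empirical distribution function of `values` evaluated at r."""
    return sum(v <= r for v in values) / len(values)

def estimate_J(points, test_locations, r):
    """Naive estimate of J(r) = (1 - D(r)) / (1 - H_s(r)); returns None when H_s(r) = 1."""
    # Distances from each data point to its nearest other data point (estimates D).
    nn = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    # Distances from each test location to its nearest data point (estimates H_s).
    contact = [min(math.dist(u, p) for p in points) for u in test_locations]
    d_hat, h_hat = ecdf(nn, r), ecdf(contact, r)
    if h_hat >= 1.0:
        return None  # the ratio is undefined once the denominator vanishes
    return (1.0 - d_hat) / (1.0 - h_hat)

# Illustrative data: two points one unit apart and two test locations.
points = [(0.0, 0.0), (1.0, 0.0)]
test_locations = [(0.5, 0.0), (3.0, 0.0)]
for r in (0.4, 0.6, 2.0):
    print(r, estimate_J(points, test_locations, r))
```

Values of the estimate near 1 are consistent with Poisson behavior; departures from 1 indicate interaction between points, subject to the sampling variability and edge effects that this sketch does not address.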

References

  1. A. Baddeley, I. Bárány, and R. Schneider. Spatial point processes and their applications. Stochastic Geometry: Lectures given at the CIME Summer School held in Martina Franca, Italy, September 13–18, 2004, pages 1–75, 2007.
  2. Torquato, S, Lu, B, Rubinstein, J (1990). "Nearest-neighbor distribution function for systems on interacting particles". Journal of Physics A: Mathematical and General 23 (3): L103–L107. doi:10.1088/0305-4470/23/3/005. Bibcode: 1990JPhA...23L.103T.
  3. Doguwa, Sani I (1992). "On the estimation of the point-object nearest neighbor distribution F (y) for point processes". Journal of Statistical Computation and Simulation 41 (1–2): 95–107. doi:10.1080/00949659208811393.
  4. D. Stoyan, W. S. Kendall, J. Mecke, and L. Ruschendorf. Stochastic geometry and its applications, volume 2. Wiley Chichester, 1995.
  5. D. J. Daley and D. Vere-Jones. An introduction to the theory of point processes. Vol. I. Probability and its Applications (New York). Springer, New York, second edition, 2003.
  6. D. J. Daley and D. Vere-Jones. An introduction to the theory of point processes. Vol. II. Probability and its Applications (New York). Springer, New York, second edition, 2008.
  7. J. Moller and R. P. Waagepetersen. Statistical inference and simulation for spatial point processes. CRC Press, 2003.
  8. F. Baccelli and B. Błaszczyszyn. Stochastic Geometry and Wireless Networks, Volume I – Theory, volume 3, No 3-4 of Foundations and Trends in Networking. NoW Publishers, 2009.
  9. F. Baccelli and B. Błaszczyszyn. Stochastic Geometry and Wireless Networks, Volume II – Applications, volume 4, No 1-2 of Foundations and Trends in Networking. NoW Publishers, 2009.
  10. Bedford, T, Van den Berg, J (1997). "A remark on the Van Lieshout and Baddeley J-function for point processes". Advances in Applied Probability 29 (1): 19–25. doi:10.2307/1427858. https://ir.cwi.nl/pub/1373. 
  11. Foxall, Rob, Baddeley, Adrian (2002). "Nonparametric measures of association between a spatial point process and a random set, with geological applications". Journal of the Royal Statistical Society, Series C 51 (2): 165–182. doi:10.1111/1467-9876.00261.