Variogram: Difference between revisions

From HandWiki
imported>Steve Marsio
linkage
 
OrgMain (talk | contribs)
linkage
 
Line 1: Line 1:
{{Short description|Spatial statistics function}}
A '''variogram''' is the graphical representation of the [[Spatial dependence|spatial dependence]] between pairs of data points, commonly used in [[Earth:Geostatistics|geostatistics]] and [[Spatial statistics|spatial statistics]]. The term is sometimes used synonymously with '''semivariogram''', but the latter is also used by some authors to refer to half of a variogram, and should therefore be avoided.<ref name=":1">{{Cite journal |last1=Bachmaier |first1=Martin |last2=Backes |first2=Matthias |date=2011-08-30 |title=Variogram or Semivariogram? Variance or Semivariance? Allan Variance or Introducing a New Term? |url=https://www.researchgate.net/publication/227307628_Variogram_or_Semivariogram_Variance_or_Semivariance_Allan_Variance_or_Introducing_a_New_Term |journal=Mathematical Geosciences |language=en |volume=43 |issue=6 |pages=735–740 |doi=10.1007/s11004-011-9348-3 |issn=1874-8961}}</ref> Likewise, the term ''semivariance'' can be misleading, since the values shown in a variogram are entire [[Variance|variances]] of observations at a given spatial separation (lag).<ref name=":1" />
In [[Spatial statistics|spatial statistics]] the theoretical '''variogram''', denoted  <math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)</math>, is a function describing the degree of [[Spatial dependence|spatial dependence]] of a spatial [[Random field|random field]] or [[Stochastic process|stochastic process]] <math>Z(\mathbf{s})</math>. The '''semivariogram''' <math>\gamma(\mathbf{s}_1,\mathbf{s}_2)</math> is half the variogram.  


[[File:Schematic variogram.svg|thumb|Schematic of a variogram. The points represent the empirical variogram computed from the measured data, and the curve represents the fitted model function. The range is the lag distance at which the sill is reached, the sill is the plateau value, and the nugget is the height of the jump at the origin (nugget effect).]]
The variogram is the key function in geostatistics as it will be used to fit a model of the temporal/[[Spatial correlation|spatial correlation]] of the observed phenomenon. One is thus making a distinction between the ''experimental variogram'' that is a visualization of a possible spatial/temporal correlation and the ''variogram model'' that is further used to define the weights of the [[Kriging|kriging]] function. Note that the experimental variogram is an empirical estimate of the [[Covariance|covariance]] of a [[Gaussian process]]. As such, it may not be positive definite and hence not directly usable in kriging, without constraints or further processing. This explains why only a limited number of variogram models are used: most commonly, the linear, the spherical, the Gaussian, and the exponential models.


In the case of a concrete example from the field of [[Earth:Gold mining|gold mining]], a variogram will give a measure of how much two samples taken from the mining area will vary in gold percentage depending on the distance between those samples. Samples taken far apart will vary more than samples taken close to each other.
For example, in [[Earth:Gold mining|gold mining]], a variogram will give a measure of how much two samples taken from the mining area will vary in gold percentage depending on the distance between those samples. Samples taken far apart will vary more than samples taken close to each other.


==Definition==
==Definition==
{{redirect|Semivariance|the measure of downside risk|Variance#Semivariance}}
{{anchor|Semivariance}}
{{anchor|Semivariance}}


<!-- Is h a real number or a vector? Sounds like a real number, but if so, then we are missing another integral in the expression below, because M + h is not defined. It should be integrating over a sphere of radius h. -->The '''semivariogram''' <math>\gamma(h)</math> was first defined by Matheron (1963) as half the average squared difference between the values at points (<math>\mathbf{s}_1</math> and <math>\mathbf{s}_2</math>) separated at distance <math>h</math>.<ref name="Matheron1963">{{cite journal|last1=Matheron|first1=Georges|title=Principles of geostatistics|journal=Economic Geology|volume=58|issue=8|year=1963|pages=1246–1266|issn=1554-0774|doi=10.2113/gsecongeo.58.8.1246}}</ref><ref>{{cite web |url=http://www.faculty.washington.edu/edford/Variogram.pdf |title=The Empirical Variogram |last=Ford |first=David |website=faculty.washington.edu/edford |access-date=31 October 2017 }}</ref> Formally
<!-- Is h a real number or a vector? Sounds like a real number, but if so, then we are missing another integral in the expression below, because M + h is not defined. It should be integrating over a sphere of radius h. -->The semivariogram <math>\gamma(h)</math> was first defined by Matheron (1963) as half the average squared difference between a function and a translated copy of the function separated at distance <math>h</math>.<ref name="Matheron1963">{{cite journal|last1=Matheron|first1=Georges|title=Principles of geostatistics|journal=Economic Geology|volume=58|issue=8|year=1963|pages=1246–1266|issn=1554-0774|doi=10.2113/gsecongeo.58.8.1246 |bibcode=1963EcGeo..58.1246M }}</ref><ref>{{cite web |url=http://www.faculty.washington.edu/edford/Variogram.pdf |title=The Empirical Variogram |last=Ford |first=David |website=faculty.washington.edu/edford |access-date=31 October 2017 }}</ref> Formally


:<math>\gamma(h)=\frac{1}{2V}\iiint_V \left[f(M+h) - f(M) \right]^2dV, </math>
:<math>\gamma(h)=\frac{1}{2}\iiint_V \left[f(M+h) - f(M) \right]^2dM, </math>


where <math>M</math> is a point in the geometric field <math>V</math>, and <math>f(M)</math> is the value at that point. The triple integral is over 3 dimensions. <math>h</math> is the separation distance (e.g., in meters or km) of interest.  
where <math>M</math> is a point in the geometric field <math>V</math>, and <math>f(M)</math> is the value at that point. The triple integral is over 3 dimensions. <math>h</math> is the separation distance (e.g., in meters or km) of interest.  
Line 17: Line 17:
To obtain the semivariogram for a given <math>h</math>, all pairs of points at that exact distance would be sampled. In practice it is impossible to sample everywhere, so the [[Variogram#Empirical variogram|empirical variogram]] is used instead.
To obtain the semivariogram for a given <math>h</math>, all pairs of points at that exact distance would be sampled. In practice it is impossible to sample everywhere, so the [[Variogram#Empirical variogram|empirical variogram]] is used instead.


The variogram is twice the semivariogram and can be defined, equivalently, as the [[Variance|variance]] of the difference between field values at two locations (<math>\mathbf{s}_1</math> and <math>\mathbf{s}_2</math>, note change of notation from <math>M</math> to <math>\mathbf{s}</math> and <math>f</math> to <math>Z</math>) across realizations of the field (Cressie 1993):
The variogram is twice the semivariogram and can be defined, differently, as the [[Variance|variance]] of the difference between field values at two locations (<math>\mathbf{s}_1</math> and <math>\mathbf{s}_2</math>, note change of notation from <math>M</math> to <math>\mathbf{s}</math> and <math>f</math> to <math>Z</math>) across realizations of the field (Cressie 1993):


:<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=\text{var}\left(Z(\mathbf{s}_1) - Z(\mathbf{s}_2)\right) = E\left[((Z(\mathbf{s}_1)-\mu(\mathbf{s}_1))-(Z(\mathbf{s}_2) - \mu(\mathbf{s}_2)))^2\right]. </math>
:<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=\text{var}\left(Z(\mathbf{s}_1) - Z(\mathbf{s}_2)\right) = E\left[((Z(\mathbf{s}_1)-Z(\mathbf{s}_2)) - E[Z(\mathbf{s}_1) - Z(\mathbf{s}_2)] )^2\right]. </math>


If the spatial random field has constant mean <math>\mu</math>, this is equivalent to the expectation of the squared increment of the values between locations <math>\mathbf{s}_1</math> and <math>\mathbf{s}_2</math> (Wackernagel 2003) (where <math>\mathbf{s}_1</math> and <math>\mathbf{s}_2</math> are points in space and possibly time):
If the spatial [[Random field|random field]] has constant mean <math>\mu</math>, this is equivalent to the expectation of the squared increment of the values between locations <math>\mathbf{s}_1</math> and <math>\mathbf{s}_2</math> (Wackernagel 2003) (where <math>\mathbf{s}_1</math> and <math>\mathbf{s}_2</math> are points in space and possibly time):


:<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)^2\right] . </math>
:<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)^2\right] . </math>
Line 33: Line 33:
:<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=\gamma_i(h).</math>
:<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=\gamma_i(h).</math>


The indexes <math>i</math> or <math>s</math> are typically not written. The terms are used for all three forms of the function. Moreover, the term "variogram" is sometimes used to denote the semivariogram, and the symbol <math>\gamma</math> is sometimes used for the variogram, which brings some confusion.<ref>{{cite journal | last1=Bachmaier | first1=Martin | last2=Backes | first2=Matthias | title=Variogram or semivariogram? Understanding the variances in a variogram | journal=Precision Agriculture | publisher=Springer Science and Business Media LLC | volume=9 | issue=3 | date=2008-02-24 | issn=1385-2256 | doi=10.1007/s11119-008-9056-2 | pages=173–175}}</ref>
The indexes <math>i</math> or <math>s</math> are typically not written. The terms are used for all three forms of the function. Moreover, the term "variogram" is sometimes used to denote the semivariogram, and the symbol <math>\gamma</math> is sometimes used for the variogram, which brings some confusion.<ref>{{cite journal | last1=Bachmaier | first1=Martin | last2=Backes | first2=Matthias | title=Variogram or semivariogram? Understanding the variances in a variogram | journal=Precision Agriculture | publisher=Springer Science and Business Media LLC | volume=9 | issue=3 | date=2008-02-24 | issn=1385-2256 | doi=10.1007/s11119-008-9056-2 | pages=173–175 | bibcode=2008PrAgr...9..173B }}</ref>


==Properties==
==Properties==
According to (Cressie 1993, Chiles and Delfiner 1999, Wackernagel 2003) the theoretical variogram has the following properties:
According to (Cressie 1993, Chiles and Delfiner 1999, Wackernagel 2003) the theoretical variogram has the following properties:
* The semivariogram is nonnegative <math>\gamma(\mathbf{s}_1,\mathbf{s}_2)\geq 0</math>, since it is the expectation of a square.
* The semivariogram is nonnegative <math>\gamma(\mathbf{s}_1,\mathbf{s}_2)\geq 0</math>, since it is the expectation of a square.
* The semivariogram <math>\gamma(\mathbf{s}_1,\mathbf{s}_1)=\gamma_i(0)=E\left((Z(\mathbf{s}_1)-Z(\mathbf{s}_1))^2\right)=0</math> at distance 0 is always 0, since <math>Z(\mathbf{s}_1)-Z(\mathbf{s}_1)=0</math>.
* The semivariogram <math>\gamma(\mathbf{s}_1,\mathbf{s}_1)=\gamma_i(0)=E\left((Z(\mathbf{s}_1)-Z(\mathbf{s}_1))^2\right)=0</math> at distance 0 is always 0, since <math>Z(\mathbf{s}_1)-Z(\mathbf{s}_1)=0</math>.
* A function is a semivariogram if and only if it is a conditionally negative definite function, i.e. for all weights <math>w_1,\ldots,w_N</math> subject to <math>\sum_{i=1}^N w_i=0</math> and locations <math>s_1,\ldots,s_N</math> it holds:
* A function is a semivariogram if and only if it is a conditionally negative definite function, i.e. for all weights <math>w_1,\ldots,w_N</math> subject to <math>\sum_{i=1}^N w_i=0</math> and locations <math>s_1,\ldots,s_N</math> it holds:
::<math>\sum_{i=1}^N\sum_{j=1}^N w_{i}\gamma(\mathbf{s}_i,\mathbf{s}_j)w_j \leq 0</math>
:<math>\sum_{i=1}^N\sum_{j=1}^N w_{i}\gamma(\mathbf{s}_i,\mathbf{s}_j)w_j \leq 0 ,</math>
: which corresponds to the fact that the variance <math>var(X)</math> of <math>X=\sum_{i=1}^N w_i Z(x_i)</math> is given by the negative of this double sum and must be nonnegative.{{disputed inline|reason=This statement appears to be false|date=July 2016}}
which corresponds to the fact that the variance <math>\operatorname{var}(X)</math> of <math>X=\sum_{i=1}^N w_i Z(x_i)</math> is given by the negative of this double sum and must be nonnegative.{{disputed inline|reason=This statement appears to be false|date=July 2016}}
* If the [[Covariance function|covariance function]] of a stationary process exists it is related to variogram by<blockquote><math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=C(\mathbf{s}_1,\mathbf{s}_1)+C(\mathbf{s}_2,\mathbf{s}_2)-2C(\mathbf{s}_1,\mathbf{s}_2)</math></blockquote>
* If the [[Covariance function|covariance function]] ''C'' of a stationary process exists, it is related to variogram by
* If a stationary random field has no spatial dependence (i.e. <math>C(h)=0</math> if <math>h\not= 0</math>), the semivariogram is the constant <math>var(Z(\mathbf{s}))</math> everywhere except at the origin, where it is zero.
:<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=C(\mathbf{s}_1,\mathbf{s}_1)+C(\mathbf{s}_2,\mathbf{s}_2)-2C(\mathbf{s}_1,\mathbf{s}_2)</math>
* <math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[|Z(\mathbf{s}_1)-Z(\mathbf{s}_2)|^2\right]=\gamma(\mathbf{s}_2,\mathbf{s}_1)</math> is a symmetric function.
* If the [[Variance|variance]] ''V'' and [[Correlation function|correlation function]] ''c'' of a stationary process exist, they are related to semivariogram by
* Consequently, <math>\gamma_s(h)=\gamma_s(-h)</math> is an even function.
:<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=V(1 - c(\mathbf{s}_1,\mathbf{s}_2))</math>
* If the random field is [[Stationary process|stationary]] and ergodic, the <math>\lim_{h\to \infty} \gamma_s(h) = var(Z(\mathbf{s}))</math> corresponds to the variance of the field. The limit of the semivariogram is also called its ''sill''.
* Conversely, the covariance function ''C'' of a stationary process can be obtained from the semivariogram and variance as
:<math>C(\mathbf{s}_1,\mathbf{s}_2)=V-\gamma(\mathbf{s}_1,\mathbf{s}_2)</math>
* If a stationary random field has no spatial dependence (i.e. <math>C(h)=0</math> if <math>h\not= 0</math>), the semivariogram is the constant <math>\operatorname{var}(Z(\mathbf{s}))</math> everywhere except at the origin, where it is zero.
* The semivariogram is a [[Symmetric function|symmetric function]], <math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[|Z(\mathbf{s}_1)-Z(\mathbf{s}_2)|^2\right]=\gamma(\mathbf{s}_2,\mathbf{s}_1)</math>.
* Consequently, the isotropic semivariogram is an [[Even function|even function]] <math>\gamma_s(h)=\gamma_s(-h)</math>.
* If the random field is [[Stationary process|stationary]] and ergodic, the <math>\lim_{h\to \infty} \gamma_s(h) = \operatorname{var}(Z(\mathbf{s}))</math> corresponds to the variance of the field. The limit of the semivariogram with increasing distance is also called its ''sill''.
* As a consequence, the semivariogram can be discontinuous only at the origin. The height of the jump at the origin is sometimes referred to as the ''nugget'' or nugget effect.
* As a consequence, the semivariogram can be discontinuous only at the origin. The height of the jump at the origin is sometimes referred to as the ''nugget'' or nugget effect.
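
The covariance relations in the list above follow from expanding the variance of a difference. As a short worked step, using only the definitions already given and writing <math>C(\mathbf{s}_1,\mathbf{s}_2)</math> for the covariance of <math>Z(\mathbf{s}_1)</math> and <math>Z(\mathbf{s}_2)</math>:

:<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=\operatorname{var}\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)=\operatorname{var}(Z(\mathbf{s}_1))+\operatorname{var}(Z(\mathbf{s}_2))-2\operatorname{cov}\left(Z(\mathbf{s}_1),Z(\mathbf{s}_2)\right)=C(\mathbf{s}_1,\mathbf{s}_1)+C(\mathbf{s}_2,\mathbf{s}_2)-2C(\mathbf{s}_1,\mathbf{s}_2).</math>

For a stationary process the variance is the same at every location, <math>C(\mathbf{s},\mathbf{s})=V</math>, so the right-hand side reduces to <math>2V-2C(\mathbf{s}_1,\mathbf{s}_2)</math>, and with the correlation function <math>c=C/V</math>:

:<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=V-C(\mathbf{s}_1,\mathbf{s}_2)=V\left(1-c(\mathbf{s}_1,\mathbf{s}_2)\right).</math>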


Line 54: Line 59:


* ''nugget'' <math>n</math>: The height of the jump of the semivariogram at the discontinuity at the origin.  
* ''nugget'' <math>n</math>: The height of the jump of the semivariogram at the discontinuity at the origin.  
* ''sill'' <math>s</math>: Limit of the variogram as the lag distance tends to infinity.
* ''sill'' <math>s</math>: Limit of the variogram as the lag distance tends to infinity.
* ''range'' <math>r</math>: The distance at which the difference of the variogram from the sill becomes negligible. In models with a fixed sill, it is the distance at which the sill is first reached; for models with an asymptotic sill, it is conventionally taken to be the distance at which the semivariance first reaches 95% of the sill.
* ''range'' <math>r</math>: The distance at which the difference of the variogram from the sill becomes negligible. In models with a fixed sill, it is the distance at which the sill is first reached; for models with an asymptotic sill, it is conventionally taken to be the distance at which the semivariance first reaches 95% of the sill.


==Empirical variogram==
==Empirical variogram==


Generally, an '''empirical variogram''' is needed for measured data, because sample information <math>Z</math> is not available for every location. The sample information for example could be concentration of iron in soil samples, or pixel intensity on a camera. Each piece of sample information has coordinates <math>\mathbf{s}=(x,y)</math> for a 2D sample space where <math>x</math> and <math>y</math> are geographical coordinates. In the case of the iron in soil, the sample space could be 3 dimensional. If there is temporal variability as well (e.g., phosphorus content in a lake) then <math>\mathbf{s}</math> could be a 4 dimensional vector <math>(x,y,z,t)</math>. For the case where dimensions have different units (e.g., distance and time) then a scaling factor <math>B</math> can be applied to each to obtain a modified Euclidean distance.<ref name="Nguyen2014">{{cite journal|last1=Nguyen|first1=H.|last2=Osterman|first2=G.|last3=Wunch|first3=D.|last4=O'Dell|first4=C.|last5=Mandrake|first5=L.|last6=Wennberg|first6=P.|last7=Fisher|first7=B.|last8=Castano|first8=R.|title=A method for colocating satellite ''X''<sub>CO<sub>2</sub></sub> data to ground-based data and its application to ACOS-GOSAT and TCCON|journal=Atmospheric Measurement Techniques|volume=7|issue=8|year=2014|pages=2631–2644|issn=1867-8548|doi=10.5194/amt-7-2631-2014|bibcode=2014AMT.....7.2631N|doi-access=free}}</ref>
Generally, an empirical variogram is needed for measured data, because sample information <math>Z</math> is not available for every location. The sample information for example could be concentration of iron in soil samples, or pixel intensity on a camera. Each piece of sample information has coordinates <math>\mathbf{s}=(x,y)</math> for a 2D [[Sample space|sample space]] where <math>x</math> and <math>y</math> are geographical coordinates. In the case of the iron in soil, the sample space could be 3 dimensional. If there is temporal variability as well (e.g., phosphorus content in a lake) then <math>\mathbf{s}</math> could be a 4 dimensional vector <math>(x,y,z,t)</math>. For the case where dimensions have different units (e.g., distance and time) then a scaling factor <math>B</math> can be applied to each to obtain a modified [[Euclidean distance]].<ref name="Nguyen2014">{{cite journal|last1=Nguyen|first1=H.|last2=Osterman|first2=G.|last3=Wunch|first3=D.|last4=O'Dell|first4=C.|last5=Mandrake|first5=L.|last6=Wennberg|first6=P.|last7=Fisher|first7=B.|last8=Castano|first8=R.|title=A method for colocating satellite ''X''<sub>CO<sub>2</sub></sub> data to ground-based data and its application to ACOS-GOSAT and TCCON|journal=Atmospheric Measurement Techniques|volume=7|issue=8|year=2014|pages=2631–2644|issn=1867-8548|doi=10.5194/amt-7-2631-2014|bibcode=2014AMT.....7.2631N|doi-access=free}}</ref>


Sample observations are denoted <math>Z(\mathbf{s}_i)=z_i</math>. Samples may be taken at <math>k</math> different locations. This provides a set of samples <math>z_1,\ldots,z_k</math> at locations <math>\mathbf{s}_1,\ldots,\mathbf{s}_k</math>. Generally, plots show the semivariogram values as a function of sample point separation <math>h</math>. In the case of the empirical semivariogram, separation distance bins <math>h \pm \delta</math> are used rather than exact distances, and usually isotropic conditions are assumed (i.e., that <math>\gamma</math> is only a function of <math>h</math> and does not depend on other variables such as center position). Then, the empirical semivariogram <math>\hat{\gamma}(h \pm \delta)</math> can be calculated for each bin:
Sample observations are denoted <math>Z(\mathbf{s}_i)=z_i</math>. Observations may be taken at <math>M</math> different locations (the sample size). This provides a set of observations <math>z_1,\ldots,z_M</math> at locations <math>\mathbf{s}_1,\ldots,\mathbf{s}_M</math>. Generally, plots show the semivariogram values as a function of separation distance <math>h_k</math> for multiple steps <math>k=1,\ldots</math>. In the case of the empirical semivariogram, a separation distance interval <math>h_k \pm \delta</math> is used rather than exact distances, and usually isotropic conditions are assumed (i.e., that <math>\gamma</math> is only a function of <math>h</math> and does not depend on other variables such as center position). Then, the empirical semivariogram <math>\hat{\gamma}(h_k \pm \delta)</math> can be calculated for each [[Data binning|bin]]:


:<math>\hat{\gamma}(h \pm \delta):=\frac{1}{2|N(h \pm \delta)|}\sum_{(i,j)\in N(h \pm \delta)} |z_i-z_j|^2</math>
:<math>\hat{\gamma}(h_k \pm \delta):=\frac{1}{2N_k}\sum_{(i,j)\in S_k} |z_i-z_j|^2</math>


In other words, all pairs of points separated by <math>h</math> (plus or minus some bin width tolerance range <math>\delta</math>) are found. These pairs form the set <math>N(h \pm \delta) \equiv \{ (\mathbf{s}_i,\mathbf{s}_j): |\mathbf{s}_i,\mathbf{s}_j| = h \pm \delta; i,j=1,\ldots,N \}</math>. The number of pairs in this bin is <math>|N(h \pm \delta)|</math>. Then for each pair of points <math>i,j</math>, the square of the difference in the observation (e.g., soil sample content or pixel intensity) is found (<math>|z_i-z_j|^2</math>). These squared differences are added together and normalized by the natural number <math>|N(h \pm \delta)|</math>. By definition the result is divided by 2 for the semivariogram at this separation.
In other words, all pairs of points separated by <math>h_k</math> (plus or minus some bin width tolerance range <math>\delta</math>) are found. These pairs form the set
:<math>S_k=S(h_k \pm \delta) \equiv \{ (\mathbf{s}_i,\mathbf{s}_j): h_k-\delta < |\mathbf{s}_i-\mathbf{s}_j| < h_k + \delta; i,j=1,\ldots,M \}</math>
The number of pairs in this bin is <math>N_k=|S_k|</math> (the set size). Then for each pair of points <math>i,j</math>, the square of the difference in the observation (e.g., soil sample content or pixel intensity) is found (<math>|z_i-z_j|^2</math>). These squared differences are added together and normalized by the natural number <math>N_k</math>. By definition the result is divided by 2 for the semivariogram at this separation.


For computational speed, only the unique pairs of points are needed. For example, for two observation pairs [<math>(z_a,z_b),(z_c,z_d)</math>] taken from locations with separation <math>h \pm \delta</math>, only [<math>(z_a,z_b),(z_c,z_d)</math>] need to be considered, as the mirrored pairs [<math>(z_b,z_a),(z_d,z_c)</math>] do not provide any additional information.
For computational speed, only the unique pairs of points are needed. For example, for two observation pairs [<math>(z_a,z_b),(z_c,z_d)</math>] taken from locations with separation <math>h \pm \delta</math>, only [<math>(z_a,z_b),(z_c,z_d)</math>] need to be considered, as the mirrored pairs [<math>(z_b,z_a),(z_d,z_c)</math>] do not provide any additional information.
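
The binned estimator described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name <code>empirical_semivariogram</code> and its arguments are hypothetical, isotropy is assumed, plain Euclidean distance is used, and only the unique pairs <math>i<j</math> are visited.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag_centers, delta):
    """Estimate gamma_hat(h_k +/- delta) for each lag center h_k.

    coords: (M, d) array of locations s_i; values: (M,) array of observations z_i.
    Returns (gamma_hat, pair_counts); bins that contain no pairs are NaN.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Unique pairs only (i < j); the mirrored pairs add no information.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    gamma_hat = np.full(len(lag_centers), np.nan)
    counts = np.zeros(len(lag_centers), dtype=int)
    for k, h_k in enumerate(lag_centers):
        # Open interval h_k - delta < |s_i - s_j| < h_k + delta
        in_bin = (dist > h_k - delta) & (dist < h_k + delta)
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            # Sum of squared differences, normalized by the pair count and halved.
            gamma_hat[k] = sq_diff[in_bin].sum() / (2 * counts[k])
    return gamma_hat, counts

# Toy data (made up for illustration): four observations on a line.
coords = np.array([[0.0], [1.0], [2.0], [3.0]])
values = np.array([1.0, 2.0, 4.0, 7.0])
gamma_hat, counts = empirical_semivariogram(
    coords, values, lag_centers=[1.0, 2.0, 3.0], delta=0.5
)
```

For this toy data the bins contain 3, 2 and 1 unique pairs. Counting each pair once rather than twice leaves the estimate unchanged, because the sum and the pair count both halve.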


==Variogram models==
==Variogram models==
 
[[File:Variogram Models.png|thumb|Typical semivariogram functions in kriging.<ref>{{Cite journal |last1=Ding |first1=Qile |last2=Wang |first2=Yiren |last3=Zheng |first3=Yu |last4=Wang |first4=Fengyang |last5=Zhou |first5=Shudong |last6=Pan |first6=Donghui |last7=Xiong |first7=Yuchun |last8=Zhang |first8=Yi |date=2024-12-05 |title=Subsurface Geological Profile Interpolation Using a Fractional Kriging Method Enhanced by Random Forest Regression |url= |journal=Fractal and Fractional |language=en |volume=8 |issue=12 |pages=717 |doi=10.3390/fractalfract8120717 |doi-access=free |issn=2504-3110}}</ref>]]
The empirical variogram cannot be computed at every lag distance <math>h</math>, and because of variation in the estimation it is not guaranteed to be a valid variogram, as defined above. However, some [[Earth:Geostatistics|Geostatistical]] methods such as [[Kriging|kriging]] need valid semivariograms. In applied geostatistics the empirical variograms are thus often approximated by a model function that ensures validity (Chiles&Delfiner 1999). Some important models are (Chiles&Delfiner 1999, Cressie 1993):
The empirical variogram cannot be computed at every lag distance <math>h</math>, and because of variation in the estimation it is not guaranteed to be a valid variogram, as defined above. However, some [[Earth:Geostatistics|geostatistical]] methods such as [[Kriging|kriging]] need valid semivariograms. In applied geostatistics the empirical variograms are thus often approximated by a model function that ensures validity. Some important models are:<ref>{{Cite book |last=Cressie |first=Noel A. C. |url=https://onlinelibrary.wiley.com/doi/book/10.1002/9781119115151 |title=Statistics for Spatial Data |date=1993-09-10 |publisher=Wiley |isbn=978-0-471-00255-0 |edition=1 |series=Wiley Series in Probability and Statistics |language=en |doi=10.1002/9781119115151}}</ref><ref name=":0">{{Cite book |last1=Chilès |first1=Jean-Paul |url=https://onlinelibrary.wiley.com/doi/book/10.1002/9781118136188 |title=Geostatistics: Modeling Spatial Uncertainty |last2=Delfiner |first2=Pierre |date=2012-03-02 |publisher=Wiley |isbn=978-0-470-18315-1 |edition=1 |series=Wiley Series in Probability and Statistics |language=en |doi=10.1002/9781118136188}}</ref>


* The exponential variogram model
* The exponential variogram model
Line 80: Line 87:
*: <math>\gamma(h)=(s-n)\left(1-\exp\left(-\frac{h^2}{r^2a}\right)\right) + n1_{(0,\infty)}(h).</math>
*: <math>\gamma(h)=(s-n)\left(1-\exp\left(-\frac{h^2}{r^2a}\right)\right) + n1_{(0,\infty)}(h).</math>


The parameter <math>a</math> has different values in different references, due to the ambiguity in the definition of the range. E.g. <math>a=1/3</math> is the value used in (Chiles&Delfiner 1999). The <math>1_A(h)</math> function is 1 if <math>h\in A</math> and 0 otherwise.
The parameter <math>a</math> has different values in different references, due to the ambiguity in the definition of the range (e.g. <math>a=1/3</math>).<ref name=":0" /> The [[Indicator function|indicator function]] <math>1_A(h)</math> is 1 if <math>h\in A</math> and 0 otherwise.
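
As an illustration of how the nugget <math>n</math>, sill <math>s</math>, range <math>r</math> and the parameter <math>a</math> enter a model, the following Python sketch evaluates two of the models. The Gaussian form follows the expression shown above. The exponential form is not reproduced in this excerpt, so the commonly used expression <math>(s-n)(1-\exp(-h/(ra)))+n1_{(0,\infty)}(h)</math> is assumed here. The function names are hypothetical, and <math>a=1/3</math> is only one of the conventions mentioned.

```python
import numpy as np

def gaussian_model(h, nugget, sill, range_, a=1/3):
    """Gaussian semivariogram model with nugget n, sill s and range r."""
    h = np.asarray(h, dtype=float)
    indicator = (h > 0).astype(float)  # 1_(0, inf)(h): nugget applies only for h > 0
    return (sill - nugget) * (1.0 - np.exp(-h**2 / (range_**2 * a))) + nugget * indicator

def exponential_model(h, nugget, sill, range_, a=1/3):
    """Exponential semivariogram model (assumed standard form, see lead-in)."""
    h = np.asarray(h, dtype=float)
    indicator = (h > 0).astype(float)
    return (sill - nugget) * (1.0 - np.exp(-h / (range_ * a))) + nugget * indicator

# Illustrative parameters (made up): nugget 0.1, sill 1.0, range 10.
h = np.array([0.0, 10.0, 1e6])
g = gaussian_model(h, nugget=0.1, sill=1.0, range_=10.0)
e = exponential_model(h, nugget=0.1, sill=1.0, range_=10.0)
```

With <math>a=1/3</math>, both models give <math>n+(s-n)(1-e^{-3})</math> at <math>h=r</math>, i.e. about 95% of the way from the nugget to the sill, which matches the convention for the range of models with an asymptotic sill. Both are exactly zero at <math>h=0</math> and approach the sill for large <math>h</math>.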
 
==Discussion==
Three functions are used in [[Earth:Geostatistics|geostatistics]] for describing the spatial or the temporal correlation of observations: these are the [[Correlogram|correlogram]], the [[Covariance|covariance]] and the '''semivariogram'''. The last is also more simply called '''variogram'''.
 
The variogram is the key function in [[Earth:Geostatistics|geostatistics]] as it will be used to fit a model of the temporal/[[Spatial correlation|spatial correlation]] of the observed phenomenon. One is thus making a distinction between the ''experimental variogram'' that is a visualisation of a possible spatial/temporal correlation and the ''variogram model'' that is further used to define the weights of the [[Kriging|kriging]] function. Note that the experimental variogram is an empirical estimate of the [[Covariance|covariance]] of a [[Gaussian process]]. As such, it may not be positive definite and hence not directly usable in [[Kriging|kriging]], without constraints or further processing. This explains why only a limited number of variogram models are used: most commonly, the linear, the spherical, the Gaussian and the exponential models.


==Applications==
==Applications==
The '''empirical variogram''' is used in [[Earth:Geostatistics|geostatistics]] as a first estimate of the variogram model needed for spatial interpolation by [[Kriging|kriging]].  
The empirical variogram is used in [[Earth:Geostatistics|geostatistics]] as a first estimate of the variogram model needed for spatial interpolation by [[Kriging|kriging]].  


* Empirical variograms for the spatiotemporal variability of column-averaged [[Earth:Carbon dioxide in Earth's atmosphere|carbon dioxide]] were used to determine coincidence criteria for satellite and ground-based measurements.<ref name="Nguyen2014"/>
* Empirical variograms for the spatiotemporal variability of column-averaged [[Earth:Carbon dioxide in Earth's atmosphere|carbon dioxide]] were used to determine coincidence criteria for satellite and ground-based measurements.<ref name="Nguyen2014"/>
* Empirical variograms were calculated for the density of a heterogeneous material (Gilsocarbon).<ref name="arregui18">{{cite journal | last1 = Arregui Mena | first1 = J.D. | display-authors = etal  | year = 2018 | title = Characterisation of the spatial variability of material properties of Gilsocarbon and NBG-18 using random fields | url = https://www.researchgate.net/publication/327537624 | journal = Journal of Nuclear Materials | volume = 511 | pages = 91–108| doi = 10.1016/j.jnucmat.2018.09.008| bibcode = 2018JNuM..511...91A | doi-access = free }}</ref>
* Empirical variograms were calculated for the density of a heterogeneous material (Gilsocarbon).<ref name="arregui18">{{cite journal | last1 = Arregui Mena | first1 = J.D. | display-authors = etal  | year = 2018 | title = Characterisation of the spatial variability of material properties of Gilsocarbon and NBG-18 using random fields | url = https://www.researchgate.net/publication/327537624 | journal = Journal of Nuclear Materials | volume = 511 | pages = 91–108| doi = 10.1016/j.jnucmat.2018.09.008| bibcode = 2018JNuM..511...91A | osti = 1479781 | doi-access = free }}</ref>
*Empirical variograms are calculated from observations of [[Earth:Strong ground motion|strong ground motion]] from [[Earth:Earthquake|earthquake]]s.<ref>{{Cite journal|last1=Schiappapietra|first1=Erika|last2=Douglas|first2=John|date=April 2020|title=Modelling the spatial correlation of earthquake ground motion: Insights from the literature, data from the 2016–2017 Central Italy earthquake sequence and ground-motion simulations|journal=Earth-Science Reviews|language=en|volume=203|pages=103139|doi=10.1016/j.earscirev.2020.103139|bibcode=2020ESRv..20303139S|url=https://strathprints.strath.ac.uk/71570/}}</ref> These models are used for [[Earth:Seismic risk|seismic risk]] and loss assessments of spatially-distributed infrastructure.<ref>{{Cite journal|last1=Sokolov|first1=Vladimir|last2=Wenzel|first2=Friedemann|date=2011-07-25|title=Influence of spatial correlation of strong ground motion on uncertainty in earthquake loss estimation|journal=Earthquake Engineering & Structural Dynamics|language=en|volume=40|issue=9|pages=993–1009|doi=10.1002/eqe.1074}}</ref>
*Empirical variograms are calculated from observations of [[Earth:Strong ground motion|strong ground motion]] from [[Earth:Earthquake|earthquake]]s.<ref>{{Cite journal|last1=Schiappapietra|first1=Erika|last2=Douglas|first2=John|date=April 2020|title=Modelling the spatial correlation of earthquake ground motion: Insights from the literature, data from the 2016–2017 Central Italy earthquake sequence and ground-motion simulations|journal=Earth-Science Reviews|language=en|volume=203|article-number=103139|doi=10.1016/j.earscirev.2020.103139|bibcode=2020ESRv..20303139S|url=https://strathprints.strath.ac.uk/71570/}}</ref> These models are used for [[Earth:Seismic risk|seismic risk]] and loss assessments of spatially-distributed infrastructure.<ref>{{Cite journal|last1=Sokolov|first1=Vladimir|last2=Wenzel|first2=Friedemann|date=2011-07-25|title=Influence of spatial correlation of strong ground motion on uncertainty in earthquake loss estimation|journal=Earthquake Engineering & Structural Dynamics|language=en|volume=40|issue=9|pages=993–1009|doi=10.1002/eqe.1074 |bibcode=2011EESD...40..993S }}</ref>


==Related concepts==
==Related concepts==


The squared term in the variogram, for instance <math>(Z(\mathbf{s}_1) - Z(\mathbf{s}_2))^2</math>, can be replaced with different powers: A ''madogram'' is defined with the [[Absolute difference|absolute difference]], <math>|Z(\mathbf{s}_1) - Z(\mathbf{s}_2)|</math>, and a ''rodogram'' is defined with the [[Square root|square root]] of the absolute difference, <math>|Z(\mathbf{s}_1) - Z(\mathbf{s}_2)|^{0.5}</math>. [[Estimator]]s based on these lower powers are said to be more resistant to [[Outlier|outlier]]s. They can be generalized as a "variogram of order ''α''",
The squared term in the variogram, for instance <math>(Z(\mathbf{s}_1) - Z(\mathbf{s}_2))^2</math>, can be replaced with different powers: A ''madogram'' is defined with the [[Absolute difference|absolute difference]], <math>|Z(\mathbf{s}_1) - Z(\mathbf{s}_2)|</math>, and a ''rodogram'' is defined with the [[Square root|square root]] of the absolute difference, <math>|Z(\mathbf{s}_1) - Z(\mathbf{s}_2)|^{0.5}</math>. [[Estimator]]s based on these lower powers are said to be more [[Robust statistics|resistant]] to [[Outlier|outlier]]s. They can be generalized as a "variogram of order ''α''",


:<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[\left|Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right|^\alpha\right]</math>,


in which a variogram is of order 2, a madogram is a variogram of order 1, and a rodogram is a variogram of order 0.5.<ref>{{cite book |title=Geostatistical Glossary and Multilingual Dictionary |pages=47, 67, 81 |first=Ricardo A. |last=Olea |isbn=9780195066890 |publisher=Oxford University Press |date=1991}}</ref>
in which a variogram is of order 2, a madogram is a variogram of order 1, and a rodogram is a variogram of order 0.5.<ref>{{cite book |title=Geostatistical Glossary and Multilingual Dictionary |pages=47, 67, 81 |first=Ricardo A. |last=Olea |isbn=978-0-19-506689-0 |publisher=Oxford University Press |date=1991}}</ref>


When a variogram is used to describe the correlation of different variables it is called ''cross-variogram''. Cross-variograms are used in co-kriging.
When a variogram is used to describe the correlation of different variables it is called ''cross-variogram''. Cross-variograms are used in [[Kriging#Methods|co-kriging]].
Should the variable be binary or represent classes of values, one is then talking about ''indicator variograms''. Indicator variogram is used in indicator kriging.
Should the variable be binary or represent classes of values, one is then talking about ''indicator variograms''. Indicator variograms are used in [[Kriging#Methods|indicator kriging]].


==References==
Line 122: Line 124:
==External links==
* [http://www.ai-geostats.org/ AI-GEOSTATS: an educational resource about geostatistics and spatial statistics]
* [http://www.statistik.tuwien.ac.at/public/dutt/vorles/geost_05/geo.html Geostatistics: Lecture by Rudolf Dutter at the Technical University of Vienna]
* [http://www.statistik.tuwien.ac.at/public/dutt/vorles/geost_05/geo.html Geostatistics: Lecture by Rudolf Dutter at the Technical University of Vienna] {{Webarchive|url=https://web.archive.org/web/20071215015726/http://www.statistik.tuwien.ac.at/public/dutt/vorles/geost_05/geo.html |date=2007-12-15 }}
 


[[Category:Geostatistics]]

Latest revision as of 17:06, 24 May 2026

A variogram is the graphical representation of the spatial dependence between pairs of data points, commonly used in geostatistics and spatial statistics. The term is sometimes used synonymously with semivariogram, but the latter is also used by some authors to refer to half of a variogram, and should therefore be avoided.[1] Likewise, the term semivariance can be misleading, since the values shown in a variogram are entire variances of observations at a given spatial separation (lag).[1]

The variogram is the key function in geostatistics as it will be used to fit a model of the temporal/spatial correlation of the observed phenomenon. One is thus making a distinction between the experimental variogram that is a visualization of a possible spatial/temporal correlation and the variogram model that is further used to define the weights of the kriging function. Note that the experimental variogram is an empirical estimate of the covariance of a Gaussian process. As such, it may not be positive definite and hence not directly usable in kriging, without constraints or further processing. This explains why only a limited number of variogram models are used: most commonly, the linear, the spherical, the Gaussian, and the exponential models.

For example, in gold mining, a variogram will give a measure of how much two samples taken from the mining area will vary in gold percentage depending on the distance between those samples. Samples taken far apart will vary more than samples taken close to each other.

Definition

The semivariogram γ(h) was first defined by Matheron (1963) as half the average squared difference between a function and a translated copy of the function separated at distance h.[2][3] Formally

<math>\gamma(h)=\frac{1}{2V}\iiint_V \left[f(M+h)-f(M)\right]^2\,dM,</math>

where M is a point in the geometric field V, and f(M) is the value at that point. The triple integral is over 3 dimensions. h is the separation distance (e.g., in meters or km) of interest. For example, the value f(M) could represent the iron content in soil, at some location M (with geographic coordinates of latitude, longitude, and elevation) over some region V with element of volume dV. To obtain the semivariogram value γ(h) for a given separation h, all pairs of points at exactly that distance would have to be sampled. In practice it is impossible to sample everywhere, so the empirical variogram is used instead.

The variogram is twice the semivariogram and can be defined, differently, as the variance of the difference between field values at two locations (𝐬1 and 𝐬2, note change of notation from M to 𝐬 and f to Z) across realizations of the field (Cressie 1993):

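<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=\operatorname{var}\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)=E\left[\left(\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)-E\left[Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right]\right)^2\right].</math>

If the spatial random field has constant mean μ, this is equivalent to the expectation of the squared increment of the values between locations 𝐬1 and 𝐬2 (Wackernagel 2003), where 𝐬1 and 𝐬2 are points in space and possibly time:

<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)^2\right].</math>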

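In the case of a stationary process, the variogram and semivariogram can be represented as a function <math>\gamma_s(h)=\gamma(0,0+h)</math> of the difference <math>h=\mathbf{s}_2-\mathbf{s}_1</math> between locations only, by the following relation (Cressie 1993):

<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=\gamma_s(\mathbf{s}_2-\mathbf{s}_1).</math>

If the process is furthermore isotropic, then the variogram and semivariogram can be represented by a function <math>\gamma_i(h):=\gamma_s(h e_1)</math> of the distance <math>h=\|\mathbf{s}_2-\mathbf{s}_1\|</math> only (Cressie 1993):

<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=\gamma_i(h).</math>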

The indexes i or s are typically not written. The terms are used for all three forms of the function. Moreover, the term "variogram" is sometimes used to denote the semivariogram, and the symbol γ is sometimes used for the variogram, which brings some confusion.[4]

Properties

According to (Cressie 1993, Chiles and Delfiner 1999, Wackernagel 2003) the theoretical variogram has the following properties:

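  • The semivariogram is nonnegative, <math>\gamma(\mathbf{s}_1,\mathbf{s}_2)\geq 0</math>, since it is the expectation of a square.
  • The semivariogram at distance 0 is always 0, <math>\gamma(\mathbf{s}_1,\mathbf{s}_1)=\gamma_i(0)=E\left[\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_1)\right)^2\right]=0</math>, since <math>Z(\mathbf{s}_1)-Z(\mathbf{s}_1)=0</math>.
  • A function is a semivariogram if and only if it is a conditionally negative definite function, i.e. for all weights <math>w_1,\ldots,w_N</math> subject to <math>\sum_{i=1}^N w_i=0</math> and locations <math>\mathbf{s}_1,\ldots,\mathbf{s}_N</math> it holds:
<math>\sum_{i=1}^N\sum_{j=1}^N w_i\,\gamma(\mathbf{s}_i,\mathbf{s}_j)\,w_j\leq 0,</math>

which corresponds to the fact that the variance var(X) of <math>X=\sum_{i=1}^N w_i Z(\mathbf{s}_i)</math> is given by the negative of this double sum and must be nonnegative.[disputed]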

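<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=C(\mathbf{s}_1,\mathbf{s}_1)+C(\mathbf{s}_2,\mathbf{s}_2)-2C(\mathbf{s}_1,\mathbf{s}_2)</math>
<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=V\left(1-c(\mathbf{s}_1,\mathbf{s}_2)\right)</math>
  • Conversely, the covariance function C of a stationary process can be obtained from the semivariogram and variance as
<math>C(\mathbf{s}_1,\mathbf{s}_2)=V-\gamma(\mathbf{s}_1,\mathbf{s}_2)</math>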
  • If a stationary random field has no spatial dependence (i.e. <math>C(h)=0</math> if <math>h\neq 0</math>), the semivariogram is the constant <math>\operatorname{var}(Z(\mathbf{s}))</math> everywhere except at the origin, where it is zero.
  • The semivariogram is a symmetric function, <math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=\tfrac{1}{2}E\left[|Z(\mathbf{s}_1)-Z(\mathbf{s}_2)|^2\right]=\gamma(\mathbf{s}_2,\mathbf{s}_1)</math>.
  • Consequently, the stationary semivariogram is an even function, <math>\gamma_s(h)=\gamma_s(-h)</math>.
  • If the random field is stationary and ergodic, the limit <math>\lim_{h\to\infty}\gamma_s(h)=\operatorname{var}(Z(\mathbf{s}))</math> corresponds to the variance of the field. The limit of the semivariogram with increasing distance is also called its sill.
  • As a consequence the semivariogram can be discontinuous only at the origin. The height of the jump at the origin is sometimes referred to as the nugget or nugget effect.
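
The relations between the semivariogram and the covariance function listed above can be recovered in a few lines. The following is only a sketch: it assumes a second-order stationary field, and it reads V as the constant variance C(𝐬,𝐬) and c as the correlation function C/V, which is an interpretation inferred from the formulas rather than a definition given in the text.

<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=\operatorname{var}\left(Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right)=\operatorname{var}(Z(\mathbf{s}_1))+\operatorname{var}(Z(\mathbf{s}_2))-2\operatorname{cov}\left(Z(\mathbf{s}_1),Z(\mathbf{s}_2)\right)=C(\mathbf{s}_1,\mathbf{s}_1)+C(\mathbf{s}_2,\mathbf{s}_2)-2C(\mathbf{s}_1,\mathbf{s}_2)</math>

Setting <math>C(\mathbf{s}_1,\mathbf{s}_1)=C(\mathbf{s}_2,\mathbf{s}_2)=V</math> and dividing by 2 gives

<math>\gamma(\mathbf{s}_1,\mathbf{s}_2)=V-C(\mathbf{s}_1,\mathbf{s}_2)=V\left(1-\frac{C(\mathbf{s}_1,\mathbf{s}_2)}{V}\right)=V\left(1-c(\mathbf{s}_1,\mathbf{s}_2)\right).</math>

Under these assumptions, a field without spatial dependence has C = 0 away from the origin, so γ equals V there, which agrees with the sill property stated in the list.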

Parameters

In summary, the following parameters are often used to describe variograms:

  • nugget n: The height of the jump of the semivariogram at the discontinuity at the origin.
  • sill s: Limit of the variogram as the lag distance tends to infinity.
  • range r: The distance at which the difference of the variogram from the sill becomes negligible. In models with a fixed sill, it is the distance at which the sill is first reached; for models with an asymptotic sill, it is conventionally taken to be the distance at which the semivariance first reaches 95% of the sill.

Empirical variogram

Generally, an empirical variogram is needed for measured data, because sample information Z is not available for every location. The sample information for example could be concentration of iron in soil samples, or pixel intensity on a camera. Each piece of sample information has coordinates 𝐬=(x,y) for a 2D sample space where x and y are geographical coordinates. In the case of the iron in soil, the sample space could be 3 dimensional. If there is temporal variability as well (e.g., phosphorus content in a lake) then 𝐬 could be a 4 dimensional vector (x,y,z,t). For the case where dimensions have different units (e.g., distance and time) then a scaling factor B can be applied to each to obtain a modified Euclidean distance.[5]

Sample observations are denoted <math>Z(\mathbf{s}_i)=z_i</math>. Observations may be taken at M total different locations (the sample size). This provides a set of observations <math>z_1,\ldots,z_M</math> at locations <math>\mathbf{s}_1,\ldots,\mathbf{s}_M</math>. Generally, plots show the semivariogram values as a function of separation distance <math>h_k</math> for multiple steps <math>k=1,\ldots</math>. In the case of the empirical semivariogram, a separation distance interval <math>h_k\pm\delta</math> is used rather than exact distances, and usually isotropic conditions are assumed (i.e., that γ is only a function of h and does not depend on other variables such as center position). Then, the empirical semivariogram <math>\hat{\gamma}(h_k\pm\delta)</math> can be calculated for each bin:

<math>\hat{\gamma}(h_k\pm\delta):=\frac{1}{2N_k}\sum_{(i,j)\in S_k}|z_i-z_j|^2</math>

In other words, each pair of points separated by <math>h_k</math> (plus or minus some bin width tolerance δ) is found. These pairs form the set

<math>S_k=S(h_k\pm\delta)\equiv\left\{(\mathbf{s}_i,\mathbf{s}_j):h_k-\delta<|\mathbf{s}_i-\mathbf{s}_j|<h_k+\delta;\ i,j=1,\ldots,M\right\}</math>

The number of pairs in this bin is <math>N_k=|S_k|</math> (the set size). Then for each pair of points i, j, the square of the difference in the observations (e.g., soil sample content or pixel intensity) is found (<math>|z_i-z_j|^2</math>). These squared differences are added together and normalized by the natural number <math>N_k</math>. By definition the result is divided by 2 for the semivariogram at this separation.

For computational speed, only the unique pairs of points are needed. For example, for 2 observation pairs <math>[(z_a,z_b),(z_c,z_d)]</math> taken from locations with separation <math>h\pm\delta</math>, only <math>[(z_a,z_b),(z_c,z_d)]</math> need to be considered, as the pairs <math>[(z_b,z_a),(z_d,z_c)]</math> do not provide any additional information.
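
The binning procedure described above can be sketched in Python with NumPy. This is a minimal illustration rather than a reference implementation: the function name `empirical_semivariogram` and its arguments are hypothetical, isotropy and Euclidean distance are assumed, and only unique pairs (i < j) are used. Counting unordered pairs halves both the sum and the pair count, so the ratio is the same as with ordered pairs.

```python
import numpy as np

def empirical_semivariogram(coords, values, lags, delta):
    """Binned isotropic empirical semivariogram (illustrative sketch).

    coords: (M, d) array of sample locations s_i
    values: (M,) array of observations z_i
    lags:   bin centres h_k
    delta:  half-width of each bin, so bin k covers (h_k - delta, h_k + delta)
    Returns (gamma_hat, counts); gamma_hat[k] is NaN when bin k holds no pairs.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    lags = np.asarray(lags, dtype=float)

    # Indices of the unique unordered pairs (i < j).
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2

    gamma_hat = np.full(len(lags), np.nan)
    counts = np.zeros(len(lags), dtype=int)
    for k, h in enumerate(lags):
        in_bin = (dist > h - delta) & (dist < h + delta)
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            # Sum of squared differences divided by 2 * N_k.
            gamma_hat[k] = sqdiff[in_bin].sum() / (2 * counts[k])
    return gamma_hat, counts
```

For four equally spaced samples on a line with values 0, 1, 2, 3, calling `empirical_semivariogram([[0.0], [1.0], [2.0], [3.0]], [0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0], 0.5)` places 3, 2 and 1 pairs in the three bins. The all-pairs approach needs memory proportional to M², so for large data sets a spatial index or chunked computation would be preferable.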

Variogram models

Typical semivariogram functions in kriging.[6]

The empirical variogram cannot be computed at every lag distance h, and due to variation in the estimation it is not guaranteed to be a valid variogram as defined above. However, some geostatistical methods such as kriging need valid semivariograms. In applied geostatistics the empirical variograms are thus often approximated by a model function that ensures validity. Some important models are:[7][8]

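  • The exponential variogram model
    <math>\gamma(h)=(s-n)\left(1-\exp\left(-\frac{h}{ra}\right)\right)+n\,1_{(0,\infty)}(h).</math>
  • The spherical variogram model
    <math>\gamma(h)=(s-n)\left(\left(\frac{3h}{2r}-\frac{h^3}{2r^3}\right)1_{(0,r)}(h)+1_{[r,\infty)}(h)\right)+n\,1_{(0,\infty)}(h).</math>
  • The Gaussian variogram model
    <math>\gamma(h)=(s-n)\left(1-\exp\left(-\frac{h^2}{r^2a}\right)\right)+n\,1_{(0,\infty)}(h).</math>

The parameter a has different values in different references, due to the ambiguity in the definition of the range (e.g. a=1/3).[8] The indicator function <math>1_A(h)</math> is 1 if <math>h\in A</math> and 0 otherwise.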
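
The three models can be written as short NumPy functions, which may help when checking a fitted curve against its nugget, sill and range. This is a sketch under stated assumptions: the function names are hypothetical, lags are assumed nonnegative, and the default a = 1/3 is simply the example value mentioned above, not a universal convention.

```python
import numpy as np

def exponential_model(h, nugget, sill, r, a=1/3):
    """Exponential semivariogram; zero at h = 0, nugget jump just above 0."""
    h = np.asarray(h, dtype=float)
    body = (sill - nugget) * (1.0 - np.exp(-h / (r * a))) + nugget
    return np.where(h > 0, body, 0.0)

def spherical_model(h, nugget, sill, r):
    """Spherical semivariogram; reaches the sill exactly at h = r."""
    h = np.asarray(h, dtype=float)
    # Polynomial part for 0 < h < r, constant 1 for h >= r.
    shape = np.where(h < r, 1.5 * h / r - 0.5 * (h / r) ** 3, 1.0)
    return np.where(h > 0, (sill - nugget) * shape + nugget, 0.0)

def gaussian_model(h, nugget, sill, r, a=1/3):
    """Gaussian semivariogram; approaches the sill asymptotically."""
    h = np.asarray(h, dtype=float)
    body = (sill - nugget) * (1.0 - np.exp(-h ** 2 / (r ** 2 * a))) + nugget
    return np.where(h > 0, body, 0.0)
```

With a = 1/3 and zero nugget, the exponential and Gaussian models evaluate to 1 − exp(−3) times the sill at h = r, roughly 95%, which matches the conventional definition of the range for asymptotic models.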

Applications

The empirical variogram is used in geostatistics as a first estimate of the variogram model needed for spatial interpolation by kriging.

  • Empirical variograms for the spatiotemporal variability of column-averaged carbon dioxide were used to determine coincidence criteria for satellite and ground-based measurements.[5]
  • Empirical variograms were calculated for the density of a heterogeneous material (Gilsocarbon).[9]
  • Empirical variograms are calculated from observations of strong ground motion from earthquakes.[10] These models are used for seismic risk and loss assessments of spatially-distributed infrastructure.[11]

Related concepts

The squared term in the variogram, for instance <math>(Z(\mathbf{s}_1)-Z(\mathbf{s}_2))^2</math>, can be replaced with different powers: A madogram is defined with the absolute difference, <math>|Z(\mathbf{s}_1)-Z(\mathbf{s}_2)|</math>, and a rodogram is defined with the square root of the absolute difference, <math>|Z(\mathbf{s}_1)-Z(\mathbf{s}_2)|^{0.5}</math>. Estimators based on these lower powers are said to be more resistant to outliers. They can be generalized as a "variogram of order α",

<math>2\gamma(\mathbf{s}_1,\mathbf{s}_2)=E\left[\left|Z(\mathbf{s}_1)-Z(\mathbf{s}_2)\right|^\alpha\right],</math>

in which a variogram is of order 2, a madogram is a variogram of order 1, and a rodogram is a variogram of order 0.5.[12]
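
A small sketch of an estimator for the order-α quantity at a single lag is given below. It is illustrative only: the function name `order_alpha_variogram` is hypothetical, isotropy and a bin of half-width delta are assumed, and the returned value estimates the full expectation E|Z(𝐬1) − Z(𝐬2)|^α (that is, 2γ in the notation above), not half of it.

```python
import numpy as np

def order_alpha_variogram(coords, values, h, delta, alpha=2.0):
    """Mean of |z_i - z_j|**alpha over unique pairs with lag in (h - delta, h + delta).

    alpha = 2 gives the variogram, alpha = 1 the madogram,
    alpha = 0.5 the rodogram. Returns NaN if no pair falls in the bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    in_bin = (dist > h - delta) & (dist < h + delta)
    if not in_bin.any():
        return float("nan")
    absdiff = np.abs(values[i] - values[j])[in_bin]
    return float(np.mean(absdiff ** alpha))
```

For three samples on a line at positions 0, 1, 2 with values 0, 1, 5, the two pairs at lag 1 have absolute differences 1 and 4. The order-2 estimate is 8.5, the order-1 estimate 2.5 and the order-0.5 estimate 1.5, which illustrates how lower powers damp the influence of a single large difference.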

When a variogram is used to describe the correlation between different variables, it is called a cross-variogram. Cross-variograms are used in co-kriging. If the variable is binary or represents classes of values, one speaks of indicator variograms. Indicator variograms are used in indicator kriging.

References

  1. Bachmaier, Martin; Backes, Matthias (2011-08-30). "Variogram or Semivariogram? Variance or Semivariance? Allan Variance or Introducing a New Term?". Mathematical Geosciences 43 (6): 735–740. doi:10.1007/s11004-011-9348-3. ISSN 1874-8961. https://www.researchgate.net/publication/227307628_Variogram_or_Semivariogram_Variance_or_Semivariance_Allan_Variance_or_Introducing_a_New_Term.
  2. Matheron, Georges (1963). "Principles of geostatistics". Economic Geology 58 (8): 1246–1266. doi:10.2113/gsecongeo.58.8.1246. ISSN 1554-0774. Bibcode: 1963EcGeo..58.1246M.
  3. Ford, David. "The Empirical Variogram". http://www.faculty.washington.edu/edford/Variogram.pdf.
  4. Bachmaier, Martin; Backes, Matthias (2008-02-24). "Variogram or semivariogram? Understanding the variances in a variogram". Precision Agriculture (Springer Science and Business Media LLC) 9 (3): 173–175. doi:10.1007/s11119-008-9056-2. ISSN 1385-2256. Bibcode: 2008PrAgr...9..173B.
  5. Nguyen, H.; Osterman, G.; Wunch, D.; O'Dell, C.; Mandrake, L.; Wennberg, P.; Fisher, B.; Castano, R. (2014). "A method for colocating satellite XCO2 data to ground-based data and its application to ACOS-GOSAT and TCCON". Atmospheric Measurement Techniques 7 (8): 2631–2644. doi:10.5194/amt-7-2631-2014. ISSN 1867-8548. Bibcode: 2014AMT.....7.2631N.
  6. Ding, Qile; Wang, Yiren; Zheng, Yu; Wang, Fengyang; Zhou, Shudong; Pan, Donghui; Xiong, Yuchun; Zhang, Yi (2024-12-05). "Subsurface Geological Profile Interpolation Using a Fractional Kriging Method Enhanced by Random Forest Regression". Fractal and Fractional 8 (12): 717. doi:10.3390/fractalfract8120717. ISSN 2504-3110.
  7. Cressie, Noel A. C. (1993-09-10). Statistics for Spatial Data. Wiley Series in Probability and Statistics (1 ed.). Wiley. doi:10.1002/9781119115151. ISBN 978-0-471-00255-0. https://onlinelibrary.wiley.com/doi/book/10.1002/9781119115151.
  8. Chilès, Jean-Paul; Delfiner, Pierre (2012-03-02). Geostatistics: Modeling Spatial Uncertainty. Wiley Series in Probability and Statistics (1 ed.). Wiley. doi:10.1002/9781118136188. ISBN 978-0-470-18315-1. https://onlinelibrary.wiley.com/doi/book/10.1002/9781118136188.
  9. Arregui Mena, J.D. (2018). "Characterisation of the spatial variability of material properties of Gilsocarbon and NBG-18 using random fields". Journal of Nuclear Materials 511: 91–108. doi:10.1016/j.jnucmat.2018.09.008. Bibcode: 2018JNuM..511...91A. https://www.researchgate.net/publication/327537624.
  10. Schiappapietra, Erika; Douglas, John (April 2020). "Modelling the spatial correlation of earthquake ground motion: Insights from the literature, data from the 2016–2017 Central Italy earthquake sequence and ground-motion simulations". Earth-Science Reviews 203. doi:10.1016/j.earscirev.2020.103139. Bibcode: 2020ESRv..20303139S. https://strathprints.strath.ac.uk/71570/.
  11. Sokolov, Vladimir; Wenzel, Friedemann (2011-07-25). "Influence of spatial correlation of strong ground motion on uncertainty in earthquake loss estimation". Earthquake Engineering & Structural Dynamics 40 (9): 993–1009. doi:10.1002/eqe.1074. Bibcode: 2011EESD...40..993S.
  12. Olea, Ricardo A. (1991). Geostatistical Glossary and Multilingual Dictionary. Oxford University Press. pp. 47, 67, 81. ISBN 978-0-19-506689-0.

Further reading