Long-range dependence

From HandWiki
Short description: Phenomenon in linguistics and data analysis

Long-range dependence (LRD), also called long memory or long-range persistence, is a phenomenon that may arise in the analysis of spatial or time series data. It concerns how quickly the statistical dependence between two points decays as the time interval or spatial distance between them increases. A phenomenon is usually considered to have long-range dependence if the dependence decays more slowly than exponentially, typically following a power law. LRD is often related to self-similar processes or fields. LRD has been used in various fields such as internet traffic modelling, econometrics, hydrology, linguistics and the earth sciences. Different mathematical definitions of LRD are used for different contexts and purposes.[1][2][3][4][5][6]

Short-range dependence versus long-range dependence

One way of characterising long-range and short-range dependent stationary processes is in terms of their autocovariance functions. For a short-range dependent process, the coupling between values at different times decreases rapidly as the time difference increases: either the autocovariance drops to zero after a certain time lag, or it eventually decays exponentially. In the case of LRD, the coupling is much stronger: the autocovariance function decays like a power law, and so more slowly than exponentially.
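
One common way to formalise this distinction is sketched below for a stationary process with autocovariance function γ(k) at lag k. The constant c_γ and the exponent α are notation introduced here for illustration, and other definitions of LRD exist.

```latex
\text{short-range dependence:}\quad \sum_{k=-\infty}^{\infty} |\gamma(k)| < \infty,
\qquad
\text{long-range dependence:}\quad \gamma(k) \sim c_\gamma\, k^{-\alpha}
\ \ (k \to \infty),\quad 0 < \alpha < 1,
\ \text{ so that } \sum_{k=-\infty}^{\infty} \gamma(k) = \infty .
```

Under this convention the decay exponent is linked to the Hurst parameter by H = 1 − α/2, so 0 < α < 1 corresponds to 0.5 < H < 1.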

A second way of characterising long- and short-range dependence is in terms of the variance of the partial sum of consecutive values. For short-range dependence, the variance typically grows in proportion to the number of terms. For LRD, the variance of the partial sum increases more rapidly, often as a power function of the number of terms with an exponent greater than 1. One way of examining this behaviour uses the rescaled range. This aspect of long-range dependence is important in the design of dams on rivers for water resources, where the summations correspond to the total inflow to the dam over an extended period.[7]
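
As a minimal sketch of this growth, the following code computes the variance of a partial sum directly from an autocovariance function. It assumes unit-variance fractional Gaussian noise (the increments of fractional Brownian motion) as the example process; the function names are hypothetical and chosen only for illustration. For this particular process the variance of the sum of n terms equals n^(2H), which reduces to n when H = 0.5.

```python
def fgn_autocov(k, hurst):
    """Autocovariance at lag k of unit-variance fractional Gaussian noise."""
    k = abs(k)
    return 0.5 * ((k + 1) ** (2 * hurst)
                  - 2 * k ** (2 * hurst)
                  + abs(k - 1) ** (2 * hurst))


def partial_sum_variance(n, hurst):
    """Var(X_1 + ... + X_n), obtained by summing gamma(i - j) over all pairs."""
    total = n * fgn_autocov(0, hurst)
    for k in range(1, n):
        # Lag k occurs (n - k) times on each side of the diagonal.
        total += 2 * (n - k) * fgn_autocov(k, hurst)
    return total


for h in (0.5, 0.7, 0.9):
    variances = [partial_sum_variance(n, h) for n in (10, 100, 1000)]
    print(h, [round(v, 3) for v in variances])
```

Comparing the printed values with n ** (2 * h) shows linear growth for H = 0.5 and faster, power-law growth for H > 0.5.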

These two characterisations are mathematically related to each other, but they are not the only ways to define LRD. When the autocovariance of the process does not exist (heavy tails), LRD has to be defined in other ways, and this is often done with the help of self-similar processes.

The Hurst parameter H is a measure of the extent of long-range dependence in a time series (it has another meaning in the context of self-similar processes). H takes values between 0 and 1. A value of 0.5 indicates the absence of long-range dependence.[8] The closer H is to 1, the greater the degree of persistence or long-range dependence. A value of H below 0.5 corresponds to anti-persistence, which, as the opposite of LRD, indicates strong negative correlation, so that the process fluctuates violently.

Estimation of the Hurst parameter

Slowly decaying variances, LRD, and a spectral density obeying a power law are different manifestations of the same property of the underlying covariance of a stationary process X. It is therefore possible to approach the problem of estimating the Hurst parameter from three different angles:

  • Variance-time plot: based on the analysis of the variances of the aggregate processes
  • R/S statistics: based on the time-domain analysis of the rescaled adjusted range
  • Periodogram: based on a frequency-domain analysis
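
The first of these approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes that the variance of block means of size m scales like m^(2H − 2), so that H is recovered from the slope of a log-log regression. The function name, the block sizes and the white-noise test series are all choices made here for illustration.

```python
import math
import random


def aggregated_variance_hurst(x, block_sizes):
    """Estimate H from the slope of log Var(block mean) against log block size."""
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        # Aggregate the series into non-overlapping block means of size m.
        means = [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        grand = sum(means) / n_blocks
        var = sum((v - grand) ** 2 for v in means) / (n_blocks - 1)
        log_m.append(math.log(m))
        log_var.append(math.log(var))
    # Ordinary least-squares slope of log_var on log_m.
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(log_m, log_var))
             / sum((a - mean_x) ** 2 for a in log_m))
    # Var(block mean) is assumed to scale like m**(2H - 2), so H = 1 + slope / 2.
    return 1 + slope / 2


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
white_noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
h_hat = aggregated_variance_hurst(white_noise, [2 ** j for j in range(8)])
print(f"estimated H for white noise: {h_hat:.2f}")
```

For independent noise the estimate should lie close to 0.5; on a long-range dependent series the same regression would be expected to give a value above 0.5, although estimators of this kind can be noticeably biased on short series.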

Relation to self-similar processes

Given a stationary LRD sequence, the partial sum, viewed as a process indexed by the number of terms and properly scaled, is asymptotically a self-similar process with stationary increments, the most typical example being fractional Brownian motion. Conversely, given a self-similar process with stationary increments and Hurst index H > 0.5, its increments (consecutive differences of the process) form a stationary LRD sequence.

This also holds true if the sequence is short-range dependent, but in this case the self-similar process resulting from the partial sum can only be Brownian motion (H = 0.5).
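
For the case of fractional Brownian motion, this relation can be written out explicitly. The sketch below uses notation introduced here for illustration (B_H for fractional Brownian motion with Hurst index H, σ² for its variance at time 1, X_k for its unit-step increments and γ for their autocovariance), and it assumes H ≠ 0.5 in the asymptotic statement.

```latex
\operatorname{Cov}\bigl(B_H(s), B_H(t)\bigr)
  = \frac{\sigma^2}{2}\left(|s|^{2H} + |t|^{2H} - |t-s|^{2H}\right),
\qquad
\{B_H(at)\}_{t} \overset{d}{=} \{a^{H} B_H(t)\}_{t} \quad (a > 0),

X_k = B_H(k+1) - B_H(k), \qquad
\gamma(k) = \frac{\sigma^2}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)
  \sim \sigma^2 H(2H-1)\, k^{2H-2} \quad (k \to \infty).
```

For H > 0.5 the exponent 2H − 2 lies between −1 and 0, so the autocovariances are not summable and the increments are long-range dependent. For H = 0.5 the expression gives γ(k) = 0 for every k ≥ 1, which is the uncorrelated Brownian case.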

Models

Popular stochastic models for long-range dependence include autoregressive fractionally integrated moving average (ARFIMA) models, which are defined for discrete-time processes; continuous-time models might start from fractional Brownian motion.
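
As an illustration of the discrete-time case, the following sketch considers the simplest fractionally integrated model, ARFIMA(0, d, 0), written as X_t = (1 − B)^(−d) ε_t with B the backshift operator and ε_t white noise. It computes the moving-average coefficients ψ_k of this expansion and compares them with the power law k^(d − 1)/Γ(d). The function name and the value d = 0.3 are assumptions made here for illustration; for 0 < d < 0.5 the memory parameter is usually related to the Hurst parameter by H = d + 0.5.

```python
import math


def arfima_ma_weights(d, n_weights):
    """Return psi_0..psi_{n_weights-1} with (1 - B)**(-d) = sum_k psi_k * B**k."""
    weights = [1.0]
    for k in range(1, n_weights):
        # Ratio of consecutive coefficients: psi_k / psi_{k-1} = (k - 1 + d) / k.
        weights.append(weights[-1] * (k - 1 + d) / k)
    return weights


d = 0.3  # illustrative value; 0 < d < 0.5 gives a stationary long-memory process
psi = arfima_ma_weights(d, 2001)
for k in (10, 100, 1000, 2000):
    power_law = k ** (d - 1) / math.gamma(d)
    # The last column approaches 1 as k grows, showing the power-law decay.
    print(k, psi[k], psi[k] / power_law)
```

The slow, power-law decay of the coefficients is what produces an autocovariance that decays like k^(2d − 1) rather than exponentially.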

Notes

  1. Beran, Jan (1994). Statistics for Long-Memory Processes. CRC Press. 
  2. Doukhan (2003). Theory and Applications of Long-Range Dependence. Birkhäuser. 
  3. Malamud, Bruce D.; Turcotte, Donald L. (1999). "Self-Affine Time Series: I. Generation and Analyses". Advances in Geophysics 40: 1–90. doi:10.1016/S0065-2687(08)60293-9. ISBN 9780120188406. Bibcode: 1999AdGeo..40....1M.
  4. Samorodnitsky, Gennady (2007). Long range dependence. Foundations and Trends in Stochastic Systems. 
  5. Beran (2013). Long memory processes: probabilistic properties and statistical methods. Springer. 
  6. Witt, Annette; Malamud, Bruce D. (September 2013). "Quantification of Long-Range Persistence in Geophysical Time Series: Conventional and Benchmark-Based Improvement Techniques". Surveys in Geophysics 34 (5): 541–651. doi:10.1007/s10712-012-9217-8. Bibcode: 2013SGeo...34..541W.
  7. Hurst, H.E.; Black, R.P.; Simaika, Y.M. (1965). Long-term storage: an experimental study. Constable, London.
  8. Beran (1994) page 34
