Geostatistics: introductory concepts



The stochastic approach

Assumptions about the phenomenon under examination are incorporated in a random process: \[Y=\{Y(x):x \in A\}\] where A is the study region.

It is assumed that the process \(Y\) is observable. To model \(Y\), components representing relevant characteristics are identified:

\[Y(x) = \mu(x) + S(x)+ W(x)\]

Where:

\(\mu(x)\): large-scale component (trend)

\(S(x)\): small-scale, structured stochastic component (signal); this component is itself a random object, i.e., a stochastic process

\(W(x)\): disturbance component (noise)

with \(S\) and \(W\) independent of each other.

The goal of inference is the signal \(S\) or the observable process \(Y\).

The \(n\) measurements of \(Y\) are observed values of a trajectory of \(Y\) at the \(n\) predetermined sites.

\[Y(x_1), ..., Y(x_n)\]
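
The decomposition above can be sketched numerically. This is only a minimal illustration, not a prescribed model: the linear trend, the exponential covariance chosen for \(S\), the noise standard deviation and all parameter values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# n predetermined sites in the unit square (assumed study region A)
n = 50
sites = rng.uniform(0.0, 1.0, size=(n, 2))

# Large-scale component: an assumed linear trend in the first coordinate
mu = 1.0 + 2.0 * sites[:, 0]

# Small-scale component: zero-mean Gaussian signal with an assumed
# exponential covariance sigma2 * exp(-distance / phi)
sigma2, phi = 1.0, 0.2
dist = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
K = sigma2 * np.exp(-dist / phi)
chol = np.linalg.cholesky(K + 1e-10 * np.eye(n))  # small jitter for stability
S = chol @ rng.standard_normal(n)

# Disturbance: Gaussian noise drawn independently of S
tau = 0.1
W = tau * rng.standard_normal(n)

# Observed values Y(x_1), ..., Y(x_n)
Y = mu + S + W
```

Only `Y` would be available in practice; `mu`, `S` and `W` are kept separate here to make the three components visible.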

Spatial stochastic processes

A spatial stochastic process (SSP) is a collection of random variables defined on the same probability space \((\Omega,F,P)\) and indexed by a parameter that varies in a region of \(R^d\) (for example, in two dimensions the index \(x \in R^2\) may be latitude and longitude).

\[S=\{ S(x): x \in A\}, \quad A \subseteq R^d, \quad x \text{ the index}\]

Usually in Geostatistics \(d=2\) and \(A\) is a bounded region of the plane.



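A spatial stochastic process is called Gaussian if, for every \(k \geq 1\) and every finite collection of sites \(x_1,...,x_k \in A\), the vector of process values is multivariate normal: \[(S(x_1),...,S(x_k)) \sim N_k(\underline \mu , \Sigma)\\ \underline \mu = (\mu(x_1),...,\mu(x_k)), \space \space \space E[S(x)]=\mu(x) \space \space \space \forall x \in A\\ \Sigma_{ij} = Cov(S(x_i),S(x_j))= K(x_i,x_j) \space \space \space \forall x_i, x_j \in A\]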
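
As a rough numerical check of this definition, one can fix a few sites, build \(\underline \mu\) and \(\Sigma\) from a mean function and a covariance function, and sample the resulting vector. The zero mean, the exponential form of \(K\), the sites and the sample size below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# k = 3 sites on the real line (d = 1), chosen arbitrarily
x = np.array([0.0, 0.5, 2.0])

def mean_fn(x):
    """Assumed mean function mu(x) = 0."""
    return np.zeros_like(x)

def kernel(xi, xj, phi=1.0):
    """Assumed covariance function K(x_i, x_j) = exp(-|x_i - x_j| / phi)."""
    return np.exp(-np.abs(xi - xj) / phi)

mu_vec = mean_fn(x)                      # vector (mu(x_1), ..., mu(x_k))
Sigma = kernel(x[:, None], x[None, :])   # matrix with entries K(x_i, x_j)

# Draw many realizations of (S(x_1), S(x_2), S(x_3))
samples = rng.multivariate_normal(mu_vec, Sigma, size=20000)

# Empirical moments, to be compared with mu_vec and Sigma
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)
```

With this many draws, `sample_mean` and `sample_cov` should be close to `mu_vec` and `Sigma`, up to Monte Carlo error.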

Stationarity and non-Stationarity

Stationarity: spatial dependence does not depend on the location where the variable is measured. Pairs of measurements separated by the same spatial lag \(h = s - u\) "resemble each other" to the same degree, regardless of the positions \(s\) and \(u\).

Non-stationarity: the spatial dependence of \(S\) depends on where it is measured.

Stationarity is an important property for practical purposes: it allows us to make inference from a single realization of the process, since pairs of sites at the same lag can be treated as replicates.

Two forms of stationarity are distinguished:

  • Strong stationarity: a process is strongly stationary if translating all the sites \(x_1,...,x_k\) by the same vector \(v\) (\(v \in R^d \space s.t. \space x_i +v \in A \subseteq R^d\)) does not change the joint distribution of the process at those sites.\[(S(x_1),...,S(x_k)) \overset{d}{=} (S(x_1+v),...,S(x_k+v)) \\k \geq 1 , \space \forall(x_1,...,x_k), \space x_i \in A\subseteq R^d \\ \forall v \in R^d \space s.t. \space x_i +v \in A\]
  • Weak stationarity: a process is weakly stationary if its mean is constant and the covariance between a pair of random variables depends only on the lag between their sites, not on where the sites are located.\[E[S(x)] = E[S(x+v)] = \mu \space \space \space \space \space \forall v \in R^d \space s.t. \space x +v \in A \\ Cov(S(x_1),S(x_2))=Cov(S(x_1+v),S(x_2+v)) \space \space \space \space \space \forall v \in R^d \space s.t. \space x_i +v \in A, \space i=1,2\]

Strong stationarity implies weak stationarity; if the process is Gaussian, the two notions are equivalent.
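
The covariance condition for weak stationarity can be illustrated by translating a set of sites and comparing covariance matrices. The two covariance functions below are assumptions picked for the example: an exponential function of the lag, and a linear kernel \(K(x,y) = x \cdot y\) that depends on the positions themselves. The sketch only checks the second-order condition; a constant mean is taken for granted.

```python
import numpy as np

def cov_matrix(sites, kernel):
    """Covariance matrix with entries K(x_i, x_j) for all pairs of sites."""
    return np.array([[kernel(xi, xj) for xj in sites] for xi in sites])

def stationary_kernel(x, y, phi=0.5):
    """Depends on x and y only through the lag x - y."""
    return np.exp(-np.linalg.norm(x - y) / phi)

def nonstationary_kernel(x, y):
    """Linear kernel: depends on where x and y are, not only on the lag."""
    return float(np.dot(x, y))

sites = np.array([[0.1, 0.2], [0.4, 0.9], [0.7, 0.3]])
v = np.array([0.25, -0.1])  # common translation applied to every site

# Unchanged under translation for the lag-based kernel ...
same = np.allclose(cov_matrix(sites, stationary_kernel),
                   cov_matrix(sites + v, stationary_kernel))

# ... but changed for the position-based kernel
changed = not np.allclose(cov_matrix(sites, nonstationary_kernel),
                          cov_matrix(sites + v, nonstationary_kernel))
```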

Local stationarity

In practical situations, the assumption of stationarity over the entire study area is difficult to verify.

By local stationarity we mean that the stationarity conditions (second-order or intrinsic) hold only for lags with \(|h| \leq b\), for some \(b > 0\). In many real applications, local stationarity is reasonable and often sufficient for analysis purposes (e.g., in local predictions).


Covariance function or covariogram

Let \(h\) denote the difference between two locations \(x\) and \(y\), \(h = y - x\). For a weakly stationary process the covariance function is

\[C(h)= Cov(S(x),S(x+h))\]

with \(x, x + h \in A\); that is, the covariance between \(S(x)\) and \(S(y)\), \(y = x + h\), depends only on \(h = y - x\) (in space-time or purely temporal processes this rule need not hold, since the process may also vary over time).

Properties:
  • Homoscedasticity:\[C(0)= Var(S(x))\geq |C(h)|\]
  • Symmetry:\[C(h) = C(-h)\]
  • The function \(C(h)\) is positive definite

Correlation function or correlogram

\[ \rho(h) = Corr(S(x),S(x+h)) \space \space \space \space x, x + h \in A\]

Relationship between correlogram and covariogram: \(\rho(h)=\frac{C(h)}{C(0)}\)
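
The properties of the covariogram and its link to the correlogram can be checked numerically for a specific model. The exponential form \(C(h) = \sigma^2 \exp(-|h|/\phi)\) and the parameter values used here are assumptions for illustration. Positive definiteness is only checked on one finite set of sites, which is a necessary condition rather than a proof.

```python
import numpy as np

def covariogram(h, sigma2=2.0, phi=0.3):
    """Assumed exponential model: C(h) = sigma2 * exp(-|h| / phi)."""
    return sigma2 * np.exp(-np.linalg.norm(h, axis=-1) / phi)

rng = np.random.default_rng(2)
lags = rng.uniform(-1.0, 1.0, size=(100, 2))  # random lag vectors h

C0 = covariogram(np.zeros(2))  # C(0) = Var(S(x))
C = covariogram(lags)
rho = C / C0                   # correlogram rho(h) = C(h) / C(0)

# C(0) >= |C(h)| and C(h) = C(-h)
bounded = bool(np.all(np.abs(C) <= C0))
symmetric = bool(np.allclose(C, covariogram(-lags)))

# Positive definiteness on a finite set of sites: the matrix with
# entries C(x_i - x_j) should have no negative eigenvalues
sites = rng.uniform(0.0, 1.0, size=(30, 2))
K = covariogram(sites[:, None, :] - sites[None, :, :])
min_eig = np.linalg.eigvalsh(K).min()
```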

Variogram

A different approach to describing spatial dependence is the variogram. It is not a measure of correlation but a measure of the variability of the increments of the process:

\[2 \gamma(h)= Var[S(x+h)-S(x)]\]

where \(\gamma(h)\) is called the semivariogram.

Moreover, if \(E[S(x)]=\mu\) \(\forall x \in A\) then \(2 \gamma(h)= E[(S(x+h)-S(x))^2]\)

The variogram describes how the dissimilarity between \(S(x)\) and \(S(x+h)\) changes as the vector \(h\), the spatial lag between two variables of the random process, varies; it therefore expresses the degree of spatial regularity of the process. An SSP that admits a variogram is called intrinsically stationary.
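
In practice the semivariogram is estimated from data. A minimal sketch, assuming equally spaced observations along a one-dimensional transect and a constant mean, averages squared differences at each integer lag (the classical method-of-moments estimator). The function name `empirical_semivariogram` and the toy data are hypothetical.

```python
import numpy as np

def empirical_semivariogram(values, max_lag):
    """Half the mean squared difference between values h steps apart.

    Assumes equally spaced sites and a constant mean, so that
    2 * gamma(h) = E[(S(x + h) - S(x))^2].
    """
    values = np.asarray(values, dtype=float)
    gamma = {}
    for h in range(1, max_lag + 1):
        diffs = values[h:] - values[:-h]   # all pairs at lag h
        gamma[h] = 0.5 * np.mean(diffs ** 2)
    return gamma

# Toy transect with four observations
gamma_hat = empirical_semivariogram([1.0, 2.0, 4.0, 7.0], max_lag=2)
```

For the toy data, lag 1 has differences 1, 2, 3 and lag 2 has differences 3, 5, so the estimates are 14/6 and 8.5 respectively.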

Intrinsically stationary processes

A spatial stochastic process \(S\) is intrinsically stationary (also called a first-order intrinsically stationary process) if it admits a variogram \[2\gamma(h)=Var(S(x+h)-S(x))\]

Properties of the semivariogram

  • \(\gamma(h) = \gamma(-h) \rightarrow\) it is an even function
  • \(\gamma(h)\geq0\) with \(\gamma(0)=0\space \rightarrow\) it is a non-negative function
  • It is a conditionally negative definite function \[\forall x_1,...,x_k \space \space \space \& \space \space \space \forall a_1,...,a_k \space \space \space s.t. \space \space \space \sum_{i=1}^k a_i =0 \\ \sum_i \sum_j a_i \space a_j \space\gamma(x_i-x_j)\leq 0\]
  • If \(S\) is a weakly stationary SSP, then it is also intrinsically stationary
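
The last property follows from a short calculation. Assume \(S\) is weakly stationary with covariogram \(C(h)\), so that \(Var(S(x)) = C(0)\) for every \(x\). Then

\[2\gamma(h) = Var[S(x+h)-S(x)] \\ = Var[S(x+h)] + Var[S(x)] - 2\,Cov(S(x),S(x+h)) \\ = 2C(0) - 2C(h)\]

Hence \(\gamma(h) = C(0) - C(h) = C(0)\,(1-\rho(h))\), which depends only on \(h\). The converse does not hold in general: a process can admit a variogram without having a finite, location-independent variance (Brownian motion is a standard example).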

Isotropy

A further invariance property of a spatial stochastic process widely used in statistical applications is isotropy, which requires the first two moments of the process to be invariant under rotations and reflections, as well as translations.

In other words, spatial dependence does not depend on direction but only on the distance between sites.

The term is borrowed from physics, where isotropy refers to the property of bodies to have the same physical characteristics in all directions.

Formally, a stationary SSP is called isotropic if:

\[\forall x,y \in A \space \space Cov(S(x),S(y))=C(y-x)=C^0(|y-x|) \]

that is, if the covariance between the process values at two arbitrary locations depends only on the Euclidean distance between them. This means that the dependence structure (measured in terms of covariance) of the process value at a given point with respect to the values of the surrounding region is the same regardless of the direction in which one moves away from the given point.

Explanation in non-formal terms:

To picture an isotropic or anisotropic process, think of throwing a stone into a calm lake: if the stone has a regular shape, it will create waves that start from the point where it touches the water and spread in all directions with the same intensity; in this case the process is isotropic. If instead an irregularly shaped stone is thrown, it will create waves that spread in different directions with different intensity; in this case the process is anisotropic.
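
The difference can also be seen by evaluating a covariance function at two lags of equal length but different direction. Both functions below are assumptions for illustration: an exponential covariance of the Euclidean distance, and a geometrically anisotropic variant in which the first coordinate is rescaled by a hypothetical `stretch` factor before the distance is taken.

```python
import numpy as np

def cov_isotropic(h, sigma2=1.0, phi=0.3):
    """Depends on the lag h only through its Euclidean length |h|."""
    return sigma2 * np.exp(-np.linalg.norm(h) / phi)

def cov_anisotropic(h, sigma2=1.0, phi=0.3, stretch=3.0):
    """Geometric anisotropy: dependence decays more slowly along axis 0."""
    A = np.diag([1.0 / stretch, 1.0])  # rescale the first coordinate
    return sigma2 * np.exp(-np.linalg.norm(A @ np.asarray(h)) / phi)

# Two lags with the same length but different directions
east = np.array([0.2, 0.0])
north = np.array([0.0, 0.2])

iso_east, iso_north = cov_isotropic(east), cov_isotropic(north)
aniso_east, aniso_north = cov_anisotropic(east), cov_anisotropic(north)
```

The isotropic model returns the same covariance in both directions. The anisotropic one returns a larger covariance along the first axis, where the dependence has been stretched.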