Kriging


In-depth Articles

Spatial Prediction

The objective of spatial analysis is to predict unobserved values of a measurable quantity at predefined spatial locations starting from its sample measurements collected at a finite number of sites \(s_1,...,s_n\).

It is assumed that the phenomenon is defined in a study region \(D\) and can be represented by spatial coordinates \(s\).

\[ Y(s):s \in D \]

Thus \(Y(s_1),...,Y(s_n)\) are the observed values at a set of known sites (\(s_1,...s_n\)), usually \(s_i\) corresponds to an ordered pair (abscissa, ordinate) for example (longitude, latitude).

The prediction can be point-wise (or interval-based), binary with respect to reference thresholds, and over areas in the study region.

There are two approaches for spatial prediction:

In stochastic prediction, the existence of a probabilistic model is assumed, while in non-stochastic prediction, the existence of such an underlying model is not assumed. In this article we will only deal with the stochastic approach.

Kriging

Kriging is a method used for spatial prediction introduced by Matheron and Gandin in the early 1960s, formalizing previous studies conducted by Krige (1951).

The main objective of this method is to obtain a map of the process.

In the class of linear predictors, Kriging is the one that minimizes the mean squared prediction error and, for this reason, is considered an optimal (linear) prediction method. Kriging can be classified into three types depending on the assumptions about the mean function of the spatial process that defines the regionalized variable. In particular, if the trend is known, the Kriging used is called simple; if it is unknown but constant, the prediction method is called ordinary Kriging; while if it is a function of locations (therefore unknown and non-constant), it is called universal Kriging.

Kriging is a regression method used in the field of spatial analysis (Geostatistics) that allows interpolation of a quantity in space, minimizing the mean squared error.

A strength of this type of interpolation is that it not only generates an interpolated spatial model, but also generates an estimate of the uncertainty of each point in that model. Unlike linear regression or inverse distance weighted interpolation, Kriging interpolation is primarily based on empirical observations, the observed sample data points, rather than on a pre-assumed model. The interpolation assigns greater weight to sample points in the vicinity of a location compared to more distant ones and, in order to reduce sampling bias, weights clusters less heavily than individual points. The value of each point is calculated in such a way as to minimize the predicted error for that particular point.

Let us assume

\[Y(s)=\mu(s)+Z(s)\] where:

Assume that the process has a variogram equal to \(2\gamma(h)\) (and consequently a semivariogram \(\gamma(h)\)).

For simplicity of notation in this article, we will assume that \(H\) is the matrix of distances between all coordinates \(s\) and that \(\Gamma\) is the semivariogram calculated on all distances of \(H\).

We want to predict the value of the regionalized variable at \(s_0\) given some observations \(Y(s_1),..., Y(s_n)\).

In Ordinary Kriging (OK) to obtain the coefficients for coordinates \(s_0\) we calculate:

\[\Gamma_0 = \left(\begin{array}{cc} \Gamma & 1\\ 1 & 0 \end{array}\right) \]

\[\gamma_0 = \left(\begin{array}{cc} \gamma(||s_1-s_0||) \\ \gamma(||s_1-s_0||)\\ ...\\ \gamma(||s_n-s_n||)\\ 1 \end{array}\right)\]

Finally, we estimate the coefficients:

\[ \alpha_0=\Gamma_0^{^{-1}} \gamma_0 \]

and we find \[\hat Y(s_0)=\sum_{i=1}^n \alpha_{0 i}\space\space s_{0 i}\]

In Simple Kriging (SK) it is assumed that the stochastic process is an SSP with mean equal to known \(\mu(s)\), so the same reasoning applied for Ordinary Kriging is used but on \(Z(s)=Y(s)-\mu(s)\) and when predictions need to be made, the known large-scale component is added to the estimate of the small-scale component.

In Universal Kriging (UK) it is necessary to first estimate \(\mu(s)\) with a polynomial regression of degree \(p\)

\[Y(s)=f_1(s)+...f_p(s)+Z(s)\]

we create the matrix \(X\) by calculating for each i-th coordinate and for each j-th degree of the polynomial, the function \(f_{j-1-n}(s_i)\)

\[\Gamma_0 = \left(\begin{array}{cc} \Gamma & X\\ X^T & 0 \end{array}\right) \]

\[\gamma_0 = \left(\begin{array}{cc} \gamma(||s_1-s_0||) \\ \gamma(||s_1-s_0||)\\ ...\\ \gamma(||s_n-s_n||)\\ 1\\ f_1(s_0)\\ ...\\ f_p(s_0) \end{array}\right)\]

Finally, we estimate the coefficients:

\[ \alpha_0=\Gamma_0^{^{-1}} \gamma_0 \]

and we find \[\hat Y(s_0)=\sum_{i=1}^n (\alpha_{0 i}\space\space \gamma(s_{0 i}-s_i))+\sum_{i=n+1}^{n+p+1} (\alpha_{0 i}\space\space f_{i-n-1}( s_{0 i}))\]

# Import data
require(geoR)
## Loading required package: geoR
## --------------------------------------------------------------
##  Analysis of Geostatistical Data
##  For an Introduction to geoR go to http://www.leg.ufpr.br/geoR
##  geoR version 1.7-5.2.1 (built on 2016-05-02) is now loaded
## --------------------------------------------------------------
dat.om<-soja98; str(dat.om)
## 'data.frame':    256 obs. of  10 variables:
##  $ X    : num  5.6 15.2 24.8 34.4 44 53.6 63.2 72.8 82.4 92 ...
##  $ Y    : num  3.6 1.6 4.6 5.6 1.6 1.6 2.6 4.6 3.6 2.6 ...
##  $ P    : num  4.4 3.7 3.5 3.9 3.6 3.7 2.9 2.2 3 2.7 ...
##  $ PH   : num  6.3 6.3 5.4 5.2 5.6 5.2 5.2 5.6 5.7 5.4 ...
##  $ K    : num  0.53 0.38 0.47 0.48 0.52 0.45 0.47 0.42 0.36 0.35 ...
##  $ MO   : num  47.5 48.7 51.8 57.9 59.7 ...
##  $ SB   : num  80.6 82.2 61.5 59.1 71.6 ...
##  $ iCone: num  31 29.5 23 24.5 18.5 21.5 13 23 20 20 ...
##  $ Rend : num  6.4 7.05 5.88 8.52 7.47 ...
##  $ PROD : num  2.56 2.82 2.35 3.41 2.99 3.73 2.91 3.47 3 3.58 ...
obj <- dat.om[,c("X","Y","MO")]
summary(obj)
##        X                Y                MO       
##  Min.   :  1.60   Min.   :  1.60   Min.   :36.55  
##  1st Qu.: 38.60   1st Qu.: 29.60   1st Qu.:48.07  
##  Median : 75.60   Median : 57.60   Median :53.61  
##  Mean   : 75.85   Mean   : 57.29   Mean   :52.55  
##  3rd Qu.:112.60   3rd Qu.: 85.60   3rd Qu.:56.97  
##  Max.   :149.60   Max.   :113.60   Max.   :68.35
attach(obj)



# Estimate variogram with WLS
soja.geo <- as.geodata(obj,coords.col=1:2,data.col=3)
soja.var <- variog(soja.geo, estimator.type="classical")
## variog: computing omnidirectional variogram
plot(soja.var, main="empirical variogram")     
soja.var.fit <-variofit(soja.var,ini.cov.pars=c(25,35),weights="cressie", 
                        cov.model="exponential", fix.nugget=FALSE,nugget=20)
## variofit: covariance model used is exponential 
## variofit: weights used: cressie 
## variofit: minimisation function used: optim
lines(soja.var.fit);    soja.var.fit

## variofit: model parameters estimated by WLS (weighted least squares):
## covariance model is: exponential
## parameter estimates:
##   tausq sigmasq     phi 
## 14.8772 37.1710 56.4090 
## Practical Range with cor=0.05 for asymptotic range: 168.9862
## 
## variofit: minimised weighted sum of squares = 118.3311
a<-soja.var.fit$cov.pars
size=30
YY<-round(seq(ceiling(min(Y)),floor(max(Y)), length=size),2)
XX<-round(seq(ceiling(min(X)),floor(max(X)), length=size),2)
griglia<-expand.grid(X=XX,Y=YY)
dim(griglia)
## [1] 900   2

Kriging ordinario

krg.or <- krige.conv(geodata=soja.geo, loc=griglia,
                     krige=krige.control(cov.pars=a,
                                         cov.model="exponential",
                                         nugget=soja.var.fit$nugget))
## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood
par(cex=0.5,mfrow=c(2,2))
# 'altosinistra'
image(krg.or,xlim=c(-15,floor(max(X))),ylim=c(0,120))
legend.krige(x.leg=c(-13,-8), y.leg=c(30,100), val=krg.or$predict,
                 vert=TRUE, off=0.7, cex=0.9, scale.vals=pretty(krg.or$predict))
contour(krg.or,add=T,coords.data=soja.geo$coords,cex=0.3)
# 'altodestra'
persp(krg.or,xlab="longitudine",ylab="latitudine",zlab="log-catch",ticktype="detailed")
# 'bassosinistra'
image(krg.or, val=sqrt(krg.or$krige.var), main="Kriging std. errors", xlim=c(min(X),floor(max(X))))
legend.krige(x.leg=c(-13,-8), y.leg=c(30,100), val=krg.or$krige.var,vert=TRUE, off=0.7, cex=0.9, scale.vals=pretty(krg.or$krige.var))
# 'bassodestra'
points.geodata(soja.geo,pt.divide="data.proportional")

par(cex=1,mfrow=c(1,1))

Kriging universale

obj.var.u <- variog(soja.geo, estimator.type="classical", trend="2nd")  
## variog: computing omnidirectional variogram
obj.var.fit.u <- variofit(obj.var.u,ini.cov.pars=c(10,40),  cov.model="exponential", fix.nugget=FALSE,nugget=15)
## variofit: covariance model used is exponential 
## variofit: weights used: npairs 
## variofit: minimisation function used: optim
plot(obj.var.u)
lines(obj.var.fit.u)

krg.uk <- krige.conv(geodata=soja.geo, 
                     loc=griglia,
                     krige=krige.control(type.krige = "ok",
                                         trend.d="2nd",
                                         trend.l="2nd",
                                         cov.pars=obj.var.fit.u$cov.pars,
                                         nugget=obj.var.fit.u$nugget,
                                         cov.model="exponential"))
## krige.conv: model with mean given by a 2nd order polynomial on the coordinates
## krige.conv: Kriging performed using global neighbourhood
par(cex=0.5,mfrow=c(2,2))
# 'altosinistra'
image(krg.uk, main="Kriging universale",xlim=c(-15,floor(max(X))),ylim=c(0,120))
legend.krige(x.leg=c(-13,-8), y.leg=c(30,100), val=krg.uk$predict,
             vert=TRUE, off=0.7, cex=0.9, scale.vals=pretty(krg.uk$predict))
contour(krg.uk,add=T)
# 'altodestra'
persp(krg.uk,xlab="longitudine",ylab="latitudine",zlab="log-catch",ticktype="detailed")
# 'bassosinistra'
image(krg.uk, val=sqrt(krg.uk$krige.var), main="Kriging std. errors", xlim=c(min(X),floor(max(X))))
legend.krige(x.leg=c(-13,-8), y.leg=c(30,100), val=krg.uk$krige.var,vert=TRUE, off=0.7, cex=0.9, scale.vals=pretty(krg.uk$krige.var))
# 'bassodestra'
points.geodata(soja.geo,pt.divide="data.proportional")

par(cex=1,mfrow=c(1,1))

Risultati

krg.uk$beta     
##         beta0         beta1         beta2         beta3         beta4 
##  4.998546e+01 -3.453553e-02  2.682163e-02 -9.281561e-06  7.409176e-05 
##         beta5 
##  7.958683e-04
krg.or$beta
##     beta 
## 52.76726