Spatial Statistics (I): Random Field

Definition and Properties

Definition. Given a probability space $(\Omega, \mathcal{F},\mathbb{P})$, a random field is a family or collection of random variables indexed by elements in a topological space $\mathcal{T}$. That is, a random field $Z(\cdot)$ is a collection of random variables

\[\left\lbrace Z(s): \Omega\to\mathbb{R} \ \vert\ s\in\mathcal{T}\right\rbrace,\]

where each $Z(s)$ is a random variable indexed by $s$.

Some examples are shown below.

Time Series: $X(t),\ t\in\mathbb{Z},$ like random walk;

Stochastic Process: $X(t),\ t\in\mathbb{R}$, like Poisson process, Brownian process, etc.;

Spatial Process: $Z(s),\ s\in\mathcal{D}\subseteq \mathbb{R}^d,\ d\geq 2$;

Spatio-temporal Process: $Z(s,t),\ s\in\mathcal{D},\ t\in\mathbb{R}$.

Mean and Covariance

For a random field $Z(\cdot)$ defined on $\mathcal{D}\subseteq \mathbb{R}^d$, the mean function is defined as a function $\mu_Z:\mathcal{D}\to\mathbb{R}$, given by

\[\mu_Z(s) := \mathbb{E}[Z(s)];\]

the covariance function $K_Z:\mathcal{D}\times\mathcal{D}\to\mathbb{R}$ is defined as

\[K_Z(s_1,s_2) := \mathrm{Cov}\left(Z(s_1),Z(s_2)\right) = \mathbb{E}\left[\left(Z(s_1)-\mu_Z(s_1)\right)\left(Z(s_2)-\mu_Z(s_2)\right)\right].\]

By definition, the covariance function is symmetric. Furthermore, it is positive definite, i.e., $\forall n\in\mathbb{N}^*,\ \forall s_1,\cdots,s_n\in\mathcal{D}$, the Gram matrix defined by $\mathbf{K} = \lbrace K_Z(s_i,s_j)\rbrace_{i,j=1}^n$ is positive semidefinite.

Properties of Covariance Function

Suppose that $K(\cdot,\cdot)$ is a valid covariance function defined on $\mathcal{D}\times\mathcal{D}$, where $\mathcal{D}\subseteq\mathbb{R}^d$.

Definition (Stationarity). A covariance function $K$ is called stationary if $\forall s_1,s_2\in\mathcal{D}$, $K(s_1,s_2)$ only depends on $s_1-s_2$. That is, there exists some $K_1:\mathbb{R}^d\to\mathbb{R}$ such that

\[K(s_1,s_2) = K_1(s_1-s_2)\ \ \forall s_1,s_2\in\mathcal{D}.\]

A stationary covariance function is invariant to translation.

Definition (Isotropy). A covariance function $K$ is called isotropic if $\forall s_1,s_2\in\mathcal{D}$, $K(s_1,s_2)$ only depends on $\Vert s_1-s_2\Vert$, where $\Vert\cdot\Vert$ denotes $L_2$-norm. That is, there exists some $K_2:\mathbb{R}^*\to\mathbb{R}$ such that

\[K(s_1,s_2) = K_2(\Vert s_1-s_2\Vert)\ \ \forall s_1,s_2\in\mathcal{D}.\]

Definition (Dot product). A covariance function $K$ has rotational invariance if $\forall s_1,s_2\in\mathcal{D}$, $K(s_1,s_2)$ only depends on their dot product $s_1\cdot s_2$. That is, there exists some $K_3:\mathbb{R}\to\mathbb{R}$ such that

\[K(s_1,s_2) = K_3(s_1\cdot s_2)\ \ \forall s_1,s_2\in\mathcal{D}.\]

Staionarity and Isotropy of Random Field

Definition. Given a probability space $(\Omega, \mathcal{F},\mathbb{P})$. A random field $Z(\cdot)$ is called stationary (isotropic) if it satisfies the following 3 conditions:

$Z(\cdot)$ has constant mean, that is, $\mu_Z$ is a constant;
$Z(\cdot)$ has finite 2nd moment, that is, $Z(\cdot)\subseteq L_2(\Omega)$;
$Z(\cdot)$ has stationary (isotropic) covariance function.

A random field $Z’(\cdot)$ is called anisotropic if it is not isotropic.

Definition (Geometric anisotropy). A random field $Z(\cdot)$ is called geometrically anisotropic if it is isotropic after a linear transformation, i.e. there exists a non-singular matrix $V\in\mathbb{R}^{d\times d}$ such that $Z’: s \mapsto Z(Vs)$ is isotropic.

Variogram

Definition. Given a random field $Z(\cdot)$ defined on $\mathcal{D}\subseteq\mathbb{R}^d$, the variogram of $Z(\cdot)$ is defined as the variance of the difference between field values at two locations:

\[2\gamma(s_1,s_2) := \mathrm{Var}\lbrace Z(s_1) - Z(s_2)\rbrace,\]

where $\gamma$ is called the semivariogram.

A stationary variogram and semivariogram can be represented as a function of the difference $h=s_1-s_2$ between locations only:

\[2\gamma(h) := \mathrm{Var}\lbrace Z(s + h) - Z(s)\rbrace.\]

In the case of stationary random field, we have

\[2\gamma(h) = 2K_Z(0) - 2K_Z(h),\]

where $K_Z$ is the covariance function of stationary field $Z(\cdot)$. Hence, a stationary random field has a stationary variogram.

Analogously, an isotropic variogram and semivariogram can be represented as a function of the distance $\Vert h\Vert = \Vert s_1-s_2\Vert$ between locations only:

\[2\gamma(\Vert h\Vert) := \mathrm{Var}\lbrace Z(s + h) - Z(s)\rbrace.\]

In general, a variogram gives a description of the spatial continuity of our data.

Properties

Property 1. The semivariogram is nonnegative and symmetric, that is, $\gamma(s_1,s_2)=\gamma(s_2,s_1) \geq 0\ \forall s_1,s_2\in\mathcal{D}$.

Property 2. The (isotropic) semivariogram at distance 0 is always 0, since $Z(s) - Z(s) \equiv 0$.

Property 3. A function $\gamma$ is a semivariogram if and only if it is a conditionally negative definite function, i.e. for all weights $w_1,\cdots,w_n$ subjected to $w_1+\cdots + w_n = 0$ and locations $s_1,\cdots,s_n \in\mathcal{D}$, it holds

\[\sum_{i=1}^n\sum_{j=1}^n w_iw_j\gamma(s_i,s_j) \leq 0.\]

Proof. Consider a random vector $\mathbf{Z} = \left(Z(s_1),\cdots,Z(s_n)\right)^\top$ with mean $\mu=\mathbb{E}\mathbf{Z}$ and covariance matrix $\mathbf{K}=\lbrace K(s_i,s_j)\rbrace_{i,j=1}^n$. Define the semivariogram matrix of $\mathbf{Z}$ to be $\Gamma = \lbrace\gamma(s_i,s_j)\rbrace_{i,j=1}^n$. By definition,

\[\Gamma = \frac{1}{2}\mathrm{vdiag}(\mathbf{K})\mathbf{1}_n^\top + \frac{1}{2}\mathbf{1}_n\mathrm{vdiag}(\mathbf{K})^\top - \mathbf{K}\]

where $\mathrm{vdiag}(\mathbf{K}) = (\mathbf{K}_{11},\cdots,\mathbf{K} _{nn})^\top$ and $\mathbf{1}_n$ is an $n$-dimensional all-one vector. All diagonal entries of $\Gamma$ are zero.

Hence for any $\mathbf{w}\in\mathbb{R}^n$ such that $\mathbf{1}^n\mathbf{w}=0$,

\[\mathbf{w}^\top\Gamma\mathbf{w} = -\mathbf{w}^\top\mathbf{Kw} \leq 0,\]

which implies that $\gamma$ is negative definite.

Nugget Effect

Definition. The nugget effect is the variation of the process at a finer scale than the smallest distance measured:

\[c_0 := \lim_{h\to 0}\gamma(h).\]

It can be caused by random noise or measurement errors, and is shown graphically in the variogram plot as a discontinuity at the origin of the function.

avatar

Empirical Variogram

Suppose that we have observations $z(s_1),\cdots,z(s_n)$, an empirical semivariogram is given by

\[\hat{\gamma}(h) = \frac{1}{2\vert\mathcal{N}(h)\vert}\sum_{(i,j)\in\mathcal{N}(h)}\left\lbrace z(s_i) - z(s_j)\right\rbrace^2,\]

where $\mathcal{N}(h)$ is a set of observation pairs whose distance is close to $h$.

Blog Archive

Archive of all previous blog posts

Spatial Statistics (II): Kernel Functions