Using Kriging in 3D Analyst


The Inverse Distance Weighted (IDW) and Spline methods are referred to as deterministic interpolation methods because they are directly based on the surrounding measured values or on specified mathematical formulas that determine the smoothness of the resulting surface. A second family of interpolation methods consists of geostatistical methods (such as Kriging), which are based on statistical models that include autocorrelation (the statistical relationship among the measured points). Because of this, not only do geostatistical techniques have the capability of producing a prediction surface, but they can also provide some measure of the certainty or accuracy of the predictions.

Kriging assumes that the distance or direction between sample points reflects a spatial correlation that can be used to explain variations in the surface. Kriging fits a mathematical function to a specified number of points, or all points within a specified radius, to determine the output value for each location. Kriging is a multistep process; it includes exploratory statistical analysis of the data, variogram modeling, creating the surface, and (optionally) exploring a variance surface. Kriging is most appropriate when you know there is a spatially correlated distance or directional bias in the data. It is often used in soil science and geology.

The Kriging formula

Kriging is similar to IDW in that it weights the surrounding measured values to derive a prediction for an unmeasured location. The general formula for both interpolators is formed as a weighted sum of the data:

$$\hat{Z}(s_0) = \sum_{i=1}^{N} \lambda_i \, Z(s_i)$$

where:

Z(si) = the measured value at the ith location.

λi = an unknown weight for the measured value at the ith location.

s0 = the prediction location.

N = the number of measured values.

In IDW, the weight, λi, depends solely on the distance to the prediction location. However, in Kriging, the weights are based not only on the distance between the measured points and the prediction location but also on the overall spatial arrangement among the measured points. To use the spatial arrangement in the weights, the spatial autocorrelation must be quantified. Thus, in ordinary Kriging, the weight, λi, depends on a model fitted to the measured points, the distance to the prediction location, and the spatial relationships among the measured values around the prediction location. The following sections discuss how the general Kriging formula is used to create a map of the prediction surface and a map of the accuracy of the predictions.
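As a minimal sketch of the weighted-sum formula, the following Python example computes a prediction using IDW-style weights that depend only on distance. The function name `idw_predict` and the `power` parameter are illustrative assumptions, not part of any ArcGIS API. Kriging uses the same weighted sum but derives the weights differently.

```python
import math

def idw_predict(points, values, s0, power=2.0):
    """Predict the value at s0 as a weighted sum of measured values.

    Hypothetical helper: weights are inverse-distance weights, so they
    depend only on the distance from each measured point to s0.
    """
    raw = []
    for (x, y), z in zip(points, values):
        d = math.hypot(x - s0[0], y - s0[1])
        if d == 0.0:
            return z  # s0 coincides with a measured location
        raw.append(1.0 / d ** power)
    total = sum(raw)
    weights = [w / total for w in raw]  # normalize so the weights sum to 1
    # General formula: sum over i of lambda_i * Z(s_i)
    return sum(w * z for w, z in zip(weights, values))

points = [(0.0, 0.0), (2.0, 0.0)]
values = [10.0, 20.0]
print(idw_predict(points, values, (1.0, 0.0)))  # midpoint, equal weights: 15.0
```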

Creating a prediction surface map with Kriging

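To make a prediction with the Kriging interpolation method, two tasks are necessary: uncovering the dependency rules in the data and making the predictions.

To realize these two tasks, Kriging goes through a two-step process: first, it fits a model that estimates the spatial autocorrelation of the measured values; second, it predicts the unknown values.

These two distinct tasks are why Kriging is said to use the data twice: the first time to estimate the spatial autocorrelation of the data and the second time to make the predictions.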

Variography

Fitting a model, or spatial modeling, is also known as structural analysis, or variography. When spatially modeling the structure of measured points, you begin with a graph of the empirical semivariogram, computed as:

Semivariogram(distance h) = 0.5 * average[(value at location i – value at location j)^2]

for all pairs of locations separated by distance h. The formula involves calculating the difference squared between the values of the paired locations. The image below shows the pairing of one point (the red point) with all other measured locations. This process continues for each measured point.


[Figure: one point (shown in red) paired with all other measured locations]

Each pair of locations has a unique distance, and there are often many pairs of points. Plotting all pairs quickly becomes unmanageable. Instead of plotting each pair, the pairs are grouped into lag bins. For example, compute the average semivariance for all pairs of points that are more than 40 meters but less than 50 meters apart. The empirical semivariogram is a graph of the averaged semivariogram values on the y-axis and the distance (or lag) on the x-axis (see the following illustration).


[Figure: empirical semivariogram, with averaged semivariogram values on the y-axis and distance (lag) on the x-axis]
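The pairing and binning steps can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `empirical_semivariogram` is hypothetical, bins are equal-width and half-open, and every pair is counted once regardless of direction.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, lag_width):
    """Group all point pairs into lag bins and average within each bin.

    Returns {bin_index: (mean_distance, semivariance, pair_count)}, where
    bin k holds pairs with k * lag_width <= distance < (k + 1) * lag_width.
    """
    # Per bin: [sum of distances, sum of squared differences, pair count]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            acc = sums[int(d // lag_width)]
            acc[0] += d
            acc[1] += (values[i] - values[j]) ** 2
            acc[2] += 1
    # Semivariance = 0.5 * average squared difference within the bin
    return {b: (sd / c, 0.5 * sq / c, c)
            for b, (sd, sq, c) in sorted(sums.items())}

points = [(0, 0), (10, 0), (20, 0)]
values = [1, 3, 7]
for b, (dist, gamma, count) in empirical_semivariogram(points, values, 10).items():
    print(b, dist, gamma, count)
```

Plotting `mean_distance` against `semivariance` for each bin gives the empirical semivariogram described above.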

Spatial autocorrelation quantifies a basic principle of geography: things that are closer are more alike than things farther apart. Thus, pairs of locations that are closer (far left on the x-axis of the semivariogram cloud) should have more similar values (low on the y-axis of the semivariogram cloud). As pairs of locations become farther apart (moving to the right on the x-axis of the semivariogram cloud), they should become more dissimilar and have a higher squared difference (move up on the y-axis of the semivariogram cloud).

Fitting a model to the empirical semivariogram

The next step is to fit a model to the points forming the empirical semivariogram. Semivariogram modeling is a key step between spatial description and spatial prediction. The main application of Kriging is the prediction of attribute values at unsampled locations. You have seen how the empirical semivariogram provides information on the spatial autocorrelation of datasets. However, it does not provide information for all possible directions and distances. For this reason, and to ensure that Kriging predictions have positive Kriging variances, it is necessary to fit a model (that is, a continuous function or curve) to the empirical semivariogram. Abstractly, this is similar to regression analysis, where a continuous line or curve is fitted to the data points.

To fit a model to the empirical semivariogram, select some function that serves as your model—for example, a spherical type that rises at first and then levels off for larger distances beyond a certain range (see below). There are deviations of the points on the empirical semivariogram from the model; some points are above the model curve, and some points are below. But if you add the distance each point is above the line and add the distance each point is below the line, the two values should be similar. There are many semivariogram models to choose from.
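To make the fitting idea concrete, here is a rough sketch that picks spherical-model parameters by an exhaustive grid search minimizing the sum of squared deviations. Grid search is a stand-in chosen for simplicity, not the fitting procedure ArcGIS uses. The names `spherical` and `fit_spherical` and the candidate grids are assumptions made for illustration.

```python
def spherical(h, nugget, partial_sill, rng):
    """Spherical semivariogram: rises, then levels off at distance rng."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + partial_sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def fit_spherical(lags, gammas, nuggets, partial_sills, ranges):
    """Try every candidate parameter combination and keep the one with
    the smallest sum of squared deviations from the empirical points.

    Returns (sse, nugget, partial_sill, range).
    """
    best = None
    for n0 in nuggets:
        for ps in partial_sills:
            for a in ranges:
                sse = sum((spherical(h, n0, ps, a) - g) ** 2
                          for h, g in zip(lags, gammas))
                if best is None or sse < best[0]:
                    best = (sse, n0, ps, a)
    return best

# Synthetic empirical points generated from known parameters (assumed data)
lags = [5.0 * k for k in range(1, 11)]
gammas = [spherical(h, 1.0, 4.0, 30.0) for h in lags]
best = fit_spherical(lags, gammas,
                     nuggets=[0.0, 1.0, 2.0],
                     partial_sills=[3.0, 4.0, 5.0],
                     ranges=[20.0, 30.0, 40.0])
print(best[1:])  # recovered (nugget, partial sill, range)
```

With real data the deviations do not vanish, and a finer grid or a weighted least-squares optimizer would typically replace the coarse search.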

Semivariogram models

ArcGIS Spatial Analyst provides the following functions from which to choose to model the empirical semivariogram: circular, spherical, exponential, Gaussian, and linear.

The selected model influences the prediction of the unknown values, particularly when the shape of the curve near the origin differs significantly. The steeper the curve is near the origin, the more influence the closest neighbors will have on the prediction. As a result, the output surface will be less smooth. Each model is designed to fit different types of phenomena more accurately.

The spherical model

This model shows a progressive decrease of spatial autocorrelation (equivalently, an increase of semivariance) until some distance, beyond which autocorrelation is zero. The spherical model is one of the most commonly used models.

The following illustrates two common models and identifies how the functions differ.


[Figure: spherical semivariogram model]

The exponential model

This model is applied when spatial autocorrelation decreases exponentially with increasing distance. Here the autocorrelation disappears completely only at an infinite distance. The exponential model is also a commonly used model. The choice of which model to use is based on the spatial autocorrelation of the data and on prior knowledge of the phenomenon.


[Figure: exponential semivariogram model]
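The difference between the two models can be seen numerically. The sketch below uses one common parameterization of each model. The factor of 3 in the exponential form (so that the curve reaches about 95 percent of the partial sill at the stated range) is an assumed convention, and the function and parameter names are illustrative.

```python
import math

def spherical(h, nugget, partial_sill, rng):
    """Reaches the sill exactly at distance rng and stays flat beyond it."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + partial_sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, partial_sill, rng):
    """Approaches the sill asymptotically; never quite reaches it."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / rng))

# Compare both curves near the origin, at the range, and beyond it
for h in (1.0, 15.0, 30.0, 60.0):
    print(h,
          round(spherical(h, 0.0, 1.0, 30.0), 4),
          round(exponential(h, 0.0, 1.0, 30.0), 4))
```

Under these assumptions the exponential curve is steeper near the origin, so the closest neighbors receive relatively more influence, while the spherical curve becomes exactly flat once the range is reached.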

Understanding a semivariogram—the range, sill, and nugget

As previously stated, the semivariogram depicts the spatial autocorrelation of the measured sample points. Because of a basic principle of geography (things that are closer are more alike), measured points that are close will generally have a smaller difference squared than those farther apart. Once each pair of locations is plotted (after being binned), a model is fit through them. Range, sill, and nugget are commonly used to describe these models.

The range and sill

When you look at the model of a semivariogram, you will notice that at a certain distance, the model levels out. The distance where the model first flattens out is known as the range. Sample locations separated by distances shorter than the range are spatially autocorrelated, whereas locations farther apart than the range are not.


[Figure: semivariogram model showing the range, sill, and nugget]

The value that the semivariogram model attains at the range (the value on the y-axis) is called the sill. The partial sill is the sill minus the nugget (see the following section).

The nugget

Theoretically, at zero separation distance (that is, lag = 0), the semivariogram value is zero. However, at an infinitesimally small separation distance, the semivariogram often exhibits a nugget effect, which is some value greater than zero. For example, if the semivariogram model intercepts the y-axis at 2, then the nugget is 2.

The nugget effect can be attributed to measurement errors or spatial sources of variation at distances smaller than the sampling interval (or both). Measurement error occurs because of the error inherent in measuring devices. Natural phenomena can vary spatially over a range of scales. Variation at microscales smaller than the sampling distances will appear as part of the nugget effect. Before collecting data, it is important to gain some understanding of the scales of spatial variation that you are interested in.

Making a prediction

After you have uncovered the dependence or autocorrelation in your data (see Variography), and finished with the first use of the data—using the spatial information in the data to compute distances and model the spatial autocorrelation—you can make a prediction using the fitted model. Thereafter, the empirical semivariogram is set aside.

You now use the data again to make predictions. Like IDW interpolation, Kriging forms weights from surrounding measured values to predict unmeasured locations. As with IDW interpolation, the measured values closest to the unmeasured locations have the most influence. However, the Kriging weights for the surrounding measured points are more sophisticated than those of IDW. IDW uses a simple algorithm based on distance, but Kriging weights come from a semivariogram that was developed by looking at the spatial nature of the data. To create a continuous surface of the phenomenon, predictions are made for each location (cell centers) in the study area based on the semivariogram and the spatial arrangement of measured values that are nearby.

Search radius

A basic principle of geography says that things that are close to one another are more alike than things farther away. Using this principle, you can establish your search radius or neighborhood by assuming that as the locations get farther from the prediction location, the measured values will have less spatial autocorrelation with the unknown value for the location you are predicting. Thus, you can eliminate those locations that are farther away with little influence. Not only is there less relationship with locations that are farther away, but it is possible that the locations that are farther away may have a negative influence if they are located in an area much different from the prediction location.

The search radius also affects computational speed: the smaller the search radius, the faster the predictions can be made. As a result, it is common practice to limit the number of points that are used when making a prediction by specifying a search neighborhood. The specified shape of the neighborhood restricts how far and where to look for the measured values to be used in the prediction. Other neighborhood parameters restrict the point locations that will be used within that shape; for example, you can define the maximum and minimum number of measured points to use within the neighborhood.
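A search neighborhood of this kind might be sketched as follows. The example assumes a circular neighborhood with a fixed radius; the function name `select_neighbors` and its parameters are hypothetical and do not correspond to an ArcGIS function.

```python
import math

def select_neighbors(points, s0, radius, min_points=1, max_points=None):
    """Return indices of measured points within radius of s0, nearest first.

    Returns an empty list when fewer than min_points fall inside the
    neighborhood, leaving it to the caller to decide what to do
    (for example, widen the radius or skip the prediction).
    """
    within = []
    for idx, (x, y) in enumerate(points):
        d = math.hypot(x - s0[0], y - s0[1])
        if d <= radius:
            within.append((d, idx))
    within.sort()  # nearest points first
    if len(within) < min_points:
        return []
    if max_points is not None:
        within = within[:max_points]  # keep only the closest points
    return [idx for _, idx in within]

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
print(select_neighbors(points, (0.0, 0.0), radius=6.0, max_points=2))
```

Only the selected points then enter the weight calculation, which keeps the system of equations small for each prediction location.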

Using the configuration of the valid points within the specified search radius around the prediction location in conjunction with the model fit to the semivariogram, you can determine the weights for the measured locations. From the weights and the values, you can make a prediction for the unknown value at the prediction location.
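The following sketch shows one way the weights could be derived, using the standard textbook ordinary Kriging system: semivariances between the measured points form the matrix, semivariances to the prediction location form the right-hand side, and a Lagrange multiplier forces the weights to sum to 1. It is an illustration under those assumptions, not the ArcGIS implementation; `solve`, `ordinary_kriging`, and the example model `gamma` are hypothetical names.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]  # fails if the system is singular
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ordinary_kriging(points, values, s0, gamma):
    """Return (prediction, kriging variance, weights) at location s0.

    gamma is a fitted semivariogram model: a function of distance
    with gamma(0) == 0.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    n = len(points)
    # Semivariances among the measured points, bordered by the
    # unbiasedness constraint (weights sum to 1)
    a = [[gamma(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Semivariances between each measured point and s0
    b = [gamma(dist(p, s0)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance, weights

def gamma(h):
    """Assumed exponential model: nugget 0, sill 1, range 10."""
    return 0.0 if h == 0 else 1.0 - math.exp(-3.0 * h / 10.0)

pred, var, w = ordinary_kriging([(0.0, 0.0), (2.0, 0.0)], [10.0, 20.0],
                                (1.0, 0.0), gamma)
print(pred, var, w)
```

Because the prediction location in this example is midway between the two measured points, the weights come out equal. With a zero nugget, predicting exactly at a measured location returns the measured value.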

ArcGIS Spatial Analyst neighborhood types

There are two neighborhood types in ArcGIS Spatial Analyst: fixed and variable.

Kriging methods

There are two Kriging methods: ordinary and universal.
